Expected 99th percentile performance

I have migrated a high-volume site from Hosted to self-hosted for someone else to manage. I would like to give them as much information as possible so that they can manage the site successfully. This person is convinced that his hardware is more than sufficient for the task, but the page performance seems to suggest otherwise. I’m not sure whether there is more tuning that I should do or whether the hardware is to blame.

Here are the stats from Prometheus/Grafana:

Is there something here that the new manager should be aware of?

Also, what do I do to get Prometheus to feed the container CPU and memory usage to Grafana?

Edit: Maybe this time wasn’t representative. Median performance looks like 200-400ms and 99th percentile is 2-3 seconds.

From the graphs, it looks like Postgres is choking.

On our hosted installs, very busy sites sit at a 99th percentile below 400ms, even on a bad day.

Thanks, Sam.

The database is 30GB on disk. There is reason to believe that the disk is slow. This is with db_work_mem=90MB and 10GB for db_shared_buffers. There is 24GB of system RAM. My next move is to bump db_shared_buffers to 16GB. Does that make sense? Should I bump db_work_mem too?
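
For anyone weighing the same change, here is a rough sketch of the arithmetic, not a tuning recommendation. It assumes db_shared_buffers and db_work_mem map directly onto PostgreSQL's shared_buffers and work_mem. The PostgreSQL documentation suggests about 25% of system RAM as a starting point for shared_buffers and notes that going above 40% is unlikely to help. The figure of 50 simultaneous sort/hash operations is purely hypothetical; work_mem applies per operation, so the total depends on real concurrency.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def shared_buffers_fraction(shared_buffers_bytes, ram_bytes):
    """Return shared_buffers as a fraction of total system RAM."""
    return shared_buffers_bytes / ram_bytes


def work_mem_total(work_mem_bytes, simultaneous_operations):
    """Memory used if this many sort/hash operations each take a full work_mem."""
    return work_mem_bytes * simultaneous_operations


ram = 24 * GIB                                       # system RAM from the post
current = shared_buffers_fraction(10 * GIB, ram)     # current db_shared_buffers
proposed = shared_buffers_fraction(16 * GIB, ram)    # proposed db_shared_buffers

print(f"current:  {current:.0%} of RAM")    # 42%
print(f"proposed: {proposed:.0%} of RAM")   # 67%

# Hypothetical load: 50 simultaneous sort/hash operations at db_work_mem=90MB.
hypothetical_ops = 50
total = work_mem_total(90 * MIB, hypothetical_ops)
print(f"work_mem at {hypothetical_ops} operations: {total / GIB:.2f} GiB")
```

By this arithmetic the current 10GB is already at the top of the documented range, and 16GB would leave about 8GB for the OS page cache, the application, and work_mem combined.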

Here’s the past 6 hours:

Today so far there have been spikes in the 5, 10, and 25 second ranges.

Edit: the site owner is happy with the performance. I remain interested in the answer to the question, but it looks like I’ll leave things as they are.

Hi. Did you get Prometheus to feed the container CPU and memory usage to Grafana already? If so, would you be willing to share the configuration? Thank you.
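
The thread does not record an answer. One common approach, assumed here and not confirmed by anyone above, is to run cAdvisor next to the containers and have Prometheus scrape it. cAdvisor exposes container_cpu_usage_seconds_total and container_memory_usage_bytes, which Grafana can graph through its Prometheus data source. The sketch below only builds the PromQL expressions and the matching Prometheus HTTP API URLs. The Prometheus address is a placeholder, and grouping by the `name` label assumes Docker containers as cAdvisor labels them.

```python
from urllib.parse import urlencode

# Assumption: Prometheus is reachable here; adjust for the real host.
PROMETHEUS_URL = "http://localhost:9090"

# Per-container CPU usage in cores, averaged over 5 minutes (cAdvisor metric).
CPU_QUERY = 'sum by (name) (rate(container_cpu_usage_seconds_total{name!=""}[5m]))'

# Per-container memory usage in bytes (cAdvisor metric).
MEMORY_QUERY = 'sum by (name) (container_memory_usage_bytes{name!=""})'


def instant_query_url(base_url, promql):
    """Build a URL for Prometheus's instant query endpoint, /api/v1/query."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


cpu_url = instant_query_url(PROMETHEUS_URL, CPU_QUERY)
memory_url = instant_query_url(PROMETHEUS_URL, MEMORY_QUERY)
print(cpu_url)
print(memory_url)
```

The same two expressions can be pasted into a Grafana panel's query field. If they return nothing, the first thing to check is whether the cAdvisor target shows as up on the Prometheus targets page.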