How to diagnose a slowdown?

My site has seen a slight but sudden slowdown when loading pages lately. I had an issue where a backup was generated that exceeded the space on my DigitalOcean volume and took down the site. Since then I have had a hard time rebuilding the site. Given the timing, these events could be related. Currently the site appears to be stable, just slower than what I’m used to.

I could go into more detail about what happened, but I’d rather ask a more general question. What are some techniques to diagnose the cause of a slowdown? My droplet is averaging 20% CPU utilization, so I appear to have sufficient resources (4 GB Memory / 2 AMD vCPUs / 80 GB Disk, ~15k pageviews a day).

Any tips are appreciated

My first port of call would be
vmstat 5 5
on the command line.

Thanks for the suggestion. I’m not familiar with the command. Is there something in particular I should be looking for?

 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0 475136 144304  20296 1786100    2    3  2622   447   44   50 19  3 72  1  4
 0  0 475136 143076  20304 1785312    0    0    65    25  622  584  2  1 95  0  1
 0  0 475136 141080  20456 1789144    0    0   800     3  459  473  2  1 96  0  1
 3  0 475136 143092  20572 1783408    0   17 11598    51  733  966 14  6 67  2 12
 0  0 475648 134688  20376 1791036    0   81 38915   394 1323 1784 10  8 61  8 13

Thanks! If you had a memory shortage, the cache numbers would be small, and if the system were paging a lot, the si and so columns would be large. Neither is the case here.

We do see a big peak in bi (and a smaller rise in bo) in the last two samples, which typically indicates disk activity. I wonder if something somewhere is building or repairing or scanning something.
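
To make that reading of the columns repeatable, here is a minimal sketch of a filter that flags vmstat rows where si, so, bi or bo exceed a threshold. The function name vmstat_flags and both default thresholds (10000 for bi/bo, 100 for si/so) are assumptions chosen for illustration, not recommended limits. Note that the first row vmstat prints is an average since boot rather than a live sample.

```shell
# Flag vmstat rows with high swap traffic (si, so) or block I/O (bi, bo).
# Arguments (both optional, both illustrative defaults):
#   $1 = threshold for bi/bo, $2 = threshold for si/so
vmstat_flags() {
  awk -v io_max="${1:-10000}" -v swap_max="${2:-100}" '
    # Data rows start with a number; the header rows are skipped.
    # Columns: r b swpd free buff cache si so bi bo ...
    $1 ~ /^[0-9]+$/ && NF >= 10 {
      row++
      notes = ""
      if ($7 > swap_max) notes = notes " si=" $7
      if ($8 > swap_max) notes = notes " so=" $8
      if ($9 > io_max)   notes = notes " bi=" $9
      if ($10 > io_max)  notes = notes " bo=" $10
      if (notes != "") print "row " row ":" notes
    }'
}
```

Usage would be something like `vmstat 5 5 | vmstat_flags`. Fed the output posted above, it reports only rows 4 and 5, both for bi.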

Perhaps try running
ps auxrc
every five seconds for a minute or so, to see if you can catch a busy process in the act.
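
The repeated sampling can be scripted rather than typed by hand. A minimal sketch, where the function name sample_running is made up for this example and the defaults (5 seconds, 12 samples) simply follow the suggestion above:

```shell
# Print the currently running processes (ps auxrc) several times in a row.
# Arguments (both optional): $1 = seconds between samples, $2 = number of samples.
sample_running() {
  interval="${1:-5}"
  count="${2:-12}"
  i=1
  while [ "$i" -le "$count" ]; do
    echo "== sample $i at $(date '+%H:%M:%S') =="
    ps auxrc
    i=$((i + 1))
    # No need to wait after the last sample.
    if [ "$i" -le "$count" ]; then
      sleep "$interval"
    fi
  done
}
```

Calling `sample_running` with no arguments takes about a minute. Redirecting the output to a file makes it easier to compare samples afterwards.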

There are other utilities which might not already be installed: perhaps search for “How to Monitor Disk IO in a Linux System” or similar.
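
If nothing extra is installed, the kernel's own counters in /proc/diskstats can give a rough picture. The sketch below compares two snapshots of that file and prints, per device, how many sectors were read and written in between. The function name diskstats_delta and the snapshot file names in the usage line are illustrative; the field positions are those of the Linux diskstats format.

```shell
# Compare two snapshots of /proc/diskstats and print per-device deltas.
# Field 3 = device name, field 6 = sectors read, field 10 = sectors written.
# Arguments: $1 = earlier snapshot file, $2 = later snapshot file.
diskstats_delta() {
  awk '
    # First file: remember the counters for each device.
    NR == FNR { r[$3] = $6; w[$3] = $10; next }
    # Second file: print devices whose counters moved.
    ($3 in r) {
      dr = $6 - r[$3]
      dw = $10 - w[$3]
      if (dr > 0 || dw > 0) print $3 " read=" dr " written=" dw
    }' "$1" "$2"
}
```

For example: `cat /proc/diskstats > before.txt; sleep 5; cat /proc/diskstats > after.txt; diskstats_delta before.txt after.txt`. This shows which device is busy, not which process is causing it.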

It’s worth noting that if you have doubts about the integrity of your system, rebuilding it from a backup might be the swiftest way forward. But be sure to have an offsite copy of the backup, if not two, in case of accident. And, ideally, do the install on a new instance and keep the existing one around until the new one is working OK.

This is great advice, thank you!

I’m sitting at 80% memory usage. Is this typical? The dip is because I stopped and restarted the app using the launcher.

droplet: 4 GB Memory / 2 AMD vCPUs / 80 GB Disk

I spun up a new droplet and put in a backup of the forum (without images) and I get similar behaviour

htop output, sorted by mem:

80% doesn’t sound problematic to me.
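
To cross-check a figure like 80% from the shell, one option is to compute usage from /proc/meminfo, treating reclaimable cache as available. This is only a sketch: mem_used_pct is a made-up name, and whether the graph in question counts cache the same way is an assumption that would need checking.

```shell
# Print the percentage of RAM in use, counting reclaimable cache as available.
# Reads MemTotal and MemAvailable (both in kB) from a meminfo-style file.
# Argument (optional): $1 = file to read, defaulting to /proc/meminfo.
mem_used_pct() {
  awk '
    $1 == "MemTotal:"     { total = $2 }
    $1 == "MemAvailable:" { avail = $2 }
    END { if (total > 0) printf "%d\n", (total - avail) * 100 / total }
  ' "${1:-/proc/meminfo}"
}
```

Running `mem_used_pct` on the droplet prints a single whole number. If it is well below the figure on the graph, much of the reported usage is probably cache.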

More interesting is that you have a lot of sidekiq processes and yet I see the annotation “0 of 5 busy” - you have more than 5. You also seem to have a lot of unicorn threads.

I suggest starting a new topic here with your htop output and the relevant part of your yml config, showing whether you’ve adjusted your unicorn count. Ask whether this set of processes looks reasonable.

I have a very similar htop on the same VPS (different processor, though), and I haven’t changed anything from the defaults.

Without knowing more, I would say it shows only one thing: there isn’t much traffic, if any, at that point.

Ah yes, I should have checked my own htop: very similar.

Another, very different idea for the original observation of ‘a slowdown’: activate the mini-profiler using Alt-P, then open a typical large page on your forum and see what queries are being made and how long they take by clicking on the timing figure which appears at the top right.

I was able to do an apt upgrade and also rebuild. This problem, Pups error on rebuild 🐶, was preventing me from rebuilding for a while.

Since the rebuild, it feels improved. I don’t like operating by feeling, though; in this case I’d rather have analytics and measurable data. I appreciate the tips, @Ed_S; they will be useful for further monitoring.

I’m wondering if it’s possible to capture some of this profiling data to show the “health” of the instance via the admin page. Perhaps a potential plugin idea or future core feature?
