char
January 29, 2026, 1:27pm
1
Hi everyone, we’re trying to troubleshoot the issues our Discourse installation has been having in the last few days.
We’re running on a Contabo cloud VPS (8 cores / 32 GB RAM / NVMe) for a userbase of ~150 people, and we’ve had few issues in the last two and a half years.
Since last weekend, the instance has had moments of near-unusability due to unusually high CPU usage.
To troubleshoot this, we put the forum in Read Only mode yesterday. I have a few Grafana graphs showing the situation - the three markers indicate when we turned on Read Only, when we restarted the container, and when we turned Read Only back off.
(a couple more graphs are on Imgur)
As you can see, usage is pretty high. This only started a few days ago, so there’s some kind of issue we’ve yet to pin down.
The unusual thing we noticed is that our host reboots the VPS overnight on Sundays, and last weekend the database didn’t manage to complete one or more transactions.
We think this might have left some internal Discourse process in an inconsistent state, and that this inconsistency is now competing with user activity - but we’re only hypothesizing here.
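One way we thought of testing this hypothesis is to look for stuck or unusually long-running transactions on the PostgreSQL side. A rough sketch of what we plan to run, assuming the standard /var/discourse install (container named app, database named discourse):
cd /var/discourse && ./launcher enter app
# inside the container, connect to the database as the postgres user
su postgres -c 'psql discourse'
-- then, at the psql prompt, list non-idle backends by transaction age
SELECT pid, state, now() - xact_start AS xact_age, left(query, 80) AS query
FROM pg_stat_activity
WHERE state <> 'idle'
ORDER BY xact_age DESC NULLS LAST;
Anything sitting in “idle in transaction” for hours would support the interrupted-transaction theory.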
After asking the AI assistant before opening a new thread, I can add that Sidekiq has some outlier jobs:
Jobs::ProcessBadgeBacklog takes about 2-5 seconds.
The last DestroyOldDeletionStubs run took 475 seconds.
The last DirectoryRefreshDaily run took 580 seconds.
The last TopRefreshToday run took 18 seconds.
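We also want to check whether the Sidekiq queues themselves are backing up, not just how long individual jobs take. Sidekiq keeps each queue as a Redis list named queue:<name>, plus sorted sets for retries and scheduled jobs; here is a quick sketch, assuming the standard container layout and the usual Discourse queue names (they may differ on your install):
cd /var/discourse && ./launcher enter app
# inside the container, ask Redis for the backlog sizes
redis-cli llen queue:default
redis-cli llen queue:low
redis-cli zcard retry
redis-cli zcard schedule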
So, the question is - what could be causing this kind of situation, with the userbase and the hardware we’re using?
Is there anything more specific we should look into?
I don’t think a userbase this size should be causing emergencies like this, but I’m not married to any of the opinions we’ve formed so far, and we’d be very grateful for pointers to anything else we could look into.
Thanks!
3 Likes
Ed_S
(Ed S)
January 29, 2026, 4:52pm
2
I think you’re somehow short of RAM. You have 32 GB, but possibly some process or processes are using more than expected.
Perhaps capture ps aux in a window about 120 chars wide. Here’s mine:
root@rc-debian-hel:~# ps uax
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.2 167752 9096 ? Ss 2025 4:24 /sbin/init
root 2 0.0 0.0 0 0 ? S 2025 0:00 [kthreadd]
root 3 0.0 0.0 0 0 ? I< 2025 0:00 [rcu_gp]
root 4 0.0 0.0 0 0 ? I< 2025 0:00 [rcu_par_gp]
root 5 0.0 0.0 0 0 ? I< 2025 0:00 [slub_flushwq]
root 6 0.0 0.0 0 0 ? I< 2025 0:00 [netns]
root 8 0.0 0.0 0 0 ? I< 2025 0:00 [kworker/0:0H-events_highpri]
root 10 0.0 0.0 0 0 ? I< 2025 0:00 [mm_percpu_wq]
root 11 0.0 0.0 0 0 ? S 2025 0:00 [rcu_tasks_rude_]
root 12 0.0 0.0 0 0 ? S 2025 0:00 [rcu_tasks_trace]
root 13 0.0 0.0 0 0 ? S 2025 4:48 [ksoftirqd/0]
root 14 0.0 0.0 0 0 ? I 2025 45:30 [rcu_sched]
root 15 0.0 0.0 0 0 ? S 2025 0:20 [migration/0]
root 16 0.0 0.0 0 0 ? S 2025 0:00 [idle_inject/0]
root 18 0.0 0.0 0 0 ? S 2025 0:00 [cpuhp/0]
root 19 0.0 0.0 0 0 ? S 2025 0:00 [cpuhp/1]
root 20 0.0 0.0 0 0 ? S 2025 0:00 [idle_inject/1]
root 21 0.0 0.0 0 0 ? S 2025 0:21 [migration/1]
root 22 0.0 0.0 0 0 ? S 2025 4:35 [ksoftirqd/1]
root 24 0.0 0.0 0 0 ? I< 2025 0:00 [kworker/1:0H-events_highpri]
root 25 0.0 0.0 0 0 ? S 2025 0:00 [kdevtmpfs]
root 26 0.0 0.0 0 0 ? I< 2025 0:00 [inet_frag_wq]
root 28 0.0 0.0 0 0 ? S 2025 0:00 [kauditd]
root 29 0.0 0.0 0 0 ? S 2025 0:02 [khungtaskd]
root 30 0.0 0.0 0 0 ? S 2025 0:00 [oom_reaper]
root 31 0.0 0.0 0 0 ? I< 2025 0:00 [writeback]
root 32 0.0 0.0 0 0 ? S 2025 8:40 [kcompactd0]
root 33 0.0 0.0 0 0 ? SN 2025 0:00 [ksmd]
root 34 0.0 0.0 0 0 ? SN 2025 0:00 [khugepaged]
root 80 0.0 0.0 0 0 ? I< 2025 0:00 [kintegrityd]
root 81 0.0 0.0 0 0 ? I< 2025 0:00 [kblockd]
root 82 0.0 0.0 0 0 ? I< 2025 0:00 [blkcg_punt_bio]
root 83 0.0 0.0 0 0 ? I< 2025 0:00 [tpm_dev_wq]
root 84 0.0 0.0 0 0 ? I< 2025 0:00 [ata_sff]
root 85 0.0 0.0 0 0 ? I< 2025 0:00 [md]
root 86 0.0 0.0 0 0 ? I< 2025 0:00 [edac-poller]
root 87 0.0 0.0 0 0 ? I< 2025 0:00 [devfreq_wq]
root 88 0.0 0.0 0 0 ? S 2025 0:00 [watchdogd]
root 90 0.0 0.0 0 0 ? I< 2025 1:23 [kworker/0:1H-kblockd]
root 91 0.0 0.0 0 0 ? S 2025 2:15 [kswapd0]
root 92 0.0 0.0 0 0 ? S 2025 0:00 [ecryptfs-kthrea]
root 94 0.0 0.0 0 0 ? I< 2025 0:00 [kthrotld]
root 95 0.0 0.0 0 0 ? S 2025 0:00 [irq/51-aerdrv]
root 96 0.0 0.0 0 0 ? S 2025 0:00 [irq/51-pciehp]
root 97 0.0 0.0 0 0 ? S 2025 0:00 [irq/52-aerdrv]
root 98 0.0 0.0 0 0 ? S 2025 0:00 [irq/52-pciehp]
root 99 0.0 0.0 0 0 ? S 2025 0:00 [irq/53-aerdrv]
root 100 0.0 0.0 0 0 ? S 2025 0:00 [irq/53-pciehp]
root 101 0.0 0.0 0 0 ? S 2025 0:00 [irq/54-aerdrv]
root 102 0.0 0.0 0 0 ? S 2025 0:00 [irq/54-pciehp]
root 103 0.0 0.0 0 0 ? S 2025 0:00 [irq/55-aerdrv]
root 104 0.0 0.0 0 0 ? S 2025 0:00 [irq/55-pciehp]
root 105 0.0 0.0 0 0 ? S 2025 0:00 [irq/56-aerdrv]
root 106 0.0 0.0 0 0 ? S 2025 0:00 [irq/56-pciehp]
root 107 0.0 0.0 0 0 ? S 2025 0:00 [irq/57-aerdrv]
root 108 0.0 0.0 0 0 ? S 2025 0:00 [irq/57-pciehp]
root 109 0.0 0.0 0 0 ? S 2025 0:00 [irq/58-aerdrv]
root 110 0.0 0.0 0 0 ? S 2025 0:00 [irq/58-pciehp]
root 111 0.0 0.0 0 0 ? S 2025 0:00 [irq/59-aerdrv]
root 112 0.0 0.0 0 0 ? S 2025 0:00 [irq/59-pciehp]
root 113 0.0 0.0 0 0 ? S 2025 0:00 [irq/49-ACPI:Ged]
root 114 0.0 0.0 0 0 ? I< 2025 0:00 [acpi_thermal_pm]
root 116 0.0 0.0 0 0 ? I< 2025 0:00 [mld]
root 117 0.0 0.0 0 0 ? I< 2025 0:00 [ipv6_addrconf]
root 126 0.0 0.0 0 0 ? I< 2025 0:00 [kstrp]
root 129 0.0 0.0 0 0 ? I< 2025 0:00 [zswap-shrink]
root 130 0.0 0.0 0 0 ? I< 2025 0:00 [kworker/u5:0]
root 134 0.0 0.0 0 0 ? I< 2025 0:00 [cryptd]
root 173 0.0 0.0 0 0 ? I< 2025 0:00 [charger_manager]
root 197 0.0 0.0 0 0 ? I< 2025 1:21 [kworker/1:1H-kblockd]
root 209 0.0 0.0 0 0 ? S 2025 0:03 [hwrng]
root 210 0.0 0.0 0 0 ? S 2025 0:00 [scsi_eh_0]
root 221 0.0 0.0 0 0 ? I< 2025 0:00 [scsi_tmf_0]
root 300 0.0 0.0 0 0 ? I< 2025 0:00 [raid5wq]
root 346 0.0 0.0 0 0 ? S 2025 2:47 [jbd2/sda1-8]
root 347 0.0 0.0 0 0 ? I< 2025 0:00 [ext4-rsv-conver]
root 414 0.0 0.2 67632 8048 ? S<s 2025 7:05 /lib/systemd/systemd-journald
root 448 0.0 0.0 0 0 ? I< 2025 0:00 [kaluad]
root 454 0.0 0.0 0 0 ? I< 2025 0:00 [kmpath_rdacd]
root 455 0.0 0.0 0 0 ? I< 2025 0:00 [kmpathd]
root 456 0.0 0.0 0 0 ? I< 2025 0:00 [kmpath_handlerd]
root 457 0.0 0.6 289888 25700 ? SLsl 2025 8:11 /sbin/multipathd -d -s
root 459 0.0 0.0 10720 2760 ? Ss 2025 0:06 /lib/systemd/systemd-udevd
systemd+ 610 0.0 0.0 88712 1724 ? Ssl 2025 0:07 /lib/systemd/systemd-timesyncd
systemd+ 628 0.0 0.1 16456 4992 ? Ss 2025 0:39 /lib/systemd/systemd-networkd
systemd+ 630 0.0 0.0 26076 3632 ? Ss 2025 0:07 /lib/systemd/systemd-resolved
root 671 0.0 0.0 82124 2620 ? Ssl 2025 6:08 /usr/sbin/irqbalance --foreground
root 677 0.0 0.0 6540 2040 ? Ss 2025 0:08 /usr/sbin/cron -f -P
root 678 0.0 0.0 79464 960 ? Ssl 2025 64:36 /usr/sbin/qemu-ga
syslog 679 0.0 0.1 222044 4080 ? Ssl 2025 1:28 /usr/sbin/rsyslogd -n -iNONE
root 683 0.0 0.1 109620 7304 ? Ssl 2025 20:29 /usr/bin/python3 /usr/share/unattended-upgrades/unattended
root 692 0.1 0.4 1861484 17464 ? Ssl 2025 100:02 /usr/bin/containerd
daemon 698 0.0 0.0 3512 1024 ? Ss 2025 0:00 /usr/sbin/atd -f
root 708 0.0 0.0 5236 436 ttyAMA0 Ss+ 2025 0:00 /sbin/agetty -o -p -- \u --keep-baud 115200,57600,38400,96
root 709 0.0 0.0 5236 432 ttyS0 Ss+ 2025 0:00 /sbin/agetty -o -p -- \u --keep-baud 115200,57600,38400,96
root 712 0.0 0.0 15196 3744 ? Ss 2025 6:43 sshd: /usr/sbin/sshd -D [listener] 0 of 10-100 startups
root 713 0.0 0.0 5612 664 tty1 Ss+ 2025 0:00 /sbin/agetty -o -p -- \u --noclear tty1 linux
root 732 0.0 0.7 2488860 30672 ? Ssl 2025 16:37 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/con
systemd+ 955250 0.0 0.5 908900 19820 ? Ss Jan11 0:00 postgres: 15/main: discourse discourse [local] idle
root 978311 0.0 0.0 7064 1608 ? Ss+ Jan11 0:00 bash
root 2844401 0.0 0.0 0 0 ? I Jan28 0:58 [kworker/0:2-events]
root 2919926 0.0 0.0 0 0 ? I 06:00 0:18 [kworker/1:0-events]
root 2929247 0.0 0.0 0 0 ? I 08:03 0:00 [kworker/1:2-events]
root 2947488 0.0 0.0 0 0 ? I 11:58 0:00 [kworker/0:3-events]
root 2958380 0.0 0.0 0 0 ? I 14:18 0:00 [kworker/u4:2-flush-8:0]
systemd+ 2960448 0.2 5.7 928984 224576 ? Ss 14:47 0:16 postgres: 15/main: discourse discourse [local] idle
systemd+ 2966096 0.2 4.1 923428 160800 ? Ss 16:03 0:05 postgres: 15/main: discourse discourse [local] idle
root 2966159 0.0 0.0 0 0 ? I 16:04 0:00 [kworker/u4:3-events_unbound]
root 2966695 0.0 0.0 0 0 ? I 16:11 0:00 [kworker/u4:0-events_unbound]
root 2967455 0.0 0.2 18476 9584 ? Ss 16:21 0:00 sshd: root@pts/0
root 2967537 0.0 0.1 8300 4748 pts/0 Ss 16:21 0:00 -bash
systemd+ 2968782 0.0 3.0 916952 120500 ? Ss 16:35 0:00 postgres: 15/main: discourse discourse [local] idle
systemd+ 2969962 0.0 0.6 908928 23584 ? Ss 16:50 0:00 postgres: 15/main: discourse discourse [local] idle
1000 2969995 0.0 0.0 15928 2824 ? S 16:50 0:00 sleep 1
root 2969996 0.0 0.0 10412 2992 pts/0 R+ 16:50 0:00 ps uax
root 4019702 0.0 0.1 1237968 7740 ? Sl Jan01 4:06 /usr/bin/containerd-shim-runc-v2 -namespace moby -id 6b624
root 4019724 0.0 0.0 6800 324 pts/0 Ss+ Jan01 0:00 /bin/bash /sbin/boot
root 4019750 0.0 0.0 1597212 344 ? Sl Jan01 0:02 /usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-po
root 4019755 0.0 0.0 1671004 0 ? Sl Jan01 0:02 /usr/bin/docker-proxy -proto tcp -host-ip :: -host-port 80
root 4019763 0.0 0.0 1671004 1580 ? Sl Jan01 0:02 /usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-po
root 4019769 0.0 0.0 1744796 0 ? Sl Jan01 0:02 /usr/bin/docker-proxy -proto tcp -host-ip :: -host-port 44
root 4023771 0.0 0.0 2236 20 pts/0 S+ Jan01 0:35 /usr/bin/runsvdir -P /etc/service
root 4023772 0.0 0.0 2084 0 ? Ss Jan01 0:00 runsv cron
root 4023773 0.0 0.0 2084 0 ? Ss Jan01 0:00 runsv rsyslog
root 4023774 0.0 0.0 2084 8 ? Ss Jan01 0:00 runsv unicorn
root 4023775 0.0 0.0 2084 28 ? Ss Jan01 0:00 runsv nginx
root 4023776 0.0 0.0 2084 0 ? Ss Jan01 0:00 runsv postgres
root 4023777 0.0 0.0 2084 0 ? Ss Jan01 0:00 runsv redis
root 4023778 0.0 0.0 6692 868 ? S Jan01 0:06 cron -f
root 4023779 0.0 0.0 2228 792 ? S Jan01 0:01 svlogd /var/log/postgres
root 4023780 0.0 0.0 54152 2276 ? S Jan01 0:00 nginx: master process /usr/sbin/nginx
systemd+ 4023781 0.0 0.4 905940 19092 ? S Jan01 2:13 /usr/lib/postgresql/15/bin/postmaster -D /etc/postgresql/1
root 4023782 0.0 0.0 152356 208 ? Sl Jan01 0:02 rsyslogd -n
root 4023783 0.0 0.0 2228 836 ? S Jan01 0:02 svlogd /var/log/redis
1000 4023784 0.0 0.0 20592 1664 ? S Jan01 22:38 /bin/bash ./config/unicorn_launcher -E production -c confi
message+ 4023785 0.4 0.6 102368 26700 ? Sl Jan01 165:39 /usr/bin/redis-server *:6379
www-data 4023796 0.1 3.1 188008 124596 ? S Jan01 75:59 nginx: worker process
www-data 4023797 0.1 1.2 98528 49972 ? S Jan01 76:53 nginx: worker process
www-data 4023798 0.0 0.0 54352 1088 ? S Jan01 0:15 nginx: cache manager process
systemd+ 4023807 0.0 8.3 906192 327344 ? Ss Jan01 3:19 postgres: 15/main: checkpointer
systemd+ 4023808 0.0 0.6 906088 24076 ? Ss Jan01 0:24 postgres: 15/main: background writer
systemd+ 4023810 0.0 0.4 905940 18260 ? Ss Jan01 9:48 postgres: 15/main: walwriter
systemd+ 4023811 0.0 0.0 907536 2312 ? Ss Jan01 0:29 postgres: 15/main: autovacuum launcher
systemd+ 4023812 0.0 0.0 907512 2456 ? Ss Jan01 0:01 postgres: 15/main: logical replication launcher
1000 4023813 0.0 3.8 1540732 148552 ? Sl Jan01 7:41 unicorn master -E production -c config/unicorn.conf.rb
systemd+ 4023881 0.0 0.4 919884 16692 ? Ss Jan01 0:03 postgres: 15/main: discourse discourse [local] idle
1000 4024290 1.0 9.4 7103052 368788 ? SNl Jan01 410:25 sidekiq 7.3.9 discourse [0 of 5 busy]
1000 4024313 1.8 10.3 6999048 404032 ? Sl Jan01 728:22 unicorn worker[0] -E production -c config/unicorn.conf.rb
1000 4024339 0.0 9.0 6931980 354124 ? Sl Jan01 37:50 unicorn worker[1] -E production -c config/unicorn.conf.rb
1000 4024397 0.0 7.9 6921672 309392 ? Sl Jan01 14:27 unicorn worker[2] -E production -c config/unicorn.conf.rb
1000 4024478 0.0 6.5 6936200 255776 ? Sl Jan01 12:53 unicorn worker[3] -E production -c config/unicorn.conf.rb
systemd+ 4025084 0.0 1.0 911596 41712 ? Ss Jan01 0:05 postgres: 15/main: discourse discourse [local] idle
systemd+ 4035965 0.0 0.9 908812 35216 ? Ss Jan01 0:38 postgres: 15/main: discourse discourse [local] idle
systemd+ 4044886 0.0 0.8 908812 34968 ? Ss Jan01 0:39 postgres: 15/main: discourse discourse [local] idle
I don’t believe there’s any sensitive data there. If you see something, edit it out before posting!
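If it does turn out to be memory pressure, sorting by resident size makes the biggest consumers easier to spot - a quick sketch, assuming GNU ps/procps on the host:
ps aux --sort=-%mem | head -n 15   # top 15 processes by resident memory
free -h                            # overall memory and swap summary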
1 Like
Ed_S
(Ed S)
January 29, 2026, 4:56pm
3
I note a lot of system (kernel) CPU time during the busy periods, which often means paging. You can always run
vmstat 5 5
to get a snapshot of how the virtual memory system is coping.
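Roughly, the columns worth watching in standard procps vmstat are:
vmstat 5 5
# r       : runnable processes waiting for a CPU
# si / so : memory swapped in / out per second - sustained non-zero values mean active paging
# us / sy : user vs kernel CPU time - high sy alongside si/so activity points at memory pressure
# wa      : time spent waiting on I/O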
1 Like
char
January 29, 2026, 5:02pm
4
Thanks, we’ll look into this suggestion.