This doesn’t look like a memory problem to me - there’s only around 6GB taken up by applications (the large postgres RSS figures mostly double-count shared buffers, since every backend maps the same shared memory). There’s a lot of reading happening, though, and at a guess - given that sidekiq is also busy - postgres is reading a lot of older data from disk, perhaps for some stats jobs. If the database doesn’t fit in RAM, accessing that older data means a lot of disk reads. That’s what this smells like.
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
admin 2510013 15.8 1.6 7383364 514764 ? SNl 12:55 56:45 sidekiq 7.3.9 discourse [3 of 5 busy]
systemd+ 2848301 26.9 25.3 8058104 7806820 ? Rs 17:45 18:18 postgres: 13/main: discourse discourse [local] SELECT
systemd+ 2870880 4.3 22.9 8058516 7072000 ? Ds 18:03 2:09 postgres: 13/main: discourse discourse [local] SELECT
systemd+ 2875058 21.0 25.4 8101952 7848140 ? Rs 18:07 9:46 postgres: 13/main: discourse discourse [local] UPDATE
systemd+ 2924006 30.2 25.3 8112844 7824292 ? Rs 18:44 2:44 postgres: 13/main: discourse discourse [local] UPDATE
systemd+ 2929546 12.8 24.9 8049520 7694404 ? Ss 18:48 0:39 postgres: 13/main: discourse discourse [local] SELECT
systemd+ 2931802 17.7 24.5 8045744 7559492 ? Ss 18:50 0:36 postgres: 13/main: parallel worker for PID 946796
systemd+ 2931803 17.7 24.5 8045744 7557568 ? Ds 18:50 0:36 postgres: 13/main: parallel worker for PID 946796
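If you want to sanity-check the disk-read theory, something like the query below against pg_stat_database should show the buffer cache hit ratio for the discourse database. Note the counters are cumulative since the last stats reset, so it’s the trend (run it twice a few minutes apart) that matters more than the absolute number; a hit ratio well below ~99% on a busy forum usually means the working set doesn’t fit in shared buffers.

```sql
-- Compare blocks served from Postgres's buffer cache (blks_hit)
-- with blocks read from disk / OS cache (blks_read).
SELECT datname,
       blks_read,
       blks_hit,
       round(blks_hit * 100.0 / nullif(blks_hit + blks_read, 0), 2)
         AS cache_hit_pct
FROM pg_stat_database
WHERE datname = 'discourse';
```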
I would look at sidekiq first - I bet it’s busy running updates. Check the /sidekiq url while logged in as admin to see what it’s doing. That’s probably going to show you some long-running update tasks.
You can also, on the SQL server:
SELECT pid, application_name, query
FROM pg_stat_activity
WHERE state IS NOT NULL
AND state != 'idle';
to see what queries are running and what’s calling them. That’ll hint at what’s causing the load.
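A variant of that query that also sorts by how long each query has been running makes the long-running updates easier to spot (query_start and wait_event_type are standard pg_stat_activity columns in Postgres 13):

```sql
-- Same idea, ordered by how long each active query has been running.
SELECT pid,
       application_name,
       now() - query_start AS running_for,
       wait_event_type,
       query
FROM pg_stat_activity
WHERE state IS NOT NULL
  AND state != 'idle'
ORDER BY running_for DESC;
```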
Do you have performance headers enabled? You can turn those on and capture them in logs to let you know how much time Discourse is spending on each part.