Hi all,
I’ve got a problem I’d like to ask your help with. I’ve got a server installed with Discourse using the Docker-based installer. Every day around noon the server for an as of yet unknown reason to me becomes busy and I see IO load rise up to the point where the forum starts to show error 500’s.
iotop always tells me it’s an PostgreSQL UPDATE that eats up all IO.
So today when it happened again I grabbed a list of all running SQL queries with this command:
sudo -u postgres psql discourse -o /tmp/RunningQueries -c ‘SELECT datname,pid,state,query FROM pg_stat_activity’
The output of this command can be found here on Pastebin.
As you can see, there are at these times around 32 UPDATE queries running. When this happens, iotop tells me the database reads and writes anywhere from 2.5MB/sec up to 15MB/sec.
If I correlate the running UPDATE query with the logged query (in /var/discourse/shared/standalone/log/var-log/postgresql) I do see that those queries take very very long:
2018-07-03 12:51:27.052 UTC [17504] discourse@discourse LOG: duration: 2352061.872 ms statement: UPDATE drafts
SET data = '{"reply":"<redacted for debugging purposes>","action":"reply","categoryId":24,"postId":118034,"archetypeId":"regular","whisper":false,"metaData":null,"composerTime":65922,"typingTime":8400}',
correlates with
discourse | 17504 | active | UPDATE drafts +
| | | SET data = '{"reply":"<redacted for debugging purposes>","action":"reply","categoryId":24,"postId":118034,"archetypeId":"regular","whisper":false,"metaData":null,"composerTime":65922,"typingTime":8400}', +
| | | sequence = 124, +
| | | revisions = revisions + 1 +
| | | WHERE id = 84548
If I restart the docker app, the queries do return so basically all I can do is wait it out which makes my users unhappy.
Please let me know if there is anything I can do to make this less bad - for instance move maintenance tasks to somewhere around 5AM or so.
If you have any more suggestions for me to dig deeper, please share! Any help is appreciated
Ninja-edit: Just remembered some additional information: I couldn’t relate this to existing cronjobs (neither from the host-OS nor from within the docker app).
Some background information on the server:
- Server OS is Ubuntu 18.04LTS
- It’s a VM with 100GB disk, 4GB ram, 4 cores
- Disk is a 6-disk RAID10 with 15K disks
- Discourse v2.1.0.beta2 +107
- Plugins: babble, whos-online, voting, cakeday, anonymous-categories, league
- Pageviews per day: Around 75K
- Users: 1350, around 700/day active
- 73.6K posts, 351 topics, most active topic contains 13.5K posts
Thanks for reading this far!