I’ve got a 2-container install on a DO 8GB droplet that is behaving very strangely.
There is a postmaster process (EDIT: now there are two of them) eating 100% CPU.
Sidekiq is running, but the Dashboard complains that it’s not checking for updates.
There are some logs like

```text
PG::ConnectionBad (FATAL: remaining connection slots are reserved for non-replication superuser connections )
/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/pg-0.21.0/lib/pg.rb:56:in `initialize'
```

and

```text
Job exception: FATAL: remaining connection slots are reserved for non-replication superuser connections
```
The data container has:

```yaml
db_shared_buffers: "2GB"
db_work_mem: "40MB"
```
There are 4 unicorn workers in the web container (same as # processors).
The postgresql connection limit needs to be increased. That will cause the database as a whole to use more memory, but based on the `free` output you’ve got plenty that could be used if required. I’d double the current value, then review errors and resource consumption.
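For a sense of scale, here’s a back-of-envelope estimate of how many connections a setup like this should hold open. The per-worker pool size and sidekiq concurrency below are illustrative assumptions, not verified Discourse defaults; the reserved-slot count is the stock Postgres `superuser_reserved_connections` default.

```shell
# Rough estimate: can a stock 4-worker install plausibly exhaust 100 slots?
unicorn_workers=4
pool_per_worker=8        # assumed ActiveRecord pool size per worker
sidekiq_threads=5        # assumed sidekiq concurrency
demand=$(( unicorn_workers * pool_per_worker + sidekiq_threads ))

max_connections=100
reserved=3               # postgres superuser_reserved_connections default
capacity=$(( max_connections - reserved ))

echo "estimated demand: $demand connections; usable capacity: $capacity"
```

Under those assumptions the estimated demand is well below the usable capacity, so actually hitting the limit usually means connections are piling up (stuck queries, a client that isn’t releasing connections) rather than ordinary load. Raising `max_connections` buys headroom either way.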
I did that, but I’m still getting a 502 error on the admin dashboard.
The other issue is that this site is using Cloudflare with no caching (I’m told). I have included the cloudflare template, but I still suspect something is wrong with Cloudflare.
It’s the max_connections parameter in postgresql.conf. I don’t see a tunable for that in discourse_docker, so I suspect you’ll need to play games with a pups exec stanza to make the edit.
As for Cloudflare, all the cloudflare template does is make it so that IP addresses get fixed up after going through Cloudflare’s proxying. It doesn’t do anything to make Cloudflare cache. You might want to keep that in a separate topic, rather than mix the two issues together here.
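To illustrate the idea behind that IP fix-up (this is a hand-written sketch using nginx’s real_ip module, not the actual contents of the template, and the address range shown is just one example of Cloudflare’s published ranges):

```nginx
# Trust Cloudflare's proxy range and take the real client IP from the
# CF-Connecting-IP header instead of the proxy's own address.
set_real_ip_from 173.245.48.0/20;   # example Cloudflare range only
real_ip_header CF-Connecting-IP;
```

Nothing in that affects caching; it only changes which IP nginx (and therefore Discourse) logs and rate-limits on.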
Not one for playing games when they’re not necessary, I went into the data container, edited postgresql.conf by hand, doubled max_connections (from 100 to 200) and, LO! it seems that all is well.
I don’t understand just why I’ve not encountered this before or why this is the solution here. The database doesn’t seem that big and the traffic doesn’t seem that high.
Edit: I have played the games and won!
If anyone else cares… stick this in `data.yml`, in `hooks`, in the `after_postgres` section. I put it after the `- exec` stanza.
```yaml
# double max_connections to 200
- replace:
    filename: "/etc/postgresql/9.5/main/postgresql.conf"
    from: /#?max_connections *=.*/
    to: "max_connections = 200"
```
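To confirm the override actually took after a rebuild, something like the following should work (commands assume a standard two-container install with the data container named `data` and the usual `/var/discourse` install path):

```shell
cd /var/discourse
./launcher rebuild data
./launcher enter data
su postgres -c "psql -c 'SHOW max_connections;'"
```

The `SHOW` query should report 200 once the hook has been applied.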
@pfaffman Did this solve the postmasters gone wild high CPU usage issue for you?
I modified `max_connections` directly in postgresql.conf (`/var/discourse/shared/standalone/postgres_data/postgresql.conf`) and used `./launcher rebuild app`. Haven’t noticed a difference, though.
I tried giving Postgres more memory, and less. Adding swap seemed to help (hence trying to give pg less memory). One thing I did that might have helped was to back up and restore the database. Or it could be that it did nothing.
I don’t have a silver bullet, but those are the things that I did.