Understanding DB Pooling for Sidekiq Workers

Can someone please provide clarity on Sidekiq worker database pooling?

Currently, we’ve set DISCOURSE_DB_POOL to 15, with 2 Sidekiq workers each having a concurrency of 10. After reading the Sidekiq documentation, it appears that Sidekiq threads share the DB pool, hence I anticipate a maximum of 15 * 2 = 30 DB connections from Sidekiq workers. However, despite this understanding, we’re observing spikes in DB connections exceeding the expected maximum of 30.

[Image: graph of DB connection count over time]

These spikes occur approximately every 15 minutes and seem to be primarily associated with the PeriodicalUpdates scheduled job.

Could anyone kindly provide insights that will help us to avoid these spikes?

@david I noticed that you have done amazing work in this area. Can you please provide some insights?

I’m afraid I don’t know the details of how all this works off-hand.

It may be worth checking whether you’ve customized the UNICORN_SIDEKIQS env. That indicates how many sidekiq processes are launched (each of which will have many threads).

What exactly is your graph showing? Is it ‘concurrent connections’? Or is it ‘number of connections created’? If it’s the latter, perhaps some connections are being created & destroyed very quickly.
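To illustrate why that distinction matters, here is a minimal Python sketch. The session timestamps are made up for illustration and are not taken from the graph in this thread. It compares how many connections were created in a window with how many were open at the same moment:

```python
def peak_concurrent(sessions):
    """Return the largest number of sessions open at the same instant."""
    events = []
    for opened, closed in sessions:
        events.append((opened, 1))   # connection opened
        events.append((closed, -1))  # connection closed
    # Sort by time; at equal timestamps, closes (-1) sort before opens (+1).
    events.sort()
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


# Hypothetical (opened_at, closed_at) pairs, in seconds.
sessions = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 4)]
created = len(sessions)
peak = peak_concurrent(sessions)
print(created, peak)  # 5 2
```

In this made-up case five connections are created, but no more than two are ever open at once, so a "connections created" metric would spike while a "concurrent connections" metric would stay flat.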

And one last question: what problem are you trying to solve here? Is there some issue being caused by the number of connections?

Thanks for getting back.

Following are the related configurations we have:

  • DISCOURSE_DB_POOL: "15"
  • DISCOURSE_SIDEKIQ_WORKERS: "10"
  • UNICORN_SIDEKIQS: "2"

This means we have 2 Sidekiq processes, each with 10 threads.
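For reference, the arithmetic behind the expected maximum can be sketched as follows. This is only a rough sketch in Python, and it rests on two assumptions that nobody in this thread has confirmed for Discourse: that each Sidekiq process owns a single pool capped at DISCOURSE_DB_POOL, and that a thread holds at most one connection at a time. The function name is hypothetical.

```python
def sidekiq_connection_bounds(db_pool, processes, threads_per_process):
    """Rough upper bounds on DB connections opened by Sidekiq processes.

    Assumptions (not confirmed in this thread): each process owns one pool
    capped at db_pool, and a thread holds at most one connection at a time.
    """
    # Ceiling if every pool were completely full.
    pool_ceiling = db_pool * processes
    # Ceiling if each thread checks out at most one connection.
    busy_ceiling = min(db_pool, threads_per_process) * processes
    return {"pool_ceiling": pool_ceiling, "busy_ceiling": busy_ceiling}


bounds = sidekiq_connection_bounds(db_pool=15, processes=2, threads_per_process=10)
print(bounds)  # {'pool_ceiling': 30, 'busy_ceiling': 20}
```

Under these assumptions Sidekiq alone should stay at or below 30 connections. Anything else that connects through the same proxy, such as web processes, would be counted on top of that.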

The graph shows the current number of client connections.

We are trying to configure the size of the connection pool (RDS proxy sitting in front of PostgreSQL) correctly. To do so, we need a better understanding of the connection pooling on the application side. Why is the application not respecting the DISCOURSE_DB_POOL config as expected?
If DISCOURSE_DB_POOL is working as expected, why do we see such big spikes in DB connections? These spikes seem to be aligned with the PeriodicalUpdates scheduled job runs.

Please let me know if you have more questions. Appreciate the help to uncover the mystery.

Do we need any config files in addition to the DISCOURSE_DB_POOL environment variable to configure database pooling?

Are you 100% sure your proxy is not at fault here?