Issue: Extremely Slow Sidekiq Processing After Large Imports on Multisite Instance

We’re running several Discourse sites with multisite under a single app. Recently, we did a batch of large user imports (hundreds of thousands of users across 6 sites). After the imports, Sidekiq is processing background jobs very slowly. The Sidekiq dashboard shows a huge backlog, and jobs are clearing at a much slower rate than expected.

Environment details:

  • The VM was upgraded to 16 CPUs / 16GB RAM.
  • However, in the Sidekiq interface, we only see 5 threads and it seems like only a small portion of the resources are being used.
  • The main import queue (“nursingjobs” as multisite parent) is handling jobs for all the child sites, but job throughput is very low.
  • Server metrics: CPU sometimes at 80–90%, memory at around 6.7/7.2GB.

We’re looking to:

  • Speed up Sidekiq/background job processing to clear large backlogs post-import.
  • Ensure Discourse is making use of all the available resources (CPU/RAM).
  • Understand if there are thread/process limits that need adjustment.

Questions:

  1. What’s the best way to configure Sidekiq/Discourse for high-throughput post-import?
  2. What are the recommended settings for UNICORN_SIDEKIQS and DISCOURSE_SIDEKIQ_WORKERS on large multi-core systems?
  3. Are there Postgres or other app.yml settings we should tweak to avoid DB pool errors when raising Sidekiq concurrency?
  4. Any best practices for clearing huge Sidekiq backlogs quickly and safely after imports?

Sidekiq stats/screenshots available if helpful!

The answer to all of those questions is, more or less, cranking up DISCOURSE_SIDEKIQ_WORKERS.

I would crank that up to maybe 32 since you know you have a lot of spare CPU available. If you still have lots of CPU available after that’s been running for a while, feel free to crank it up more.

You could probably drop that back down to, say, 8 or 12 for normal operation.

Make sure Postgres has a high enough max_connections to cover the extra Sidekiq threads. You’ve probably already bumped it up since you’re running multisite, but keep an eye on it.


Thanks @supermathie, it’s working now.
I updated the config to the following:

  UNICORN_WORKERS: 8
  UNICORN_SIDEKIQS: 7
  DISCOURSE_SIDEKIQ_WORKERS: 10
  DISCOURSE_DB_POOL: 20
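
For anyone sizing similar settings, here is a rough sketch of the arithmetic behind the values above. It assumes UNICORN_SIDEKIQS is the number of Sidekiq processes and DISCOURSE_SIDEKIQ_WORKERS is the thread count per process, and that each thread may need its own database connection from that process's pool. The function name is hypothetical and this is not part of Discourse; actual connection usage on multisite can differ, so treat it only as a sanity check.

```python
def sidekiq_capacity(sidekiq_processes, threads_per_process, db_pool):
    """Return (total Sidekiq threads, whether the per-process pool covers its threads).

    Assumption: each Sidekiq thread may hold one DB connection at a time,
    so the per-process pool should be at least threads_per_process.
    """
    total_threads = sidekiq_processes * threads_per_process
    pool_covers_threads = db_pool >= threads_per_process
    return total_threads, pool_covers_threads


# Values from the config posted above:
# UNICORN_SIDEKIQS=7, DISCOURSE_SIDEKIQ_WORKERS=10, DISCOURSE_DB_POOL=20
total, ok = sidekiq_capacity(7, 10, 20)
print(total, ok)  # 70 True
```

Under those assumptions, the config above runs 70 Sidekiq threads in total, and a pool of 20 leaves headroom over the 10 threads in each process. Postgres max_connections still has to cover all processes combined.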

And increased the VM to 8 vCPU / 16GB memory.