Running Sidekiq more efficiently and reliably

(Sam Saffron) #1

I just checked in some work that:

  • Adds back a heartbeat check that ensures Sidekiq (and our scheduler) are running correctly
  • Allows us to fork Sidekiq from the Unicorn master, reducing memory usage.
  • Introduces monitoring of Sidekiq child processes
  • Upgrades Sidekiq to version 3.0 (which allows us to remove a hack we had after forking and adds the dead queue)

Recently, I have seen a couple of occurrences of Sidekiq stalling. The process kept running but stopped processing jobs. This is severely problematic for our setups, as many things on the site stop working if the job queue is not functioning properly.

To get all of this going I started by finalizing my demon manager. One particularly painful issue I hit was this bug in Ruby.

In effect this forced me to monkey patch Unicorn, because I needed to run stuff from the master thread.

Once my demon manager was working I handled the upgrade to Sidekiq. Sidekiq 3 was an easy upgrade; the one caveat was ensuring we clean up our Redis pool after fork. To take care of this we have Discourse.after_fork:

    def self.after_fork
      # reconnect to the database for the current site
      current_db = RailsMultisite::ConnectionManagement.current_db
      RailsMultisite::ConnectionManagement.establish_connection(db: current_db)
      # shuts down all connections in the pool (probably can be skipped)
      Sidekiq.redis_pool.shutdown { |c| nil }
      # re-establish
      Sidekiq.redis = sidekiq_redis_config
    end
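
To illustrate why connections need resetting after a fork: a forked child inherits the parent's open sockets, so parent and child would otherwise talk to Redis over the same connection. The sketch below is not Discourse or Sidekiq code. It uses a different technique from the explicit hook above (a lazy PID check), and `ForkAwareConnection` is a hypothetical name chosen for illustration.

```ruby
# Hypothetical illustration: rebuild a connection whenever the current
# process is not the one that created it (i.e. we are in a forked child).
class ForkAwareConnection
  def initialize(&connect)
    @connect = connect   # block that builds a fresh connection
    @owner_pid = nil
    @conn = nil
  end

  def connection
    if @owner_pid != Process.pid
      # Either the first use, or we are running in a forked child.
      @conn = @connect.call
      @owner_pid = Process.pid
    end
    @conn
  end
end
```

An explicit after-fork hook does the same job eagerly and in one place, which is easier to reason about when several subsystems (database, Redis, cache) all need to reconnect.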

The demon manager takes care of monitoring forked processes. It has a suicide thread that makes sure the child shuts itself down if you kill the parent process. Effectively this means that you can kill -9 the parent and still have the child Sidekiq go away. Similarly, the master process monitors the child process; if the child goes away, the master respawns it.
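
A minimal sketch of the two-way monitoring described above, assuming a Unix system with `fork`. This is not the actual demon manager code; `ChildSupervisor` and its method names are hypothetical, and the one-second polling interval is an arbitrary choice.

```ruby
# Hypothetical sketch: the child watches its parent, the parent watches the child.
class ChildSupervisor
  attr_reader :pid

  def initialize(&work)
    @work = work
    @pid = nil
  end

  # Fork a child that runs the work block and exits if its parent disappears.
  def start
    parent_pid = Process.pid
    @pid = fork do
      # Exit immediately on TERM, skipping at_exit hooks inherited from the parent.
      Signal.trap("TERM") { exit!(0) }
      watch_parent(parent_pid)
      @work.call
      exit!(0)
    end
  end

  # True while the child exists and has not exited.
  def alive?
    return false unless @pid
    # With WNOHANG, waitpid returns nil while the child is still running.
    Process.waitpid(@pid, Process::WNOHANG).nil?
  rescue Errno::ECHILD
    false
  end

  # Respawn the child if it has gone away; meant to be called periodically by the master.
  def ensure_running
    start unless alive?
  end

  def stop
    return unless @pid
    Process.kill("TERM", @pid)
    Process.waitpid(@pid)
  rescue Errno::ESRCH, Errno::ECHILD
    nil
  ensure
    @pid = nil
  end

  private

  # Runs inside the child: poll the parent PID and exit once it changes.
  def watch_parent(parent_pid)
    Thread.new do
      loop do
        # When the parent dies (even via kill -9) the child is re-parented,
        # so Process.ppid no longer matches.
        exit!(0) unless Process.ppid == parent_pid
        sleep 1
      end
    end
  end
end
```

In the master, something like `supervisor.ensure_running` would be called from a periodic loop; polling the parent PID from the child is what lets the child go away even when the parent had no chance to clean up.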

On top of this, once an hour we check that our heartbeat job ran. If it has not run, we restart Sidekiq. The heartbeat job runs in our scheduler and spawns a Sidekiq job that updates Redis, so you get two checks in one.

I am now running the new system on meta and my blog. I plan to amend the Docker template soon to take advantage of this new, more robust system.
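
The heartbeat check can be sketched roughly as follows. This is an illustration under stated assumptions, not the code that was checked in: `HeartbeatMonitor`, the key name, and the `restart:` callback are hypothetical, and a plain Hash stands in for Redis.

```ruby
# Hypothetical sketch of an hourly heartbeat check.
class HeartbeatMonitor
  KEY = "heartbeat:last_run"   # hypothetical key name
  MAX_AGE = 60 * 60            # one hour, in seconds

  # store:   anything that responds to [] and []= (a Hash here, Redis in practice)
  # restart: callable invoked when the heartbeat is stale
  def initialize(store, restart:)
    @store = store
    @restart = restart
  end

  # Called by the queued job. It only runs if the scheduler enqueued it
  # and a worker picked it up, so one timestamp covers both.
  def beat(now = Time.now)
    @store[KEY] = now.to_i
  end

  # A missing timestamp is treated as stale in this sketch.
  def stale?(now = Time.now)
    last = @store[KEY]
    last.nil? || now.to_i - last.to_i > MAX_AGE
  end

  # Called periodically by the master; returns true if a restart was triggered.
  def check(now = Time.now)
    return false unless stale?(now)
    @restart.call
    true
  end
end
```

For example, `HeartbeatMonitor.new({}, restart: -> { puts "restarting" })` would trigger a restart on the first `check`, because no heartbeat has been recorded yet; a real setup would likely want a grace period after boot.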

If you are super eager to play with this, set the UNICORN_SIDEKIQS env var to 1 AND nuke /etc/service/sidekiq. Once this is battle tested for a few days I will fix up the base templates and avoid creating the sidekiq service.

Besides the increased reliability we now have a modest memory saving:

Prior to forking

179MB [PSS] actual memory impact

Post forking

107MB [PSS]

Savings: 71MB (39%)

I used smem for these calculations. Our savings are more pronounced because we use jemalloc.

Total PSS for my running version of Discourse is 670MB, free is reporting 1.1GB free on my 2GB instance.

Medium term, I plan to amend docker manager so it winds the box down far more than it does today before running upgrades (on low-RAM setups it can easily stop 2 unicorns and the Sidekiq before messing with asset precompilation and such).

I also plan to experiment with an all-in-one Puma-based setup for ultra-low-memory conditions (clearly performance will suffer a bit, since there is no out-of-band GC and the GIL gets in the way).

I hope you enjoy the new robustness and reduced memory use.

I am totally open to extracting this sub-system out of Discourse into some sort of gem if someone wants to work on this, but be warned the scheduler needs to be extracted first.

(Jeff Atwood) #2

All of this is gold! How much less memory are we talking here? This will help the 1 GB install be even more viable.

(Sam Saffron) #3

Sorry, formatting bug, corrected.

Biggest help will be winding down memory use prior to upgrade. That is the biggest pain point. Saving 7% of total memory on your box is a decent saving (on top of the overall increase in system reliability).

(Erick Guan) #4

I definitely vote for these two gems. Learned a lot from it.

(Michael) #5

Sorry to interrupt here.
But can you link to a howto on setting up Discourse (preferably without Docker) to use a Unicorn that forks Sidekiq, or give me a hint on how to do that? Currently I’m using thins, and as far as I understand, Sidekiq does not get forked there. Each thin uses 162 MB and Sidekiq uses 142 MB. I’d love to reduce that (mainly just for the fun of using a very optimized setup :smile:).