How to periodically check if Sidekiq is paused and unpause it if so?


(Florian Schmaus) #1

We are running into the problem described in “Sidekiq is being paused, how can I discover why?” with our Discourse ‘stable’ (2.1.1) installation. It appears that it will take a while until the fix reaches Discourse’s stable branch. Until then, what would be the preferred way to set up periodic monitoring of Sidekiq and unpause the scheduler if it is paused? I am thinking about a cron job that does

./launcher enter app
rails c
Sidekiq.paused?
Sidekiq.unpause!

every 15 minutes or so.


(Jay Pfaffman) #2

That’s treating the symptom rather than finding a cure. Do you have enough ram?


(Gerhard Schlager) #3

So, why is Sidekiq being paused for you? Are you using S3 for backups? If so, take a look at “S3 Backup ... suspect access issue” for a possible solution.

Or, you could switch to the beta branch, which should make sure that Sidekiq doesn’t stay paused when there’s an error during the backup.


(Florian Schmaus) #4

As I wrote, I think there is a cure; it will probably just take a while until it hits Discourse’s stable branch. I want a simple workaround until then.

Dunno, we have 4 GiB RAM.


(Cintiadr) #5

My Sidekiq was only being paused after an attempt to delete old backups. In all fairness, that happened only the first time after a restart.

The cause was simply lack of permissions to delete those backups from S3.

Your problem seems pretty different. I’d recommend taking a look at the logs.


(Florian Schmaus) #6

I’ve now set up a cron job which runs the following command every 15 minutes:

docker exec app /bin/bash -c "echo Sidekiq.unpause! | rails c"

Let’s see if this achieves the desired effect.


(Gerhard Schlager) #7

That’s good to know, but I still don’t get why you are doing this. Do you know why your Sidekiq process is paused?

Does it happen during backups, when it fails to clean up the S3 bucket? If so, there are multiple options to fix this:

  1. Upgrade to the beta branch.
  2. Fix the policy as described in “S3 Backup ... suspect access issue”
  3. Disable S3 cleanup via the s3_disable_cleanup site setting

If there’s another reason, then you might wait a long time for “a cure”, because I’m not aware of any other fixes in the beta branch for such an issue. In that case you might want to investigate a little more why Sidekiq is being paused.


(Florian Schmaus) #8

No. I’ve looked in the logs, but as far as I can tell, the errors and warnings I see are unrelated to backups and/or Sidekiq.

I have no indication of when this happens. It would also be great if the dashboard showed the status of Sidekiq (at least whether it is paused). Right now, I have to visit the Sidekiq status page, which is not mentioned in the dashboard either. I found out about its existence by accident.

We already did this.

That is not really a fix, it is a workaround, right?

I’m a little bit confused: Are there related commits in the beta branch that could fix this? If so, then please consider backporting them to stable.


(Jeff Atwood) #9

Not really; it usually means your setup is fragile. Update to beta if you are concerned and still having issues. Otherwise, you get what you get until the next major release, which is many months off.


(Florian Schmaus) #10

That is probably true, although I believe our setup is pretty standard, nothing really special.

I also realize that I may have missed a related log entry in the web UI. I would be happy to provide our full logs if anyone wants to have a look at them. How can I export the logs from the log web UI (/logs)?

My unpause-sidekiq script, invoked by cron, currently looks like this:

#!/usr/bin/env bash

docker exec app /bin/bash -c "echo Sidekiq.unpause! | rails c"

It could be improved so that there is only output if Sidekiq was actually paused (“Sidekiq was paused, and has been restarted”). Since the cron job runs every 15 minutes, I could then at least narrow down the time window in which this happens. Unfortunately, I lack the Ruby/Rails skills to make the required changes.
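
One possible shape for that improvement, as an untested sketch rather than a confirmed solution: let Ruby print a marker only when Sidekiq.paused? is true, and have the wrapper print a timestamped line only when it sees that marker. It assumes that "rails runner" works inside the container the same way "rails c" does in the command above. The names RUNNER, RUBY_SNIPPET, check_and_unpause and the marker string are made up for illustration.

```bash
#!/usr/bin/env bash
# Sketch: unpause Sidekiq and print a line only if it was actually paused.

# Ruby evaluated inside the app: prints a marker only when Sidekiq was paused.
RUBY_SNIPPET='if Sidekiq.paused?; Sidekiq.unpause!; puts "SIDEKIQ_WAS_PAUSED"; end'

# Command that evaluates a Ruby string in the Discourse app.
# Assumption: "rails runner" is usable in the container like "rails c" is.
RUNNER=(docker exec app rails runner)

check_and_unpause() {
  local output
  # stderr is dropped so that Rails warnings do not trigger cron mail.
  output="$("${RUNNER[@]}" "$RUBY_SNIPPET" 2>/dev/null)" || return 1
  if grep -q 'SIDEKIQ_WAS_PAUSED' <<<"$output"; then
    echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) Sidekiq was paused, and has been unpaused"
  fi
}

# Stay silent in the normal case; complain on stderr if the check itself failed.
check_and_unpause || echo "unpause-sidekiq: could not query Sidekiq" >&2
```

Because cron mails whatever a job prints, a message would then arrive only for the 15-minute window in which Sidekiq was found paused, or when the check itself could not run.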