Jobs are stuck in Sidekiq after a restart

sam · 12 April 2016 at 12:57

I meant to post this to @mperham on Twitter, but it does not really fit in a tweet. Quite a few times we have noticed a “stuck” Sidekiq: all the jobs in the queue appear to be “stuck” in a running state forever.

I managed to isolate the cause today and would like some tips on how to correct it. It appears that the “identity” of a job’s owner is determined by hostname:PID. If you hang up a processor, Sidekiq “detects” that the hostname:PID is no longer around about 20-30 seconds after it is terminated. When that happens, it clears all the jobs it thought that hostname:PID owned.

However, in a Docker world we can do very quick restarts that very often keep the same hostname:PID, so the new process is up before Sidekiq detects that the old job processor died. This sequence leaves us with jobs that appear to be running forever.
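To make the sequence concrete, here is a toy model of it. This is not Sidekiq's actual code: `Registry`, `heartbeat`, `reap`, the 30-second timeout, and the 5-second heartbeat interval are all assumptions chosen for illustration, with time passed in as plain integers so nothing needs to sleep.

```ruby
# Toy model of the liveness check described above (hypothetical, not Sidekiq's code).
class Registry
  TIMEOUT = 30 # assumed: seconds of silence before an identity counts as dead

  def initialize
    @last_seen = {}                          # identity => time of last heartbeat
    @running = Hash.new { |h, k| h[k] = [] } # identity => jobs marked as running
  end

  def heartbeat(identity, now)
    @last_seen[identity] = now
  end

  def start_job(identity, job)
    @running[identity] << job
  end

  # Clear the jobs of every identity that has been silent for too long.
  def reap(now)
    dead = @last_seen.select { |_, seen| now - seen > TIMEOUT }.keys
    dead.each do |identity|
      @last_seen.delete(identity)
      @running.delete(identity)
    end
  end

  def running_jobs
    @running.values.flatten
  end
end

# Returns the jobs still marked as running 120 seconds after the old process died.
def simulate_restart(restart_delay)
  registry = Registry.new
  identity = "app:1" # same hostname:PID before and after the container restart

  registry.heartbeat(identity, 0)
  registry.start_job(identity, "job-a") # the old process dies right after this

  (1..120).each do |t|
    registry.reap(t)
    # The replacement process heartbeats every 5 seconds once it is up.
    up = t >= restart_delay && ((t - restart_delay) % 5).zero?
    registry.heartbeat(identity, t) if up
  end
  registry.running_jobs
end
```

In this model, `simulate_restart(2)` leaves `job-a` marked as running because the identity never goes silent long enough to be reaped, while `simulate_restart(60)` lets the reaper clear it before the new process appears.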

This issue has popped up many times, both for many customers and internally, but we never really pinned down the reason until now. I guess my questions are:

Can we amend Sidekiq so it uses an additional piece of data for job processor identity, e.g. hostname:PID:guid?
Can we improve our shutdown sequence to cleanly clear the currently running jobs?
Any other ideas? (A timeout can work here, but its effect would be very delayed.)
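As a rough sketch of the first idea, the identity could carry a random component generated once per process start. `process_identity` is a hypothetical helper name, not an existing Sidekiq method, and the 12-character hex suffix is an arbitrary choice:

```ruby
require "securerandom"
require "socket"

# Hypothetical helper: build an identity that differs on every process start,
# even when the hostname and PID are reused after a fast container restart.
def process_identity(hostname: Socket.gethostname, pid: Process.pid)
  "#{hostname}:#{pid}:#{SecureRandom.hex(6)}"
end

# Computed once at boot and reused for the lifetime of the process.
IDENTITY = process_identity
```

With this shape, a restarted process registers under a new identity, so the old one stops heartbeating and can be detected as dead even though hostname:PID is unchanged.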

Note: we cannot use any of the Sidekiq Pro or Enterprise features, as we need the fix to apply to all Discourse open source users.

sam · 12 April 2016 at 13:12

Per @mperham’s request

https://twitter.com/mperham/status/719872587640123393

https://github.com/mperham/sidekiq/issues/2920
