I’m not the sys admin of the AWS EC2 instance running our Discourse instance, but I’m the admin of the Discourse instance itself. Our AWS SES email service was shut down 3 weeks ago for security reasons, and our cloud staff is only fixing it now. So for 3 weeks our site could not send emails, and I’m seeing more than 40000 failed jobs and as many retries. I’m not a web developer, so I don’t understand what the Sidekiq page is telling me, but I’m worried the failed jobs will be retried when our email server is back online, flooding people with outdated emails they didn’t get for 3 weeks. Will that be the case? Does Discourse resend emails that could not be sent because the email server was offline? If so, how can I disable that to avoid flooding people with emails from our site? Can we adjust the granularity, say, only send emails showing new activity since some given date?
Your fear is valid.
How much time do you have to fix this? One solution could be to set up and configure a mail server that accepts emails but just throws them away.
The really quick and (very) dirty way to resolve this is to use redis-cli and issue a flushdb command. That will remove all queued jobs. It will also log out all users. Then reboot your Discourse to make sure all regular jobs run again.
Logging out all users is certainly not desirable… The email server should be fixed today, but I’m not sure if our sys admins will have the flexibility to set up the email server to throw everything away.
I’m seeing “kill all” and “delete all” buttons at the bottom of the “retries” page of Sidekiq (see attached). Is that something that can help?
Purging all jobs of a certain type from the queue should do the trick.
(I would have to go back and try to dig out how to do this…)
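For anyone with console access to the server, purging by job type could look roughly like the sketch below. It is only a sketch under assumptions: the helper name `purge_jobs_of_type` is made up for illustration, and the email job class names are my best guess, so check them against the class names listed on your own Sidekiq "retries" page first. It relies on Sidekiq's job sets being enumerable and on each entry responding to `klass` and `delete`.

```ruby
# Job class names assumed to be Discourse's email jobs. This is an assumption:
# verify them against the class names shown on your Sidekiq "retries" page.
EMAIL_JOB_CLASSES = ["Jobs::UserEmail", "Jobs::CriticalUserEmail"].freeze

# Hypothetical helper: deletes every job in job_set whose class name is
# listed in `classes` and returns how many jobs were deleted.
# job_set can be any enumerable whose items respond to #klass and #delete,
# for example Sidekiq::RetrySet.new or Sidekiq::Queue.new("default").
def purge_jobs_of_type(job_set, classes = EMAIL_JOB_CLASSES)
  # Collect the matching jobs first, then delete them, so the set is not
  # modified while it is still being scanned.
  doomed = job_set.select { |job| classes.include?(job.klass) }
  doomed.each(&:delete)
  doomed.size
end
```

From a Rails console you would then, after `require "sidekiq/api"`, call something like `purge_jobs_of_type(Sidekiq::RetrySet.new)`. Jobs of every other type stay in place, and nobody gets logged out.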
I think you can be sure they won’t. They took three weeks to fix it at all.
You could ask if they could Google how to purge jobs from Sidekiq and delete the mail jobs. I think that’s your best bet.
I’m guessing you don’t have access to do it yourself or hire anyone to help. Can you SSH into the EC2 instance that it’s running on? Otherwise, you could try to delete all 50k from the web interface.
The Sidekiq page with the kill/delete options worked. No EC2 sys admin was needed; being the forum’s admin was enough to operate from the Sidekiq page, and I could delete all queued emails.
After the email server was back online, no “queued” email was resent.