Email Hostname Certificate Mismatch Causing sidekiq Queue Overload, Severe Site Instability

Geoffrey_Challen · May 25, 2022, 2:39pm

Nice job @RGJ!

While we wait for a fix, on a side note, it would be good if this problem didn’t cause the cascade of issues that I experienced, which nearly brought my forum down completely. Specifically:

- The email failures seem to be retried extremely quickly, which causes the sidekiq queue to explode in size and drives CPU usage to ~100% from these tasks alone.
- In addition, something (either crashes or restarts) was causing Redis to write enormous tmp files, which I assume contained the state of the sidekiq queue. While these were safe to remove, they quickly filled the disk, which caused more crashes, and so on. I had some other disk space that I was able to free so that I could restart the forum and figure out what was going on, but this may not be true for everyone. (It’s also somewhat hard to confirm that, in this case, the Redis tmp files are in fact safe to delete.)
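For anyone hitting the same disk-space problem, here is a rough, read-only sketch for finding oversized temp files before deciding what to remove. It assumes the files follow Redis's `temp-<pid>.rdb` naming for in-progress snapshots; the function name, the default pattern, and the size threshold are illustrative choices of mine, not anything Discourse ships, and the Redis data directory depends on your install.

```python
from pathlib import Path


def large_temp_files(directory, pattern="temp-*.rdb", min_bytes=100 * 1024 * 1024):
    """Return (path, size_in_bytes) pairs for matching files, largest first.

    Read-only: nothing is deleted. The default pattern is an assumption
    about how the temp files are named; adjust it to what you see on disk.
    """
    found = []
    for path in Path(directory).glob(pattern):
        if path.is_file():
            size = path.stat().st_size
            if size >= min_bytes:
                found.append((path, size))
    # Largest files first, since those are the ones filling the disk.
    return sorted(found, key=lambda item: item[1], reverse=True)
```

Listing first and deleting by hand seems safer than an automatic cleanup, given how hard it is to confirm that a particular file is no longer in use.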

My guess is that the simplest solution here is to slow down the retry on failed email jobs, or at least on the ones that aren’t time-sensitive the way password resets are. That seems appropriate given that email problems are unlikely to resolve quickly, and most, if not all, mailers will do their own retries once they receive a message.
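To make the suggestion concrete, here is a minimal sketch of the kind of retry schedule I have in mind: capped exponential backoff with jitter, and a faster schedule for time-sensitive mail. This is not how Discourse or sidekiq computes retry delays today; the job kind names, base delays, and caps are all hypothetical values picked for illustration.

```python
import random

# Hypothetical job kinds; these are not Discourse's actual job names.
TIME_SENSITIVE = {"password_reset"}


def retry_delay_seconds(attempt, kind, rng=random):
    """Seconds to wait before retry number `attempt` (1-based) of a failed email job."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    if kind in TIME_SENSITIVE:
        base, cap = 5.0, 300.0           # retry quickly, give up growing at 5 minutes
    else:
        base, cap = 60.0, 6 * 3600.0     # start at 1 minute, cap at 6 hours
    # Double the delay on every attempt until the cap is reached.
    delay = min(cap, base * 2 ** (attempt - 1))
    # Jitter: pick a point in the upper half of the window so that many
    # failed jobs do not all come back at the same instant.
    return delay * (0.5 + 0.5 * rng.random())
```

With numbers like these, a bulk notification that keeps failing would be retried only a handful of times per day rather than continuously, while a password reset would still be retried within seconds.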
