3.1.0.beta5 - [09feb03056]
default_watch_categories was set from default (which is nothing iirc) to a single category, probably for the first time ever.
This triggered a massive volume of emails. Over a period of 3 days almost 400K email retries may have been processed, over 09% failing. Very early on this triggered rate limiting on the SMTP provider's side (which saved a massive unintended email bill!).
The default_watch_categories setting was then reset, as it had been set inadvertently.
However, Sidekiq was still showing a huge volume of email retries: because the SMTP provider's rate limits were still in effect, it kept trying and failing.
On closer inspection, it looked as though it was now just one post continually triggering the same email over and over, in the hundreds and then thousands of retries, even though default_watch_categories had been reset to default. What was going on?
Once that post and the others below it were split into their own “new” topic for normal moderation reasons, the email trigger and retries stopped dead as a bonus. No more retries.
Thank heavens for SMTP providers' rate limiting!
As an extra insight: this only got detected when it failed, due to the SMTP provider's rate limiters. A clearer readout in the dashboard of email send activity for the last 7 days, 24 hours, and hour, with alerts for any spikes, may be welcome.
This kind of thing could easily get you into a lot of trouble financially.
This could have chewed up our year's hosting costs in a couple of days had the provider's rate limits not kicked in!
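To make the spike alert idea a bit more concrete, here is a rough sketch of the kind of check I have in mind. It is not an existing Discourse feature; the function name `is_spike`, the `factor` and `min_baseline` parameters, and the sample numbers are all made up for illustration. It simply compares the latest hour's send count against the average of the earlier hours.

```python
from statistics import mean


def is_spike(hourly_counts, factor=5.0, min_baseline=10.0):
    """Return True if the latest hour exceeds `factor` times the earlier average.

    hourly_counts: emails sent per hour, oldest first (hypothetical input).
    min_baseline: floor for the average so a quiet site does not alert on
    a handful of emails.
    """
    if len(hourly_counts) < 2:
        # Not enough history to compare against.
        return False
    *history, latest = hourly_counts
    baseline = max(mean(history), min_baseline)
    return latest > factor * baseline


# Made-up numbers: three normal hours, then a runaway hour.
print(is_spike([40, 55, 38, 5200]))
print(is_spike([40, 55, 38, 60]))
```

A check like this would have flagged the problem in the first hour instead of relying on the provider's rate limits to expose it.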
Could you please check this site setting on your instance for me?
This is a default setting for users. Users can override it in their profile:
What I think might happen:
- You have the "only when away" default setting;
- A topic was created in the watched category. The system wanted to notify all/only away users;
- A bunch of Sidekiq jobs were enqueued, and they kept failing because of the rate limits;
- The default watched category was removed, but this does not remove already enqueued Sidekiq jobs;
- Sidekiq retries failed jobs;
- When the post was moved to a new topic, previously wrongly created notifications were deleted;
- When notifications were deleted, retrying jobs were completed successfully without sending emails.
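The sequence above can be sketched as a toy model. This is not Discourse or Sidekiq code, only a simplified simulation: `run_queue`, `max_passes`, and the notification ids are hypothetical names made up for illustration. Note that the watched-category setting is not an input to the queue at all, which is why resetting it changed nothing for jobs that were already enqueued.

```python
from collections import deque


def run_queue(jobs, notifications, rate_limited, max_passes=5):
    """Process email jobs; return (sent, skipped, still_retrying).

    jobs: notification ids, one per enqueued email job.
    notifications: set of notification ids that still exist.
    rate_limited: True while the provider rejects every send.
    """
    queue = deque(jobs)
    sent, skipped = [], []
    for _ in range(max_passes):
        retry = deque()
        while queue:
            notification_id = queue.popleft()
            if notification_id not in notifications:
                # Notification was deleted: the job completes without sending.
                skipped.append(notification_id)
            elif rate_limited:
                # Provider rejects the send: the job fails and is retried.
                retry.append(notification_id)
            else:
                sent.append(notification_id)
        queue = retry
        if not queue:
            break
    return sent, skipped, list(queue)


notifications = {1, 2, 3}

# While rate limited, every job keeps failing and stays in the retry set.
sent, skipped, retrying = run_queue([1, 2, 3], notifications, rate_limited=True)
print(sent, skipped, retrying)

# Moving the post deletes the notifications; the retries now finish quietly.
notifications.clear()
sent, skipped, retrying = run_queue(retrying, notifications, rate_limited=True)
print(sent, skipped, retrying)
```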
default_email_level : only when away
default_email_message_level : never