Jobs::UserEmail jobs keep piling up and exhausting memory

(David Cheng) #1

Hi Discourse support,

Our Discourse installation has recently run into an issue where Jobs::UserEmail jobs are enqueued far faster than they are processed. The jobs are enqueued every 30 minutes, when the routine digest scan runs. The queue grows so quickly that we can easily end up with millions of queued jobs within hours, which eventually crashes Redis and halts the entire system. Our system only has a few hundred thousand users, so apparently new jobs for the same user are being enqueued before the previous ones are processed.

We are currently running Discourse 1.7 stable, which we just upgraded to from beta. We did not have this issue before the upgrade, and we are somewhat lost as to how to get it addressed. The only way we have found to stabilize the system is to turn off the weekly digest by setting the day to 0. However, we need this feature to work for our marketing requirements.

Could you help?

Thanks, David

(Matt McNeil) #2

I am having the same issue since upgrading to 1.7.3. Here is a graph of my Sidekiq queue. For months it had been stable at around 1000 jobs per day, then it spiked to 20 million and is wreaking all kinds of havoc. I am running on a 2GB Digital Ocean droplet with web and data Docker containers. Any idea, @sam, what could be going on? Thanks!

(Sam Saffron) #3

Maybe a rebake is going on, I don’t know. What kind of jobs are running?

(Matt McNeil) #4

@sam: They all seem to be Jobs::UserEmail:

Similar to @David_Cheng, my instance had been running smoothly for months and this issue started right after the 1.6.x => 1.7.x upgrade. Also, our system only has around 6k users, so is it possible that 20M jobs could be generated in one day’s normal operation?

Any tips on which logs or other places I could look to troubleshoot? Thanks!
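
One way to test the suspicion raised earlier in the thread (several digest jobs queued for the same user) is to tally a sample of the queued job payloads by class and by user. The sketch below is a rough illustration, not a Discourse tool. It assumes each payload is a Sidekiq-style JSON string with a `class` key and an `args` list whose first element is a hash containing `type` and `user_id`; that shape for Jobs::UserEmail is an assumption to verify against your own queue. The function name `summarize_jobs` and the sample data are hypothetical.

```python
import json
from collections import Counter


def summarize_jobs(payloads):
    """Tally queued jobs by class and find users with more than one digest job.

    `payloads` is any iterable of JSON strings. The assumed shape is
    {"class": ..., "args": [{"type": ..., "user_id": ...}]}.
    """
    by_class = Counter()
    digests_per_user = Counter()
    for raw in payloads:
        job = json.loads(raw)
        job_class = job.get("class", "unknown")
        by_class[job_class] += 1
        # Fall back to an empty hash when args is missing, empty, or not a hash.
        args = job.get("args") or [{}]
        first = args[0] if isinstance(args[0], dict) else {}
        if job_class == "Jobs::UserEmail" and first.get("type") == "digest":
            digests_per_user[first.get("user_id")] += 1
    # Users with more than one queued digest job point to duplicate enqueuing.
    duplicates = {user: n for user, n in digests_per_user.items() if n > 1}
    return by_class, duplicates


# Hypothetical sample: two digest jobs for user 1, one for user 2, one other job.
sample = [
    json.dumps({"class": "Jobs::UserEmail", "args": [{"type": "digest", "user_id": 1}]}),
    json.dumps({"class": "Jobs::UserEmail", "args": [{"type": "digest", "user_id": 1}]}),
    json.dumps({"class": "Jobs::UserEmail", "args": [{"type": "digest", "user_id": 2}]}),
    json.dumps({"class": "Jobs::ProcessPost", "args": [{"post_id": 10}]}),
]
by_class, duplicates = summarize_jobs(sample)
print(by_class.most_common())
print(duplicates)
```

Sidekiq keeps each queue as a Redis list under a key of the form `queue:<name>`, so a sample of real payloads could be read with a bounded `LRANGE` (for example the first few thousand entries) instead of pulling millions of jobs at once. If `duplicates` is large relative to the sample, the same users are being re-enqueued before their earlier jobs run.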