Email failed jobs

Hello!

Discourse build: 3.5.0.beta2-dev (176ee0bf60)
Hosted on: VPS - Centminmod (131.00stable) on AlmaLinux 8
Issue: Email failing periodically

I have two vHosts on this VPS: one with XenForo, one with Discourse.

My XenForo host happily sends emails 24/7 without issue. Discourse, though, seems to fail roughly every 24 hours with “There are [number that increases] email jobs that failed. Check your app.yml and ensure that the mail server settings are correct. See the failed jobs in Sidekiq.”

I can “solve” the problem temporarily by restarting the Docker service, after which mail flow resumes.
I am sure the mail settings are correct. Once the Docker service is restarted, I can visit admin → email → server setup & logs → settings and fire off a test email.

Once it fails I cannot.

I am seeing lots of “Sidekiq is consuming too much memory (using 5xxM) for Fastserver-app, restarting” warnings:

activesupport-7.2.2.1/lib/active_support/broadcast_logger.rb:130:in `block in warn' 
activesupport-7.2.2.1/lib/active_support/broadcast_logger.rb:231:in `block in dispatch' 
activesupport-7.2.2.1/lib/active_support/broadcast_logger.rb:231:in `each' 
activesupport-7.2.2.1/lib/active_support/broadcast_logger.rb:231:in `dispatch' 
activesupport-7.2.2.1/lib/active_support/broadcast_logger.rb:130:in `warn' 
/var/www/discourse/lib/demon/sidekiq.rb:55:in `block in rss_memory_check' 
/var/www/discourse/lib/demon/sidekiq.rb:49:in `each' 
/var/www/discourse/lib/demon/sidekiq.rb:49:in `rss_memory_check' 
config/unicorn.conf.rb:132:in `block (2 levels) in reload'

I can also see “Job exception: no address for meta.discourse.org (ResolvError)”:

excon-1.2.4/lib/excon/socket.rb:191:in `connect' 
excon-1.2.4/lib/excon/ssl_socket.rb:194:in `connect' 
excon-1.2.4/lib/excon/socket.rb:60:in `initialize' 
excon-1.2.4/lib/excon/ssl_socket.rb:10:in `initialize' 
excon-1.2.4/lib/excon/connection.rb:487:in `new' 
excon-1.2.4/lib/excon/connection.rb:487:in `socket' 
excon-1.2.4/lib/excon/connection.rb:120:in `request_call' 
excon-1.2.4/lib/excon/middlewares/mock.rb:57:in `request_call' 
excon-1.2.4/lib/excon/middlewares/instrumentor.rb:34:in `request_call' 
excon-1.2.4/lib/excon/middlewares/idempotent.rb:19:in `request_call' 
excon-1.2.4/lib/excon/middlewares/base.rb:22:in `request_call' 
excon-1.2.4/lib/excon/middlewares/decompress.rb:14:in `request_call' 
excon-1.2.4/lib/excon/middlewares/base.rb:22:in `request_call' 
excon-1.2.4/lib/excon/connection.rb:293:in `request' 
/var/www/discourse/lib/discourse_updates.rb:136:in `new_features_payload' 
/var/www/discourse/app/jobs/scheduled/check_new_features.rb:24:in `execute' 
/var/www/discourse/app/jobs/base.rb:316:in `block (2 levels) in perform' 
rails_multisite-6.1.0/lib/rails_multisite/connection_management/null_instance.rb:49:in `with_connection'
rails_multisite-6.1.0/lib/rails_multisite/connection_management.rb:21:in `with_connection'
/var/www/discourse/app/jobs/base.rb:303:in `block in perform' 
/var/www/discourse/app/jobs/base.rb:299:in `each' 
/var/www/discourse/app/jobs/base.rb:299:in `perform' 
/var/www/discourse/app/jobs/base.rb:379:in `perform' 
mini_scheduler-0.18.0/lib/mini_scheduler/manager.rb:137:in `process_queue' 
mini_scheduler-0.18.0/lib/mini_scheduler/manager.rb:77:in `worker_loop' 
mini_scheduler-0.18.0/lib/mini_scheduler/manager.rb:63:in `block (2 levels) in ensure_worker_threads' 

I haven't changed much in the Docker configuration on this server for some time. I have updated the kernel, PHP, and other services that run outside the container.

The issue has become more frequent since I updated the Discourse build. It was stable before that.

I have 8.8.8.8 and 8.8.4.4 set as the DNS servers.

Any pointers would be appreciated!

Discourse automatically restarts Sidekiq when its memory usage exceeds a defined threshold, and those restarts may interrupt queued email jobs.

To address this, check the UNICORN_SIDEKIQ_MAX_RSS setting in your app.yml file. If the value is too low, consider increasing it.

For further discussion on this issue, you can refer to this topic:
Sidekiq is consuming too much memory - restarting.

I'll adjust that setting now and revert if I carry on getting these issues.

Urgh, just over 24h later and I have failed email jobs again:

Jobs::HandledExceptionWrapper: Wrapped Net::OpenTimeout: execution expired

Ensure the SMTP server is reachable from your Discourse instance, substituting the address and port from your app.yml:
telnet DISCOURSE_SMTP_ADDRESS DISCOURSE_SMTP_PORT
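
Because the failures here are intermittent, a single manual check can succeed while the jobs still fail later. The two errors reported in this topic, ResolvError and Net::OpenTimeout, point at name resolution and at opening the connection, so it may help to log both steps on a schedule. Below is a minimal sketch in Python, assuming Python 3 is available wherever you run it. The names check_smtp and log_line are made up for illustration and are not part of Discourse. It only tests DNS and the TCP connect, not SMTP login or TLS.

```python
import socket
import time


def check_smtp(host, port, timeout=10.0):
    """Resolve host, then try to open a TCP connection to host:port.

    Returns (ok, detail); detail names the step that failed.
    """
    try:
        # DNS step: a failure here is the same class of problem as ResolvError.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return False, f"dns failed: {exc}"

    last_error = None
    for family, socktype, proto, _canon, sockaddr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                # Connect step: a timeout here resembles Net::OpenTimeout.
                sock.settimeout(timeout)
                start = time.monotonic()
                sock.connect(sockaddr)
                elapsed = time.monotonic() - start
                return True, f"connected to {sockaddr[0]} in {elapsed:.2f}s"
        except OSError as exc:
            # Covers timeouts, refused connections and unreachable networks.
            last_error = exc
    return False, f"connect failed: {last_error}"


def log_line(host, port, timeout=10.0):
    """Format one timestamped result line suitable for appending to a log."""
    ok, detail = check_smtp(host, port, timeout)
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    status = "OK" if ok else "FAIL"
    return f"{stamp} {host}:{port} {status} {detail}"
```

Printing log_line(...) with your own SMTP address and port every few minutes, ideally from inside the container since it has its own network and DNS setup, should show whether the failing step is DNS or the connection when the next batch of jobs fails.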

I’ll await the failure again and retry.

I have a non-Docker XenForo install on the same VPS and it doesn’t complain.

Will report back. I appreciate your guidance so far.

I can get to the SMTP server.

Had a couple of failures in quick succession, then nothing for 8 hours or so now.
