My self-hosted forum (https://intfiction.org/) has become very slow over the past couple of weeks, and it frequently returns 502 and 503 errors.
I’m on the stable branch, version 2026.1.5.
Searching for those errors here hasn’t uncovered any likely causes. What I’ve checked so far:
- the forum builds and launches fine
- the server seems to have enough capacity (13.5GB disk space free, 1.6/1.9G RAM, 1.7/2.2G swap)
In htop, though, the load average does look high: 1.67, 1.55, 3.12. This process sometimes uses over 40% CPU: unicorn worker[0] -E production -c config/unicorn.conf.rb.
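To put those load numbers in context, they need to be compared against the number of CPU cores. Here is a rough sketch of that check, assuming Python is available on the host. The `load_per_core` helper is just a name I made up for illustration, and I haven't stated the core count above, so it is read from the machine at run time.

```python
import os


def load_per_core(load_avgs, cores):
    """Divide each load average (1, 5 and 15 minute) by the number of cores.

    A sustained value above roughly 1.0 per core suggests that processes
    are queuing for CPU (or, on Linux, waiting on disk I/O).
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return tuple(avg / cores for avg in load_avgs)


if __name__ == "__main__":
    # os.cpu_count() can return None, so fall back to 1 in that case.
    cores = os.cpu_count() or 1
    one, five, fifteen = load_per_core(os.getloadavg(), cores)
    print(f"cores={cores} per-core load: 1m={one:.2f} 5m={five:.2f} 15m={fifteen:.2f}")
```

The three figures are the 1, 5 and 15 minute averages, so the 3.12 at the end means the load was higher earlier than it was when I looked.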
In the forum logs I’m seeing a lot of errors like these:
Job exception: execution expired
net-smtp-0.5.1/lib/net/smtp.rb:663:in 'TCPSocket#initialize'
net-smtp-0.5.1/lib/net/smtp.rb:663:in 'IO.open'
net-smtp-0.5.1/lib/net/smtp.rb:663:in 'Net::SMTP#tcp_socket'
net-smtp-0.5.1/lib/net/smtp.rb:672:in 'block in Net::SMTP#do_start'
timeout-0.5.0/lib/timeout.rb:222:in 'block in Timeout.timeout'
timeout-0.5.0/lib/timeout.rb:229:in 'Timeout.timeout'
net-smtp-0.5.1/lib/net/smtp.rb:671:in 'Net::SMTP#do_start'
net-smtp-0.5.1/lib/net/smtp.rb:642:in 'Net::SMTP#start'
mail-2.9.0/lib/mail/network/delivery_methods/smtp.rb:154:in 'Mail::SMTP#start_smtp_session'
mail-2.9.0/lib/mail/network/delivery_methods/smtp.rb:108:in 'Mail::SMTP#deliver!'
mail-2.9.0/lib/mail/message.rb:269:in 'Mail::Message#deliver!'
/usr/local/lib/ruby/3.4.0/delegate.rb:87:in 'Delegator#method_missing'
/var/www/discourse/lib/email/sender.rb:296:in 'Email::Sender#send'
/var/www/discourse/lib/email/processor.rb:151:in 'Email::Processor#handle_failure'
/var/www/discourse/lib/email/processor.rb:31:in 'Email::Processor#process!'
/var/www/discourse/lib/email/processor.rb:13:in 'Email::Processor.process!'
/var/www/discourse/app/jobs/regular/process_email.rb:8:in 'Jobs::ProcessEmail#execute'
/var/www/discourse/app/jobs/base.rb:318:in 'block (2 levels) in Jobs::Base#perform'
rails_multisite-7.0.0/lib/rails_multisite/connection_management/null_instance.rb:49:in 'RailsMultisite::ConnectionManagement::NullInstance#with_connection'
rails_multisite-7.0.0/lib/rails_multisite/connection_management.rb:17:in 'RailsMultisite::ConnectionManagement.with_connection'
/var/www/discourse/app/jobs/base.rb:305:in 'block in Jobs::Base#perform'
/var/www/discourse/app/jobs/base.rb:301:in 'Array#each'
/var/www/discourse/app/jobs/base.rb:301:in 'Jobs::Base#perform'
sidekiq-7.3.9/lib/sidekiq/processor.rb:220:in 'Sidekiq::Processor#execute_job'
sidekiq-7.3.9/lib/sidekiq/processor.rb:185:in 'block (4 levels) in Sidekiq::Processor#process'
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:180:in 'Sidekiq::Middleware::Chain#traverse'
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:183:in 'block in Sidekiq::Middleware::Chain#traverse'
/var/www/discourse/lib/sidekiq/suppress_user_email_errors.rb:6:in 'Sidekiq::SuppressUserEmailErrors#call'
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:182:in 'Sidekiq::Middleware::Chain#traverse'
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:183:in 'block in Sidekiq::Middleware::Chain#traverse'
/var/www/discourse/lib/sidekiq/discourse_event.rb:6:in 'Sidekiq::DiscourseEvent#call'
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:182:in 'Sidekiq::Middleware::Chain#traverse'
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:183:in 'block in Sidekiq::Middleware::Chain#traverse'
/var/www/discourse/lib/sidekiq/pausable.rb:131:in 'Sidekiq::Pausable#call'
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:182:in 'Sidekiq::Middleware::Chain#traverse'
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:183:in 'block in Sidekiq::Middleware::Chain#traverse'
sidekiq-7.3.9/lib/sidekiq/job/interrupt_handler.rb:9:in 'Sidekiq::Job::InterruptHandler#call'
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:182:in 'Sidekiq::Middleware::Chain#traverse'
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:183:in 'block in Sidekiq::Middleware::Chain#traverse'
sidekiq-7.3.9/lib/sidekiq/metrics/tracking.rb:26:in 'Sidekiq::Metrics::ExecutionTracker#track'
sidekiq-7.3.9/lib/sidekiq/metrics/tracking.rb:134:in 'Sidekiq::Metrics::Middleware#call'
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:182:in 'Sidekiq::Middleware::Chain#traverse'
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:173:in 'Sidekiq::Middleware::Chain#invoke'
sidekiq-7.3.9/lib/sidekiq/processor.rb:184:in 'block (3 levels) in Sidekiq::Processor#process'
sidekiq-7.3.9/lib/sidekiq/processor.rb:145:in 'block (6 levels) in Sidekiq::Processor#dispatch'
sidekiq-7.3.9/lib/sidekiq/job_retry.rb:118:in 'Sidekiq::JobRetry#local'
sidekiq-7.3.9/lib/sidekiq/processor.rb:144:in 'block (5 levels) in Sidekiq::Processor#dispatch'
sidekiq-7.3.9/lib/sidekiq/config.rb:39:in 'block in <class:Config>'
sidekiq-7.3.9/lib/sidekiq/processor.rb:139:in 'block (4 levels) in Sidekiq::Processor#dispatch'
sidekiq-7.3.9/lib/sidekiq/processor.rb:281:in 'Sidekiq::Processor#stats'
sidekiq-7.3.9/lib/sidekiq/processor.rb:134:in 'block (3 levels) in Sidekiq::Processor#dispatch'
sidekiq-7.3.9/lib/sidekiq/job_logger.rb:15:in 'Sidekiq::JobLogger#call'
sidekiq-7.3.9/lib/sidekiq/processor.rb:133:in 'block (2 levels) in Sidekiq::Processor#dispatch'
sidekiq-7.3.9/lib/sidekiq/job_retry.rb:85:in 'Sidekiq::JobRetry#global'
sidekiq-7.3.9/lib/sidekiq/processor.rb:132:in 'block in Sidekiq::Processor#dispatch'
sidekiq-7.3.9/lib/sidekiq/job_logger.rb:40:in 'Sidekiq::JobLogger#prepare'
sidekiq-7.3.9/lib/sidekiq/processor.rb:131:in 'Sidekiq::Processor#dispatch'
sidekiq-7.3.9/lib/sidekiq/processor.rb:183:in 'block (2 levels) in Sidekiq::Processor#process'
sidekiq-7.3.9/lib/sidekiq/processor.rb:182:in 'Thread.handle_interrupt'
sidekiq-7.3.9/lib/sidekiq/processor.rb:182:in 'block in Sidekiq::Processor#process'
sidekiq-7.3.9/lib/sidekiq/processor.rb:181:in 'Thread.handle_interrupt'
sidekiq-7.3.9/lib/sidekiq/processor.rb:181:in 'Sidekiq::Processor#process'
sidekiq-7.3.9/lib/sidekiq/processor.rb:86:in 'Sidekiq::Processor#process_one'
sidekiq-7.3.9/lib/sidekiq/processor.rb:76:in 'Sidekiq::Processor#run'
sidekiq-7.3.9/lib/sidekiq/component.rb:10:in 'Sidekiq::Component#watchdog'
sidekiq-7.3.9/lib/sidekiq/component.rb:19:in 'block in Sidekiq::Component#safe_thread'
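The top of that trace is a timeout while opening the TCP connection to the SMTP server, so one thing I can test is whether the mail host is reachable from the server at all. A minimal sketch, assuming Python is available where the test is run. `can_connect` is a hypothetical helper, and the host and port in the usage comment are placeholders, not my real SMTP settings.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    This only tests reachability of the port; it does not speak SMTP
    or check credentials.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, DNS failures and timeouts.
        return False


# Example usage (placeholder values, replace with the configured SMTP host/port):
# print(can_connect("smtp.example.com", 587))
```

If this returns False from the server but True from elsewhere, that would point at outbound connections being blocked or the mail provider being unreachable, rather than at Discourse itself.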
I’m also seeing a bunch of messages about Redis:
Job exception: Connection timed out - user specified timeout: 1.0s (redis://localhost:6379)
Your Redis network connection is performing extremely poorly.
Last RTT readings were [403941, 62224, 151840, 1536008, 3440226], ideally these should be < 1000.
Ensure Redis is running in the same AZ o
But I don’t know if these are showing that Redis is the cause or just a symptom.
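To get a feel for how bad those RTT readings are, here is a small sketch that converts them to milliseconds and compares them with the 1.0s timeout from the error above. It assumes the readings are in microseconds, which is my understanding of how Sidekiq reports them but is not stated in the log itself. The function names are made up for illustration.

```python
# Assumption: Sidekiq's RTT readings are in microseconds.
IDEAL_RTT_US = 1000          # "ideally these should be < 1000"
REDIS_TIMEOUT_S = 1.0        # "user specified timeout: 1.0s"


def to_milliseconds(readings_us):
    """Convert microsecond readings to milliseconds."""
    return [r / 1000.0 for r in readings_us]


def exceeding_timeout(readings_us, timeout_s=REDIS_TIMEOUT_S):
    """Return the readings that took longer than the client timeout."""
    return [r for r in readings_us if r / 1_000_000 > timeout_s]


if __name__ == "__main__":
    readings = [403941, 62224, 151840, 1536008, 3440226]  # from the log above
    for raw, ms in zip(readings, to_milliseconds(readings)):
        flag = "above ideal" if raw >= IDEAL_RTT_US else "ok"
        print(f"{raw:>8} us = {ms:8.1f} ms ({flag})")
    slow = exceeding_timeout(readings)
    print(f"{len(slow)} of {len(readings)} readings exceed {REDIS_TIMEOUT_S}s")
```

Under that assumption, two of the five readings (about 1.5s and 3.4s) are longer than the 1.0s timeout, which would be consistent with the "Connection timed out" job exceptions.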
I’ve just run ./launcher rebuild app, and it’s not any better. Any ideas for how to diagnose what is going wrong on this server?
