Self-hosted forum has frequent 502 and 503 errors

My self-hosted forum (https://intfiction.org/) has become very slow over the past few weeks, and it frequently returns 502 and 503 errors.

I'm on the stable branch, version 2026.1.5.

Searching for those errors here hasn't turned up a likely cause:

  • the forum builds and launches correctly
  • the server seems to have enough capacity (13.5 GB of free disk space, 1.6/1.9 GB RAM, 1.7/2.2 GB swap)

Looking at htop, the load average does seem high: 1.67, 1.55, 3.12. This process sometimes uses more than 40% of the CPU: unicorn worker[0] -E production -c config/unicorn.conf.rb.
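To put those load and memory figures in context, here is a minimal Python sketch, assuming a Linux host where /proc/meminfo is readable. The function names are my own, not part of any tool mentioned here. It divides the load average by the core count and reports how much swap is in use, since heavy swap use on a small server can by itself make everything slow.

```python
import os

def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info

def swap_used_fraction(info):
    """Fraction of swap in use, or 0.0 when no swap is configured."""
    total = info.get("SwapTotal", 0)
    if total == 0:
        return 0.0
    return (total - info.get("SwapFree", 0)) / total

def load_per_core(load, cores):
    """Load average divided by core count; sustained values near or above 1.0 suggest a busy host."""
    return load / max(cores, 1)

if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        info = parse_meminfo(f.read())
    cores = os.cpu_count() or 1
    per_core = [load_per_core(x, cores) for x in os.getloadavg()]
    print("load per core (1m/5m/15m):", " ".join(f"{x:.2f}" for x in per_core))
    print(f"swap used: {swap_used_fraction(info):.0%}")
    print(f"MemAvailable: {info.get('MemAvailable', 0) / 1024:.0f} MiB")
```

If the swap-used figure stays high while MemAvailable is low, that would be consistent with memory pressure rather than a CPU-bound problem, though it does not prove it.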

In the forum logs I'm seeing a lot of errors like these:

Job exception: execution expired

net-smtp-0.5.1/lib/net/smtp.rb:663:in 'TCPSocket#initialize' 
net-smtp-0.5.1/lib/net/smtp.rb:663:in 'IO.open' 
net-smtp-0.5.1/lib/net/smtp.rb:663:in 'Net::SMTP#tcp_socket' 
net-smtp-0.5.1/lib/net/smtp.rb:672:in 'block in Net::SMTP#do_start' 
timeout-0.5.0/lib/timeout.rb:222:in 'block in Timeout.timeout' 
timeout-0.5.0/lib/timeout.rb:229:in 'Timeout.timeout' 
net-smtp-0.5.1/lib/net/smtp.rb:671:in 'Net::SMTP#do_start' 
net-smtp-0.5.1/lib/net/smtp.rb:642:in 'Net::SMTP#start' 
mail-2.9.0/lib/mail/network/delivery_methods/smtp.rb:154:in 'Mail::SMTP#start_smtp_session' 
mail-2.9.0/lib/mail/network/delivery_methods/smtp.rb:108:in 'Mail::SMTP#deliver!' 
mail-2.9.0/lib/mail/message.rb:269:in 'Mail::Message#deliver!' 
/usr/local/lib/ruby/3.4.0/delegate.rb:87:in 'Delegator#method_missing'
/var/www/discourse/lib/email/sender.rb:296:in 'Email::Sender#send' 
/var/www/discourse/lib/email/processor.rb:151:in 'Email::Processor#handle_failure' 
/var/www/discourse/lib/email/processor.rb:31:in 'Email::Processor#process!' 
/var/www/discourse/lib/email/processor.rb:13:in 'Email::Processor.process!' 
/var/www/discourse/app/jobs/regular/process_email.rb:8:in 'Jobs::ProcessEmail#execute' 
/var/www/discourse/app/jobs/base.rb:318:in 'block (2 levels) in Jobs::Base#perform' 
rails_multisite-7.0.0/lib/rails_multisite/connection_management/null_instance.rb:49:in 'RailsMultisite::ConnectionManagement::NullInstance#with_connection'
rails_multisite-7.0.0/lib/rails_multisite/connection_management.rb:17:in 'RailsMultisite::ConnectionManagement.with_connection'
/var/www/discourse/app/jobs/base.rb:305:in 'block in Jobs::Base#perform' 
/var/www/discourse/app/jobs/base.rb:301:in 'Array#each' 
/var/www/discourse/app/jobs/base.rb:301:in 'Jobs::Base#perform' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:220:in 'Sidekiq::Processor#execute_job' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:185:in 'block (4 levels) in Sidekiq::Processor#process' 
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:180:in 'Sidekiq::Middleware::Chain#traverse' 
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:183:in 'block in Sidekiq::Middleware::Chain#traverse' 
/var/www/discourse/lib/sidekiq/suppress_user_email_errors.rb:6:in 'Sidekiq::SuppressUserEmailErrors#call' 
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:182:in 'Sidekiq::Middleware::Chain#traverse' 
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:183:in 'block in Sidekiq::Middleware::Chain#traverse' 
/var/www/discourse/lib/sidekiq/discourse_event.rb:6:in 'Sidekiq::DiscourseEvent#call' 
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:182:in 'Sidekiq::Middleware::Chain#traverse' 
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:183:in 'block in Sidekiq::Middleware::Chain#traverse' 
/var/www/discourse/lib/sidekiq/pausable.rb:131:in 'Sidekiq::Pausable#call' 
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:182:in 'Sidekiq::Middleware::Chain#traverse' 
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:183:in 'block in Sidekiq::Middleware::Chain#traverse' 
sidekiq-7.3.9/lib/sidekiq/job/interrupt_handler.rb:9:in 'Sidekiq::Job::InterruptHandler#call' 
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:182:in 'Sidekiq::Middleware::Chain#traverse' 
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:183:in 'block in Sidekiq::Middleware::Chain#traverse' 
sidekiq-7.3.9/lib/sidekiq/metrics/tracking.rb:26:in 'Sidekiq::Metrics::ExecutionTracker#track' 
sidekiq-7.3.9/lib/sidekiq/metrics/tracking.rb:134:in 'Sidekiq::Metrics::Middleware#call' 
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:182:in 'Sidekiq::Middleware::Chain#traverse' 
sidekiq-7.3.9/lib/sidekiq/middleware/chain.rb:173:in 'Sidekiq::Middleware::Chain#invoke' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:184:in 'block (3 levels) in Sidekiq::Processor#process' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:145:in 'block (6 levels) in Sidekiq::Processor#dispatch' 
sidekiq-7.3.9/lib/sidekiq/job_retry.rb:118:in 'Sidekiq::JobRetry#local' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:144:in 'block (5 levels) in Sidekiq::Processor#dispatch' 
sidekiq-7.3.9/lib/sidekiq/config.rb:39:in 'block in <class:Config>' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:139:in 'block (4 levels) in Sidekiq::Processor#dispatch' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:281:in 'Sidekiq::Processor#stats' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:134:in 'block (3 levels) in Sidekiq::Processor#dispatch' 
sidekiq-7.3.9/lib/sidekiq/job_logger.rb:15:in 'Sidekiq::JobLogger#call' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:133:in 'block (2 levels) in Sidekiq::Processor#dispatch' 
sidekiq-7.3.9/lib/sidekiq/job_retry.rb:85:in 'Sidekiq::JobRetry#global' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:132:in 'block in Sidekiq::Processor#dispatch' 
sidekiq-7.3.9/lib/sidekiq/job_logger.rb:40:in 'Sidekiq::JobLogger#prepare' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:131:in 'Sidekiq::Processor#dispatch' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:183:in 'block (2 levels) in Sidekiq::Processor#process' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:182:in 'Thread.handle_interrupt' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:182:in 'block in Sidekiq::Processor#process' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:181:in 'Thread.handle_interrupt' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:181:in 'Sidekiq::Processor#process' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:86:in 'Sidekiq::Processor#process_one' 
sidekiq-7.3.9/lib/sidekiq/processor.rb:76:in 'Sidekiq::Processor#run' 
sidekiq-7.3.9/lib/sidekiq/component.rb:10:in 'Sidekiq::Component#watchdog' 
sidekiq-7.3.9/lib/sidekiq/component.rb:19:in 'block in Sidekiq::Component#safe_thread' 
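
The top frames of that trace are in `TCPSocket#initialize` under `Net::SMTP#tcp_socket`, so the job appears to time out while opening the outbound SMTP connection, before any mail is exchanged. A minimal Python sketch for checking whether the mail server is reachable, ideally run from the same environment the forum runs in. `smtp.example.com` and port 587 are placeholders, not values from this thread, and `check_tcp` is a name I made up.

```python
import socket
import time

def check_tcp(host, port, timeout=3.0):
    """Try to open a TCP connection; return (ok, seconds_elapsed, error_text)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as exc:  # covers timeouts, refused connections, and DNS failures
        return False, time.monotonic() - start, str(exc)

if __name__ == "__main__":
    # Placeholder host and port (assumptions): substitute the real SMTP settings.
    ok, elapsed, error = check_tcp("smtp.example.com", 587)
    print(f"connected={ok} elapsed={elapsed:.2f}s {error}")
```

A connection that is refused quickly points at a different problem than one that hangs until the timeout, so the elapsed time is worth noting alongside the result.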

I'm also seeing a lot of messages about Redis:

Job exception: Connection timed out - user specified timeout: 1.0s (redis://localhost:6379)

Your Redis network connection is performing extremely poorly.
Last RTT readings were [403941, 62224, 151840, 1536008, 3440226], ideally these should be < 1000.
Ensure Redis is running in the same AZ o

But I don't know whether this indicates that Redis is the cause or just a symptom.
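
As far as I know, Sidekiq reports those RTT readings in microseconds, which would make the worst one about 3.4 seconds for a ping to a Redis on localhost. A minimal Python sketch of that conversion, with the unit treated as an assumption and `rtt_summary` being a name I made up:

```python
def rtt_summary(readings_us, ideal_us=1000):
    """Convert RTT readings (assumed microseconds) to milliseconds and compare to the ideal."""
    readings_ms = [r / 1000 for r in readings_us]
    return {
        "ms": readings_ms,
        "worst_ms": max(readings_ms),
        "times_over_ideal": [r / ideal_us for r in readings_us],
    }

# Readings copied from the warning quoted above.
readings = [403941, 62224, 151840, 1536008, 3440226]
summary = rtt_summary(readings)
print(summary["ms"])
print(f"worst: {summary['worst_ms'] / 1000:.2f} s")
```

Because the connection is to redis://localhost:6379, network distance is an unlikely explanation. Readings this large may instead mean the whole host is starved for CPU or memory, which would make Redis a symptom rather than the cause, but that is a guess until it is measured.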

I just ran ./launcher rebuild app, and it's no better. Any ideas on how to diagnose what is going wrong on this server?

I'd suggest increasing the RAM a bit first; that might help.

I just noticed a few more things while watching top:

postgres: 15/main: discourse discourse [local] idle was using up to 90% of the CPU, and had been doing so for more than a minute.
Edit: the forum was running its daily backup at the time, so I think that explains this one.

I'm running the mail receiver on this server, and it seems to get waves of spam:

Those waves of incoming spam emails aren't going to help.

But I agree: add RAM if you can. Double it.