Unfortunately, after running for half a day, the same problem happened again.
Here the Docker container was rebooted, and at the line Thu Aug 29 15:58:34 UTC 2024 it was switched live:
I, [2024-08-29T14:13:32.515947 #68] INFO -- : Refreshing Gem list
I, [2024-08-29T14:13:36.905305 #68] INFO -- : listening on addr=127.0.0.1:3000 fd=15
I, [2024-08-29T14:13:42.283988 #68] INFO -- : starting 1 supervised sidekiqs
I, [2024-08-29T14:13:46.693381 #317] INFO -- : Loading Sidekiq in process id 317
I, [2024-08-29T14:13:47.963367 #356] INFO -- : worker=0 ready
I, [2024-08-29T14:13:49.811765 #420] INFO -- : worker=1 ready
I, [2024-08-29T14:13:50.990474 #68] INFO -- : master process ready
I, [2024-08-29T14:13:51.394171 #538] INFO -- : worker=2 ready
I, [2024-08-29T14:13:52.632528 #688] INFO -- : worker=3 ready
I, [2024-08-29T14:13:53.468952 #824] INFO -- : worker=4 ready
I, [2024-08-29T14:13:54.848829 #945] INFO -- : worker=5 ready
I, [2024-08-29T14:13:55.276090 #1084] INFO -- : worker=6 ready
I, [2024-08-29T14:13:55.663733 #1206] INFO -- : worker=7 ready
*****Thu Aug 29 15:58:34 UTC 2024*****
I, [2024-08-29T16:19:42.701388 #2847461] INFO -- : Loading Sidekiq in process id 2847461
[STARTED]
'system' has started the backup!
Marking backup as running...
I’m not sure why, after "Loading Sidekiq in process id 2847461", there were none of the same INFO worker= lines, or whether this is significant.
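As a quick sanity check, it should be possible to compare those log lines against the live process list inside the container. A throwaway snippet like this (only a sketch; run from inside the app container) would show whether the Unicorn/Sidekiq worker processes actually exist and how long they have been running:

# Quick check from inside the container: list unicorn/sidekiq processes so
# their PIDs and ages can be compared against the "worker=N ready" log lines.
puts `ps -eo pid,etime,args`.lines.select { |l| l =~ /unicorn|sidekiq/ }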
The container ran fine until around midnight, when the backup started. Here’s the log after the backup completed (note there is plenty of RAM and disk space):
Finalizing backup...
Creating archive: discourse-2024-08-29-033737-v20240731190511.tar.gz
Making sure archive does not already exist...
Creating empty archive...
Archiving data dump...
Archiving uploads...
Removing tmp '/var/www/discourse/tmp/backups/default/2024-08-29-033737' directory...
Gzipping archive, this may take a while...
Executing the after_create_hook for the backup...
Deleting old backups...
Cleaning stuff up...
Removing '.tar' leftovers...
Marking backup as finished...
Refreshing disk stats...
Finished!
[SUCCESS]
I, [2024-08-29T07:39:13.994558 #74] INFO -- : Refreshing Gem list
I, [2024-08-29T07:39:22.139288 #74] INFO -- : listening on addr=127.0.0.1:3000 fd=15
I, [2024-08-29T07:39:28.538100 #74] INFO -- : starting 1 supervised sidekiqs
I, [2024-08-29T07:39:32.817733 #338] INFO -- : Loading Sidekiq in process id 338
I, [2024-08-29T07:39:33.888697 #377] INFO -- : worker=0 ready
I, [2024-08-29T07:39:35.956335 #449] INFO -- : worker=1 ready
I, [2024-08-29T07:39:37.233970 #74] INFO -- : master process ready
I, [2024-08-29T07:39:37.437223 #575] INFO -- : worker=2 ready
I, [2024-08-29T07:39:38.974154 #718] INFO -- : worker=3 ready
I, [2024-08-29T07:39:39.789086 #842] INFO -- : worker=4 ready
I, [2024-08-29T07:39:40.417719 #960] INFO -- : worker=5 ready
I, [2024-08-29T07:39:40.993249 #1084] INFO -- : worker=6 ready
I, [2024-08-29T07:39:41.395415 #1236] INFO -- : worker=7 ready
I, [2024-08-29T07:43:38.285891 #74] INFO -- : master reopening logs...
I, [2024-08-29T07:43:38.288650 #2790085] INFO -- : Sidekiq reopening logs...
I, [2024-08-29T07:43:38.658839 #454] INFO -- : worker=1 reopening logs...
I, [2024-08-29T07:43:53.315240 #845] INFO -- : worker=4 reopening logs...
I, [2024-08-29T07:43:53.315258 #1121415] INFO -- : worker=0 reopening logs...
I, [2024-08-29T07:43:53.318146 #720] INFO -- : worker=3 reopening logs...
I, [2024-08-29T07:43:53.318152 #1256] INFO -- : worker=7 reopening logs...
I, [2024-08-29T07:43:53.319042 #586] INFO -- : worker=2 reopening logs...
I, [2024-08-29T07:43:53.322139 #960] INFO -- : worker=5 reopening logs...
I, [2024-08-29T07:43:53.330282 #1109] INFO -- : worker=6 reopening logs...
E, [2024-08-29T09:24:44.381913 #377] ERROR -- : app error: Connection timed out (Redis::TimeoutError)
E, [2024-08-29T09:24:44.382304 #377] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/redis-4.8.1/lib/redis/connection/ruby.rb:58:in `block in _read_from_socket'
E, [2024-08-29T09:24:44.382516 #377] ERROR -- : <internal:kernel>:187:in `loop'
E, [2024-08-29T09:24:44.382536 #377] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/redis-4.8.1/lib/redis/connection/ruby.rb:54:in `_read_from_socket'
E, [2024-08-29T09:24:44.382687 #377] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/redis-4.8.1/lib/redis/connection/ruby.rb:47:in `gets'
E, [2024-08-29T09:24:44.382782 #377] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/redis-4.8.1/lib/redis/connection/ruby.rb:382:in `read'
E, [2024-08-29T09:24:44.383064 #377] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/redis-4.8.1/lib/redis/client.rb:311:in `block in read'
E, [2024-08-29T09:24:44.383269 #377] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/redis-4.8.1/lib/redis/client.rb:299:in `io'
E, [2024-08-29T09:24:44.383293 #377] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/redis-4.8.1/lib/redis/client.rb:310:in `read'
E, [2024-08-29T09:24:44.383304 #377] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/redis-4.8.1/lib/redis/client.rb:161:in `block in call'
E, [2024-08-29T09:24:44.383313 #377] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/redis-4.8.1/lib/redis/client.rb:279:in `block (2 levels) in process'
E, [2024-08-29T09:24:44.383323 #377] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/redis-4.8.1/lib/redis/client.rb:411:in `ensure_connected'
E, [2024-08-29T09:24:44.383332 #377] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/redis-4.8.1/lib/redis/client.rb:269:in `block in process'
E, [2024-08-29T09:24:44.383341 #377] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/redis-4.8.1/lib/redis/client.rb:356:in `logging'
E, [2024-08-29T09:24:44.383349 #377] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/redis-4.8.1/lib/redis/client.rb:268:in `process'
E, [2024-08-29T09:24:44.383358 #377] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/redis-4.8.1/lib/redis/client.rb:161:in `call'
E, [2024-08-29T09:24:44.383366 #377] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/rack-mini-profiler-3.3.1/lib/mini_profiler/profiling_methods.rb:89:in `block in profile_method'
E, [2024-08-29T09:24:44.383375 #377] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/redis-4.8.1/lib/redis.rb:270:in `block in send_command'
E, [2024-08-29T09:24:44.383384 #377] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/redis-4.8.1/lib/redis.rb:269:in `synchronize'
E, [2024-08-29T09:24:44.383407 #377] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/redis-4.8.1/lib/redis.rb:269:in `send_command'
E, [2024-08-29T09:24:44.383416 #377] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/redis-4.8.1/lib/redis/commands/lists.rb:11:in `llen'
E, [2024-08-29T09:24:44.383425 #377] ERROR -- : /var/www/discourse/lib/rate_limiter.rb:192:in `is_under_limit?'
E, [2024-08-29T09:24:44.383434 #377] ERROR -- : /var/www/discourse/lib/rate_limiter.rb:72:in `can_perform?'
E, [2024-08-29T09:24:44.383443 #377] ERROR -- : /var/www/discourse/lib/middleware/request_tracker.rb:488:in `rate_limit'
E, [2024-08-29T09:24:44.383453 #377] ERROR -- : /var/www/discourse/lib/middleware/request_tracker.rb:322:in `call'
E, [2024-08-29T09:24:44.383461 #377] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/actionpack-7.1.4/lib/action_dispatch/middleware/remote_ip.rb:92:in `call'
E, [2024-08-29T09:24:44.383470 #377] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/rails_multisite-6.1.0/lib/rails_multisite/middleware.rb:26:in `call'
E, [2024-08-30T05:11:44.698172 #420] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/railties-7.1.4/lib/rails/engine.rb:536:in `call'
E, [2024-08-30T05:11:44.698180 #420] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/railties-7.1.4/lib/rails/railtie.rb:226:in `public_send'
E, [2024-08-30T05:11:44.698189 #420] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/railties-7.1.4/lib/rails/railtie.rb:226:in `method_missing'
E, [2024-08-30T05:11:44.698199 #420] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/rack-2.2.9/lib/rack/urlmap.rb:74:in `block in call'
E, [2024-08-30T05:11:44.698207 #420] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/rack-2.2.9/lib/rack/urlmap.rb:58:in `each'
E, [2024-08-30T05:11:44.698216 #420] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/rack-2.2.9/lib/rack/urlmap.rb:58:in `call'
E, [2024-08-30T05:11:44.698225 #420] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/unicorn-6.1.0/lib/unicorn/http_server.rb:634:in `process_client'
E, [2024-08-30T05:11:44.698233 #420] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/unicorn-6.1.0/lib/unicorn/http_server.rb:739:in `worker_loop'
E, [2024-08-30T05:11:44.698241 #420] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/unicorn-6.1.0/lib/unicorn/http_server.rb:547:in `spawn_missing_workers'
E, [2024-08-30T05:11:44.698261 #420] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/unicorn-6.1.0/lib/unicorn/http_server.rb:143:in `start'
E, [2024-08-30T05:11:44.698270 #420] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/gems/unicorn-6.1.0/bin/unicorn:128:in `<top (required)>'
E, [2024-08-30T05:11:44.698291 #420] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/bin/unicorn:25:in `load'
E, [2024-08-30T05:11:44.698300 #420] ERROR -- : /var/www/discourse/vendor/bundle/ruby/3.3.0/bin/unicorn:25:in `<main>'
After Sidekiq is restarted following the backup, the Redis timeout errors begin within about two hours. I’m not sure why it takes two hours; perhaps it coincided with the first loading of the site in the morning.
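One thing that might help narrow this down is to probe Redis latency from inside the container before and during the window when the timeouts appear. Below is only a rough sketch (the localhost:6379 address is an assumption for a standard single-container install where Redis runs locally); it just times a PING once per second so you can see whether Redis itself slows down around the time the errors start:

# Rough diagnostic sketch: time a Redis PING once per second to see whether
# Redis itself becomes slow around the time the Redis::TimeoutError appears.
require "redis"      # redis 4.8.x, the same gem as in the backtrace above
require "benchmark"

redis = Redis.new(host: "localhost", port: 6379) # assumed: Redis runs locally in the container

loop do
  elapsed = Benchmark.realtime { redis.ping }
  puts format("%s PING %.1f ms", Time.now.utc, elapsed * 1000)
  sleep 1
rescue Redis::TimeoutError, Redis::CannotConnectError => e
  puts format("%s %s", Time.now.utc, e.class)
  sleep 1
end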
Any ideas? I thought monitoring the startup of the container would be enough, but if the error can be re-triggered by the post-backup stop/start of Sidekiq, then I need another strategy (see the sketch at the end of this post).
I also found this:
E, [2024-08-29T09:25:58.612452 #74] ERROR -- : worker=3 PID:718 running too long (29s), sending USR2 to dump thread backtraces
E, [2024-08-29T09:25:58.612563 #74] ERROR -- : worker=4 PID:842 running too long (29s), sending USR2 to dump thread backtraces
E, [2024-08-29T09:25:58.612587 #74] ERROR -- : worker=5 PID:960 running too long (29s), sending USR2 to dump thread backtraces
E, [2024-08-29T09:25:59.613766 #74] ERROR -- : worker=3 PID:718 running too long (30s), sending USR2 to dump thread backtraces
E, [2024-08-29T09:25:59.613874 #74] ERROR -- : worker=4 PID:842 running too long (30s), sending USR2 to dump thread backtraces
E, [2024-08-29T09:25:59.613897 #74] ERROR -- : worker=5 PID:960 running too long (30s), sending USR2 to dump thread backtraces
E, [2024-08-29T09:25:59.613931 #74] ERROR -- : worker=6 PID:1084 running too long (29s), sending USR2 to dump thread backtraces
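Regarding the "another strategy" mentioned above: since the problem can apparently be re-triggered by the post-backup Sidekiq/Unicorn restart, one option would be to watch the logs themselves rather than only the container startup. This is just a sketch, and the log path is an assumption based on a standard Discourse container layout; adjust it to wherever your Unicorn stderr log actually lives:

# Hedged sketch of a log watchdog: follow the Unicorn stderr log and react
# (alert, or trigger a restart) when Redis::TimeoutError lines pile up.
LOG_PATH  = "/var/www/discourse/log/unicorn.stderr.log" # assumed path; adjust for your install
WINDOW    = 300 # seconds to look back
THRESHOLD = 5   # timeout lines within the window before reacting

hits = []

File.open(LOG_PATH) do |f|
  f.seek(0, IO::SEEK_END) # start at the end of the file, like tail -f
  loop do
    line = f.gets
    if line.nil?
      sleep 1
      next
    end
    next unless line.include?("Redis::TimeoutError")

    now = Time.now
    hits << now
    hits.reject! { |t| now - t > WINDOW }
    if hits.size >= THRESHOLD
      warn "#{now.utc}: #{hits.size} Redis timeouts in #{WINDOW}s -- alert or restart here"
      hits.clear
    end
  end
end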