Do you know if the Docker bug is still around in 18.03.1~ce-0~ubuntu? I am seeing the following and am wondering if this is the same issue again:
production.log
Unexpected error in Message Bus : Connection timed out
Job exception: Connection timed out
Failed to process job: Connection timed out ["/var/www/discourse/vendor/bundle/ruby/2.5.0/gems/redis-4.0.1/lib/redis/connection/hiredis.rb:58:in `rescue in read'", "/var/www/discourse/vendor/bundle/ruby/2.5.0/gems/redis-4.0.1/lib/redis/connection/hiredis.rb:53:in `read'", "/var/www/discourse/vendor/bundle/ruby/2.5.0/gems/redis-4.0.1/lib/redis/client.rb:260:in `block in read'", "/var/www/discourse/vendor/bundle/ruby/2.5.0/gems/redis-4.0.1/lib/redis/client.rb:248:in `io'", "/var/www/discourse/vendor/bundle/ruby/2.5.0/gems/redis-4.0.1/lib/redis/client.rb:259:in `read'", "/var/www/discourse/vendor/bundle/ruby/2.5.0/gems/redis-4.0.1/lib/redis/client.rb:118:in `block in call'", "/var/www/discourse/vendor/bundle/ruby/2.5.0/gems/redis-4.0.1/lib/redis/client.rb:229:in `block (2 levels) in process'", "/var/www/discourse/vendor/bundle/ruby/2.5.0/gems/redis-4.0.1/lib/redis/client.rb:366:in `ensure_connected'", "/var/www/discourse/vendor/bundle/ruby/2.5.0/gems/redis-4.0.1/lib/redis/client.rb:219:in `block in process'", "/var/www/discourse/vendor/bundle/ruby/2.5.0/gems/redis-4.0.1/lib/redis/client.rb:304:in `logging'", "/var/www/discourse/vendor/bundle/ruby/2.5.0/gems/redis-4.0.1/lib/redis/client.rb:218:in `process'", "/var/www/discourse/vendor/bundle/ruby/2.5.0/gems/redis-4.0.1/lib/redis/client.rb:118:in `call'", "/var/www/discourse/vendor/bundle/ruby/2.5.0/gems/redis-4.0.1/lib/redis.rb:2448:in `block in _eval'", "/var/www/discourse/vendor/bundle/ruby/2.5.0/gems/redis-4.0.1/lib/redis.rb:45:in `block in synchronize'", "/usr/local/lib/ruby/2.5.0/monitor.rb:226:in `mon_synchronize'", "/var/www/discourse/vendor/bundle/ruby/2.5.0/gems/redis-4.0.1/lib/redis.rb:45:in `synchronize'", "/var/www/discourse/vendor/bundle/ruby/2.5.0/gems/redis-4.0.1/lib/redis.rb:2447:in `_eval'", "/var/www/discourse/vendor/bundle/ruby/2.5.0/gems/redis-4.0.1/lib/redis.rb:2499:in `evalsha'", "/var/www/discourse/vendor/bundle/ruby/2.5.0/gems/message_bus-2.1.5/lib/message_bus/backends/redis.rb:381:in 
`cached_eval'", "/var/www/discourse/vendor/bundle/ruby/2.5.0/gems/message_bus-2.1.5/lib/message_bus/backends/redis.rb:141:in `publish'", "/var/www/discourse/vendor/bundle/ruby/2.5.0/gems/message_bus-2.1.5/lib/message_bus.rb:253:in `publish'", "/var/www/discourse/vendor/bundle/ruby/2.5.0/gems/message_bus-2.1.5/lib/message_bus.rb:490:in `block in new_subscriber_thread'", "/var/www/discourse/vendor/bundle/ruby/2.5.0/gems/message_bus-2.1.5/lib/message_bus/timer_thread.rb:102:in `do_work'", "/var/www/discourse/vendor/bundle/ruby/2.5.0/gems/message_bus-2.1.5/lib/message_bus/timer_thread.rb:30:in `block in initialize'"]
Unexpected error in Message Bus : Connection timed out
Job exception: Connection timed out
unicorn.stderr.log
E, [2018-06-27T11:55:52.507207 #31436] ERROR -- : master loop error: Connection timed out (Redis::TimeoutError)
E, [2018-06-27T11:55:52.507357 #31436] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-4.0.1/lib/redis/connection/hiredis.rb:58:in `rescue in read'
E, [2018-06-27T11:55:52.507423 #31436] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-4.0.1/lib/redis/connection/hiredis.rb:53:in `read'
E, [2018-06-27T11:55:52.507464 #31436] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-4.0.1/lib/redis/client.rb:260:in `block in read'
E, [2018-06-27T11:55:52.507503 #31436] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-4.0.1/lib/redis/client.rb:248:in `io'
E, [2018-06-27T11:55:52.507536 #31436] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-4.0.1/lib/redis/client.rb:259:in `read'
E, [2018-06-27T11:55:52.507560 #31436] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-4.0.1/lib/redis/client.rb:118:in `block in call'
E, [2018-06-27T11:55:52.507583 #31436] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-4.0.1/lib/redis/client.rb:229:in `block (2 levels) in process'
E, [2018-06-27T11:55:52.507609 #31436] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-4.0.1/lib/redis/client.rb:366:in `ensure_connected'
E, [2018-06-27T11:55:52.507630 #31436] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-4.0.1/lib/redis/client.rb:219:in `block in process'
E, [2018-06-27T11:55:52.507655 #31436] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-4.0.1/lib/redis/client.rb:304:in `logging'
E, [2018-06-27T11:55:52.507674 #31436] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-4.0.1/lib/redis/client.rb:218:in `process'
E, [2018-06-27T11:55:52.507698 #31436] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-4.0.1/lib/redis/client.rb:118:in `call'
E, [2018-06-27T11:55:52.507716 #31436] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-4.0.1/lib/redis.rb:889:in `block in get'
E, [2018-06-27T11:55:52.507736 #31436] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-4.0.1/lib/redis.rb:45:in `block in synchronize'
E, [2018-06-27T11:55:52.507753 #31436] ERROR -- : /usr/local/lib/ruby/2.4.0/monitor.rb:214:in `mon_synchronize'
E, [2018-06-27T11:55:52.507781 #31436] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-4.0.1/lib/redis.rb:45:in `synchronize'
E, [2018-06-27T11:55:52.507804 #31436] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-4.0.1/lib/redis.rb:888:in `get'
E, [2018-06-27T11:55:52.507822 #31436] ERROR -- : /var/www/discourse/lib/discourse_redis.rb:188:in `block (3 levels) in <class:DiscourseRedis>'
E, [2018-06-27T11:55:52.507838 #31436] ERROR -- : /var/www/discourse/lib/discourse_redis.rb:153:in `ignore_readonly'
E, [2018-06-27T11:55:52.507855 #31436] ERROR -- : /var/www/discourse/lib/discourse_redis.rb:188:in `block (2 levels) in <class:DiscourseRedis>'
E, [2018-06-27T11:55:52.507876 #31436] ERROR -- : /var/www/discourse/app/jobs/regular/run_heartbeat.rb:15:in `last_heartbeat'
E, [2018-06-27T11:55:52.507896 #31436] ERROR -- : config/unicorn.conf.rb:172:in `check_sidekiq_heartbeat'
E, [2018-06-27T11:55:52.507915 #31436] ERROR -- : config/unicorn.conf.rb:199:in `master_sleep'
E, [2018-06-27T11:55:52.507933 #31436] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.4.0/gems/unicorn-5.4.0/lib/unicorn/http_server.rb:294:in `join'
E, [2018-06-27T11:55:52.507954 #31436] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.4.0/gems/unicorn-5.4.0/bin/unicorn:126:in `<top (required)>'
E, [2018-06-27T11:55:52.507971 #31436] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.4.0/bin/unicorn:23:in `load'
E, [2018-06-27T11:55:52.507987 #31436] ERROR -- : /var/www/discourse/vendor/bundle/ruby/2.4.0/bin/unicorn:23:in `<main>'
Failed to report error: Connection timed out 3 Job exception: Connection timed out
apt-cache showpkg docker-ce shows me versions 17.09 and 17.12, but 17.10 is not available in my repo. Are you aware of any version other than 17.10 that does not have problems with Discourse? (Assuming mine is the same problem.)
Edit: Some more info: Discourse had been running rock-solid for a month, no problems. Yesterday evening I rebuilt the app and ran an upgrade / dist-upgrade during a short maintenance window. 18 hours later the forum died. Host resources are fine.
I have now restored a snapshot from before the upgrade, and saw that I was running the exact same Docker version before the problems started. This pretty much excludes the Docker version as the culprit. The other variables are the app upgrade from 1.9 to 2.1 and the updated host OS. Still, from the logs, I don’t see how the app could cause such low-level problems (can’t read from Redis). So what is going on here? Maybe some dependencies of Docker changed during the update and are causing the problem? Is something going wrong inside the container? I am clueless…
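One way to narrow this down might be to check whether Redis itself answers in time, independent of the app. Below is a minimal sketch, not anything Discourse ships: the function name `redis_ping_latency` is hypothetical, and the host, port, and timeout defaults are assumptions (Redis' usual 127.0.0.1:6379, no password). It has to run somewhere that can reach Redis, which in a standard install likely means inside the container.

```python
import socket
import time
from typing import Optional


def redis_ping_latency(host: str = "127.0.0.1", port: int = 6379,
                       timeout: float = 5.0) -> Optional[float]:
    """Send a raw inline PING to Redis and return the round trip in seconds.

    Returns None if connecting or reading fails or times out, or if the
    reply is not +PONG (for example when Redis requires a password).
    Host, port and timeout are assumed defaults, not values from the logs.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            sock.sendall(b"PING\r\n")  # inline command form of PING
            reply = sock.recv(64)
    except OSError:  # covers refused connections and socket timeouts
        return None
    if not reply.startswith(b"+PONG"):
        return None
    return time.monotonic() - start


# Example: call redis_ping_latency() in a loop (say once per second) and
# log the result; a None or a multi-second value while the host looks idle
# would point below the app, at Redis or the container networking.
```

If this stays fast while the app still logs Redis::TimeoutError, the problem is more likely on the application side of the connection.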
Edit 2: This is what was going on from a resources point of view when it crashed:
But there wasn’t much going on in the forum at that point. Something spiked memory, though not into a critical range.