Discourse web interface becomes unresponsive a few minutes after starting

mpalmer · December 27, 2017, 2:49am

@sam has deployed a workaround in Discourse for this problem; if you rebuild with the latest discourse_docker changes, redis and pg logs should go to files, rather than Docker, and the bug shouldn’t(!) be triggered.

codinghorror · December 27, 2017, 8:05am

Again fantastic work @mpalmer and @sam. Top notch open source citizenry.

omarfilip · December 28, 2017, 4:05am

Once Docker releases a good version, should this file be removed?

mpalmer · December 28, 2017, 4:09am

Yep, once the bug is fixed the pin should be removed.

Steven · December 29, 2017, 1:54am

Docker is on version 17.12.0-ce, maybe this is fixed

mpalmer · December 29, 2017, 2:34am

Yep, looks like it might have been fixed accidentally. There’s no indication that it was deliberately fixed in the bug report.

kdambekalns · January 3, 2018, 8:48am

We had that issue as well, and when freeing up some space didn’t fix it, I upgraded Docker to 17.12.0-ce. It worked for the next (almost) four days, and then hung again after the backup. It has done the same again on each of the last two days. See Neos Project if you’re interested.

Is this fixed for everybody else? Or do the issues continue for anyone?

pfaffman · January 3, 2018, 9:48am

I pinned several sites to 17.10 and upgraded another to 17.12 yesterday.

I’ve not had any problems since.

mpalmer · January 3, 2018, 11:11am

We’re going to need a lot more info if you’re after assistance tracking this down. Nature of the “hang”, logs of all shapes and sizes, that sort of thing.

yanokwa · January 3, 2018, 1:32pm

I had the same experience you had and I’ve gone back to 17.10.

kdambekalns · January 8, 2018, 8:18am

I know, but I was just curious whether people with any of the symptoms mentioned in this topic still had issues. I am sick of being the only one whose problems are never solved the way they are for everybody else.

Now, probably Discourse itself is reading this support forum and got scared, because since I posted here, it has not crashed for the last 120 hours. It seems 17.12 might indeed have fixed this.

What happened in those crashes: the Docker host is running fine, the app is up, but nginx returns a gateway timeout. As far as logs go: if there was anything interesting, I’d be glad. But there are no errors in the Discourse logs. That’s why I ended up here, the “Docker bug with long log lines” related to “backup does log long lines” explanation seemed a bulls-eye match.

mpalmer · January 8, 2018, 8:22am

There is an indication, in the unicorn logs, when the “long log lines” bug strikes – you end up with Redis::TimeoutError exceptions being raised. If you’re not seeing those, then it isn’t this bug, or you’re running an unfamiliar configuration which doesn’t present in the same manner.

kdambekalns · January 8, 2018, 8:42am

Ah, ok. Found that log now, and indeed on the day the last hang happened (2018-01-03) there are tons of Redis::TimeoutError lines in the unicorn.stderr.log.

E, [2018-01-02T03:43:02.934928 #386] ERROR -- : app error: Connection timed out (Redis::TimeoutError)
E, [2018-01-02T03:43:32.745829 #398] ERROR -- : app error: Connection timed out (Redis::TimeoutError)
E, [2018-01-02T03:43:32.995622 #386] ERROR -- : app error: Connection timed out (Redis::TimeoutError)
E, [2018-01-02T03:44:02.793192 #398] ERROR -- : app error: Connection timed out (Redis::TimeoutError)
E, [2018-01-02T03:44:03.038986 #386] ERROR -- : app error: Connection timed out (Redis::TimeoutError)
E, [2018-01-02T03:44:30.984465 #13850] ERROR -- : Error connecting to Redis on localhost:6379 (Redis::TimeoutError) (Redis::CannotConnectError)
E, [2018-01-02T03:44:31.020092 #13854] ERROR -- : Error connecting to Redis on localhost:6379 (Redis::TimeoutError) (Redis::CannotConnectError)
E, [2018-01-02T03:44:38.106718 #13865] ERROR -- : Error connecting to Redis on localhost:6379 (Redis::TimeoutError) (Redis::CannotConnectError)
E, [2018-01-02T03:44:38.101883 #13869] ERROR -- : Error connecting to Redis on localhost:6379 (Redis::TimeoutError) (Redis::CannotConnectError)
E, [2018-01-02T03:44:44.179293 #13880] ERROR -- : Error connecting to Redis on localhost:6379 (Redis::TimeoutError) (Redis::CannotConnectError)
E, [2018-01-02T03:44:45.189178 #13886] ERROR -- : Error connecting to Redis on localhost:6379 (Redis::TimeoutError) (Redis::CannotConnectError)
E, [2018-01-02T03:44:50.234502 #13894] ERROR -- : Error connecting to Redis on localhost:6379 (Redis::TimeoutError) (Redis::CannotConnectError)
E, [2018-01-02T03:44:56.294789 #13910] ERROR -- : Error connecting to Redis on localhost:6379 (Redis::TimeoutError) (Redis::CannotConnectError)
Failed to report error: Connection timed out 2 Error connecting to Redis on localhost:6379 (Redis::TimeoutError) subscribe failed, reconnecting in 1 second. Call stack ["/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.3/lib/redis/client.rb:345:in `rescue in establish_connection'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.3/lib/redis/client.rb:331:in `establish_connection'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.3/lib/redis/client.rb:101:in `block in connect'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.3/lib/redis/client.rb:293:in `with_reconnect'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.3/lib/redis/client.rb:100:in `connect'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.3/lib/redis/client.rb:276:in `with_socket_timeout'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.3/lib/redis/client.rb:133:in `call_loop'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.3/lib/redis/subscribe.rb:43:in `subscription'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.3/lib/redis/subscribe.rb:12:in `subscribe'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.3/lib/redis.rb:2765:in `_subscription'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.3/lib/redis.rb:2143:in `block in subscribe'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.3/lib/redis.rb:58:in `block in synchronize'", "/usr/local/lib/ruby/2.4.0/monitor.rb:214:in `mon_synchronize'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.3/lib/redis.rb:58:in `synchronize'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.3/lib/redis.rb:2142:in `subscribe'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/message_bus-2.0.2/lib/message_bus/backends/redis.rb:304:in `global_subscribe'", "/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/message_bus-2.0.2/lib/message_bus.rb:513:in `global_subscribe_thread'", 
"/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/message_bus-2.0.2/lib/message_bus.rb:461:in `block in new_subscriber_thread'"]
E, [2018-01-02T03:45:04.359632 #13937] ERROR -- : Error connecting to Redis on localhost:6379 (Redis::TimeoutError) (Redis::CannotConnectError)
E, [2018-01-02T03:45:12.451658 #13954] ERROR -- : Error connecting to Redis on localhost:6379 (Redis::TimeoutError) (Redis::CannotConnectError)
E, [2018-01-02T03:45:18.529028 #13967] ERROR -- : Error connecting to Redis on localhost:6379 (Redis::TimeoutError) (Redis::CannotConnectError)

and so forth…
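
For anyone else checking their own unicorn.stderr.log for this symptom, here is a minimal sketch that counts the Redis::TimeoutError lines per minute, which makes it easier to see when a hang started. It assumes the line format shown in the excerpt above; the function name `count_redis_timeouts` is made up for illustration and is not part of Discourse.

```python
import re
from collections import Counter

# Captures the timestamp, truncated to the minute, from a unicorn error line such as
# "E, [2018-01-02T03:43:02.934928 #386] ERROR -- : app error: ..."
LINE_RE = re.compile(r"^E, \[(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}):\d{2}\.\d+ #\d+\] ERROR")


def count_redis_timeouts(lines):
    """Return a Counter mapping 'YYYY-MM-DDTHH:MM' to the number of
    timestamped ERROR lines in that minute that mention Redis::TimeoutError.

    Lines without a leading timestamp (for example the "Failed to report
    error" stack trace) are skipped, since they cannot be placed in time.
    """
    per_minute = Counter()
    for line in lines:
        if "Redis::TimeoutError" not in line:
            continue
        match = LINE_RE.match(line)
        if match:
            per_minute[match.group(1)] += 1
    return per_minute
```

To use it, open the log file and pass the file object in, e.g. `with open("unicorn.stderr.log", errors="replace") as f: print(sorted(count_redis_timeouts(f).items()))`. The path is an assumption; adjust it to wherever your install keeps the unicorn logs.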

mpalmer · January 8, 2018, 8:57am

Oh dear… if you’re 100% sure you were running docker 17.12 at the time, including the containerd-shim (which is where the problem is), then Houston, we have a problem. The fact, though, that it’s stopped being a problem suggests that it’s either far harder to trigger the bug now, or alternately you kicking the container back to life was enough to cause the containerd-shim to be restarted with the 17.12 version, and now everything is hunky-dory forever more.
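
One rough way to test the "shim was never restarted" theory is to compare each shim process's start time with the time Docker was upgraded. The sketch below assumes you have collected lines of the form "<pid> <start time>", such as procps `ps -o pid= -o lstart= -p <pid>` prints on Linux for each shim PID; the function name `stale_shims` is hypothetical, and how you find the shim PIDs depends on your Docker version.

```python
from datetime import datetime

# Timestamp layout printed by `ps -o lstart=` (procps, C locale),
# e.g. "Fri Dec 29 02:34:00 2017". Assumed, not verified on every distro.
LSTART_FORMAT = "%a %b %d %H:%M:%S %Y"


def stale_shims(ps_lines, upgraded_at):
    """Return the PIDs of processes that started before `upgraded_at`.

    Each non-empty line is expected to look like "<pid> <lstart>".
    A PID in the result was started before the upgrade, so it is
    presumably still running the old binary.
    """
    stale = []
    for line in ps_lines:
        line = line.strip()
        if not line:
            continue
        pid, started = line.split(None, 1)
        if datetime.strptime(started, LSTART_FORMAT) < upgraded_at:
            stale.append(int(pid))
    return stale
```

If the list comes back empty, every shim was started after the upgrade, which would fit the "restarted with the 17.12 version" explanation. A non-empty list only shows that a process predates the upgrade; it does not by itself prove the bug is still present.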

kdambekalns · January 8, 2018, 9:31am

Well… I updated Docker on 2017-12-29 using the Ubuntu package management. Whether or not that restarts everything cleanly and completely I can’t tell. That I might need to know is a problem in itself…

Ok, but it might be that the containerd-shim (whatever that is) has been restarted in the updated version now and the bug is gone for good. Thanks for the help!
