Troubleshooting Discourse crash?


(Philip Colmer) #1

One of our Discourse platforms crashed last night. A restart got it working again, but I’d like to better understand what caused the problem, particularly if that helps improve the code.

Unfortunately, I can’t see anything in the logs that hints at Discourse actually crashing. The logs continued to show messages even after a site admin had reported that Discourse was down and that all we were getting was 500 errors.

There are lots of warnings like this:

failed to deliver message, skipping #<struct MessageBus::Message global_id=321617, message_id=3795, channel="/site_settings", data={"process"=>"d9ded469-6c45-4cca-9dcf-3e18b949f182"}>
ex: no implicit conversion of nil into String backtrace: ["/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis/connection/hiredis.rb:19:in `connect'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis/connection/hiredis.rb:19:in `connect'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis/client.rb:336:in `establish_connection'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis/client.rb:101:in `block in connect'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis/client.rb:293:in `with_reconnect'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis/client.rb:100:in `connect'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis/client.rb:364:in `ensure_connected'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis/client.rb:221:in `block in process'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis/client.rb:306:in `logging'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis/client.rb:220:in `process'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis/client.rb:120:in `call'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis.rb:494:in `block in del'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis.rb:58:in `block in synchronize'", "/usr/local/lib/ruby/2.3.0/monitor.rb:214:in `mon_synchronize'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis.rb:58:in `synchronize'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis.rb:493:in `del'", "/var/www/discourse/lib/discourse_redis.rb:192:in `block in del'", "/var/www/discourse/lib/discourse_redis.rb:146:in `ignore_readonly'", "/var/www/discourse/lib/discourse_redis.rb:190:in `del'", "/var/www/discourse/lib/cache.rb:53:in `delete_entry'", 
"/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/activesupport-4.2.7.1/lib/active_support/cache.rb:402:in `block in delete'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/activesupport-4.2.7.1/lib/active_support/cache.rb:547:in `block in instrument'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/activesupport-4.2.7.1/lib/active_support/notifications.rb:166:in `instrument'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/activesupport-4.2.7.1/lib/active_support/cache.rb:547:in `instrument'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/activesupport-4.2.7.1/lib/active_support/cache.rb:401:in `delete'", "/var/www/discourse/lib/site_setting_extension.rb:379:in `clear_cache!'", "/var/www/discourse/lib/site_setting_extension.rb:235:in `block in refresh!'", "/var/www/discourse/lib/site_setting_extension.rb:216:in `synchronize'", "/var/www/discourse/lib/site_setting_extension.rb:216:in `refresh!'", "/var/www/discourse/lib/site_setting_extension.rb:254:in `process_message'", "/var/www/discourse/lib/site_setting_extension.rb:242:in `block in ensure_listen_for_changes'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/message_bus-2.0.2/lib/message_bus.rb:535:in `block (2 levels) in global_subscribe_thread'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/message_bus-2.0.2/lib/message_bus.rb:549:in `each'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/message_bus-2.0.2/lib/message_bus.rb:549:in `block in multi_each'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/message_bus-2.0.2/lib/message_bus.rb:548:in `each'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/message_bus-2.0.2/lib/message_bus.rb:548:in `multi_each'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/message_bus-2.0.2/lib/message_bus.rb:533:in `block in global_subscribe_thread'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/message_bus-2.0.2/lib/message_bus/backends/redis.rb:331:in `block (2 levels) in global_subscribe'", 
"/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis/subscribe.rb:45:in `block in subscription'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis/client.rb:141:in `block (3 levels) in call_loop'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis/client.rb:135:in `loop'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis/client.rb:135:in `block (2 levels) in call_loop'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis/client.rb:231:in `block (2 levels) in process'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis/client.rb:367:in `ensure_connected'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis/client.rb:221:in `block in process'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis/client.rb:306:in `logging'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis/client.rb:220:in `process'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis/client.rb:134:in `block in call_loop'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis/client.rb:280:in `with_socket_timeout'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis/client.rb:133:in `call_loop'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis/subscribe.rb:43:in `subscription'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis/subscribe.rb:12:in `subscribe'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis.rb:2760:in `_subscription'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis.rb:2138:in `block in subscribe'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis.rb:58:in `block in synchronize'", "/usr/local/lib/ruby/2.3.0/monitor.rb:214:in `mon_synchronize'", 
"/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis.rb:58:in `synchronize'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis.rb:2137:in `subscribe'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/message_bus-2.0.2/lib/message_bus/backends/redis.rb:304:in `global_subscribe'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/message_bus-2.0.2/lib/message_bus.rb:513:in `global_subscribe_thread'", "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/message_bus-2.0.2/lib/message_bus.rb:461:in `block in new_subscriber_thread'"]

and quite a few “fatal” messages like this:

TypeError (no implicit conversion of nil into String)
/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis/connection/hiredis.rb:19:in `connect'
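
To make the long backtrace above easier to scan, here is a minimal Ruby sketch that collapses consecutive frames by the component they belong to. The names `GEM_FRAME`, `component_for` and `summarize` are hypothetical helpers, not part of Discourse, and the sketch assumes the path layout seen in the log (gems under `.../gems/<name>-<version>/`, application code under `/var/www/discourse/`). It only summarises the trace; it does not diagnose the cause.

```ruby
# Matches ".../gems/<name>-<version>/..." and captures <name>.
# Assumption: gem directories follow the layout shown in the log above.
GEM_FRAME = %r{/gems/([^/]+?)-\d[\d.]*/}

# Return a short component label for one backtrace frame.
def component_for(frame)
  if (m = GEM_FRAME.match(frame))
    m[1]                                  # e.g. "redis", "message_bus"
  elsif frame.start_with?("/var/www/discourse/")
    "discourse"                           # application code
  else
    "ruby"                                # anything else, e.g. stdlib monitor.rb
  end
end

# Collapse runs of consecutive frames from the same component
# into [component, frame_count] pairs, innermost frame first.
def summarize(frames)
  frames.map { |f| component_for(f) }
        .chunk_while { |a, b| a == b }
        .map { |run| [run.first, run.size] }
end

# A few frames copied from the warning above.
frames = [
  "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis/connection/hiredis.rb:19:in `connect'",
  "/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.1/lib/redis/client.rb:336:in `establish_connection'",
  "/usr/local/lib/ruby/2.3.0/monitor.rb:214:in `mon_synchronize'",
  "/var/www/discourse/lib/discourse_redis.rb:192:in `block in del'"
]

p summarize(frames)
# => [["redis", 2], ["ruby", 1], ["discourse", 1]]
```

Run over the full trace, this would show the order in which the call passes through the `message_bus`, Discourse, `activesupport` and `redis` frames, with the innermost frames being the `redis` gem's `connect` call.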

Does the fact that it was still logging suggest that it wasn’t Discourse itself but, perhaps, nginx that was the cause of the failure? Where do I need to look to pin this down further?

Thanks.

