I was prompted to upgrade today, and then required to do the build through launcher. Since then Discourse has been basically broken. Already logged in users can see the home page and select messages from there to display, but virtually all functions user and admin just hang with the throbber spinning. The production log is empty. I’ve tried stopping and restarting of course. Not even sure where to start looking. Ideas? Thanks!
UPDATE: As indicated in my various replies below, it appears that the root of the problem is that rdis is not accepting connections, and the rdis log extract below shows rdis apparently looping, each time starting with:
100 changes in 300 seconds. Saving…
and never reaching a “waiting for connection” state. Any suggestions on how to fix this issue would be much appreciated. Thank you.
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/message_bus-3.3.8/lib/message_bus/backends/redis.rb:274:in `block in global_subscribe'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/message_bus-3.3.8/lib/message_bus.rb:734:in `block in new_subscriber_thread'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis/client.rb:117:in `block in connect'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis/client.rb:269:in `block in process'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/rack-mini-profiler-2.3.3/lib/mini_profiler/profiling_methods.rb:85:in `block in profile_method'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis.rb:967:in `block in get'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis.rb:72:in `block in synchronize'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/message_bus-3.3.8/lib/message_bus/backends/redis.rb:274:in `block in global_subscribe'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/message_bus-3.3.8/lib/message_bus.rb:734:in `block in new_subscriber_thread'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis/client.rb:117:in `block in connect'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis/client.rb:269:in `block in process'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/rack-mini-profiler-2.3.3/lib/mini_profiler/profiling_methods.rb:85:in `block in profile_method'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis.rb:967:in `block in get'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis.rb:72:in `block in synchronize'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/message_bus-3.3.8/lib/message_bus/backends/redis.rb:274:in `block in global_subscribe'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/message_bus-3.3.8/lib/message_bus.rb:734:in `block in new_subscriber_thread'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis/client.rb:117:in `block in connect'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis/client.rb:269:in `block in process'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/rack-mini-profiler-2.3.3/lib/mini_profiler/profiling_methods.rb:85:in `block in profile_method'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis.rb:967:in `block in get'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis.rb:72:in `block in synchronize'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/message_bus-3.3.8/lib/message_bus/backends/redis.rb:274:in `block in global_subscribe'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/message_bus-3.3.8/lib/message_bus.rb:734:in `block in new_subscriber_thread'
3112:C 23 Dec 2021 01:56:54.912 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
3112:C 23 Dec 2021 01:56:54.912 # Redis version=6.2.6, bits=64, commit=00000000, modified=0, pid=3112, just started
3112:C 23 Dec 2021 01:56:54.912 # Configuration loaded
3112:M 23 Dec 2021 01:56:54.914 * monotonic clock: POSIX clock_gettime
3112:M 23 Dec 2021 01:56:55.105 * Running mode=standalone, port=6379.
3112:M 23 Dec 2021 01:56:55.105 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
3112:M 23 Dec 2021 01:56:55.105 # Server initialized
3112:M 23 Dec 2021 01:56:55.105 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
3112:M 23 Dec 2021 01:56:55.401 * Loading RDB produced by version 6.2.6
3112:M 23 Dec 2021 01:56:55.401 * RDB age 296 seconds
3112:M 23 Dec 2021 01:56:55.401 * RDB memory usage when created 4.86 Mb
3112:M 23 Dec 2021 01:56:55.634 # Done loading RDB, keys loaded: 2751, keys expired: 2.
3112:M 23 Dec 2021 01:56:55.634 * DB loaded from disk: 0.528 seconds
3112:M 23 Dec 2021 01:56:55.634 * Ready to accept connections
3112:M 23 Dec 2021 02:01:55.037 * 100 changes in 300 seconds. Saving...
3112:M 23 Dec 2021 02:01:55.038 * Background saving started by pid 3523
3523:C 23 Dec 2021 02:01:55.140 * DB saved on disk
3523:C 23 Dec 2021 02:01:55.142 * RDB: 0 MB of memory used by copy-on-write
3112:M 23 Dec 2021 02:01:55.239 * Background saving terminated with success
3112:M 23 Dec 2021 02:06:56.065 * 100 changes in 300 seconds. Saving...
3112:M 23 Dec 2021 02:06:56.065 * Background saving started by pid 3831
3831:C 23 Dec 2021 02:06:56.134 * DB saved on disk
3831:C 23 Dec 2021 02:06:56.136 * RDB: 0 MB of memory used by copy-on-write
3112:M 23 Dec 2021 02:06:56.166 * Background saving terminated with success
3112:M 23 Dec 2021 02:11:57.071 * 100 changes in 300 seconds. Saving...
3112:M 23 Dec 2021 02:11:57.071 * Background saving started by pid 4138
4138:C 23 Dec 2021 02:11:57.135 * DB saved on disk
4138:C 23 Dec 2021 02:11:57.137 * RDB: 0 MB of memory used by copy-on-write
3112:M 23 Dec 2021 02:11:57.172 * Background saving terminated with success
3112:M 23 Dec 2021 02:16:58.000 * 100 changes in 300 seconds. Saving...
3112:M 23 Dec 2021 02:16:58.001 * Background saving started by pid 4448
4448:C 23 Dec 2021 02:16:58.064 * DB saved on disk
4448:C 23 Dec 2021 02:16:58.066 * RDB: 0 MB of memory used by copy-on-write
3112:M 23 Dec 2021 02:16:58.101 * Background saving terminated with success
Rdis appears to be looping repeatedly with this and never reaching a state where it’s waiting for connections:
3112:M 23 Dec 2021 03:02:07.032 * 100 changes in 300 seconds. Saving...
3112:M 23 Dec 2021 03:02:07.033 * Background saving started by pid 7243
7243:C 23 Dec 2021 03:02:07.104 * DB saved on disk
7243:C 23 Dec 2021 03:02:07.105 * RDB: 0 MB of memory used by copy-on-write
3112:M 23 Dec 2021 03:02:07.134 * Background saving terminated with success
While we’re at it, perhaps also check disk space df -h /
And, if you don’t get any specific advice, another rebuild might be worth a try. Be sure to have a backup, and to have it downloaded too, just in case.
But really it feels like you need more specific advice.
Hi. I have a week’s worth or so of Discourse backups locally, though at this point I don’t know how to use them. Plenty of memory free (at least 1.5G) and free disk. I’ll try another rebuild now. Thanks.
Rebuilt. Bottom line is that it’s still broken the same way, though there were some changes I saw along the way, During the build I noticed:
139:C 23 Dec 2021 16:40:32.625 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
139:C 23 Dec 2021 16:40:32.625 # Redis version=6.2.6, bits=64, commit=00000000, modified=0, pid=139, just started
139:C 23 Dec 2021 16:40:32.626 # Configuration loaded
139:M 23 Dec 2021 16:40:32.627 * monotonic clock: POSIX clock_gettime
139:M 23 Dec 2021 16:40:32.627 # Warning: Could not create server TCP listening socket *:6379: bind: Address already in use
139:M 23 Dec 2021 16:40:32.628 # Failed listening on port 6379 (TCP), aborting.
Later I saw this, which was similar to the looping I saw earlier. However, it says used 1MB of memory for copy on write, and the redis log looping had said 0MB.
112:M 23 Dec 2021 16:46:08.703 * Background saving started by pid 715
715:C 23 Dec 2021 16:46:08.822 * DB saved on disk
715:C 23 Dec 2021 16:46:08.824 * RDB: 1 MB of memory used by copy-on-write
112:M 23 Dec 2021 16:46:08.905 * Background saving terminated with success
The current redis log:
4161:M 23 Dec 2021 16:55:11.867 * Loading RDB produced by version 6.2.6
4161:M 23 Dec 2021 16:55:11.867 * RDB age 281 seconds
4161:M 23 Dec 2021 16:55:11.867 * RDB memory usage when created 4.61 Mb
4161:M 23 Dec 2021 16:55:12.086 # Done loading RDB, keys loaded: 2743, keys expired: 0.
4161:M 23 Dec 2021 16:55:12.086 * DB loaded from disk: 0.265 seconds
4161:M 23 Dec 2021 16:55:12.086 * Ready to accept connections
4161:M 23 Dec 2021 17:00:12.088 * 100 changes in 300 seconds. Saving...
4161:M 23 Dec 2021 17:00:12.098 * Background saving started by pid 4612
4612:C 23 Dec 2021 17:00:12.189 * DB saved on disk
4612:C 23 Dec 2021 17:00:12.191 * RDB: 0 MB of memory used by copy-on-write
4161:M 23 Dec 2021 17:00:12.199 * Background saving terminated with success
I am seeing that 0MB copy-on-write notice again but I don’t know if that’s normal or not. I cannot open a telnet connection on that machine to localhost 6379, I get the error:
Unable to connect to remote host: Cannot assign requested address
The production log shows lots of blocks as before:
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis/client.rb:344:in `with_reconnect'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis/client.rb:116:in `connect'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis/client.rb:417:in `ensure_connected'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis/client.rb:269:in `block in process'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis/client.rb:356:in `logging'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis/client.rb:268:in `process'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis/client.rb:162:in `call'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/rack-mini-profiler-2.3.3/lib/mini_profiler/profiling_methods.rb:85:in `block in profile_method'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis.rb:967:in `block in get'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis.rb:72:in `block in synchronize'
/usr/local/lib/ruby/2.7.0/monitor.rb:202:in `synchronize'
/usr/local/lib/ruby/2.7.0/monitor.rb:202:in `mon_synchronize'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis.rb:72:in `synchronize'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis.rb:966:in `get'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/message_bus-3.3.8/lib/message_bus/backends/redis.rb:366:in `process_global_backlog'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/message_bus-3.3.8/lib/message_bus/backends/redis.rb:274:in `block in global_subscribe'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/message_bus-3.3.8/lib/message_bus/backends/redis.rb:287:in `global_subscribe'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/message_bus-3.3.8/lib/message_bus.rb:786:in `global_subscribe_thread'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/message_bus-3.3.8/lib/message_bus.rb:734:in `block in new_subscriber_thread'
Job exception: no address for meta.discourse.org (Resolv::ResolvError)
That redis output is normal and it does not necessarily indicate that Redis is not listening. It’s a background save that always occurs every so-many minutes. It does not mean it’s “looping” instead of functioning. I think it’s actually a good sign that it is showing this output. So something else must be wrong. Maybe it’s on a different port, maybe it can’t be reached for some other reason.
For reference, this is redis starting in the middle of my most recent rebuild:
111:C 22 Dec 2021 20:36:16.773 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
111:C 22 Dec 2021 20:36:16.773 # Redis version=6.2.6, bits=64, commit=00000000, modified=0, pid=111, just started
111:C 22 Dec 2021 20:36:16.773 # Configuration loaded
111:M 22 Dec 2021 20:36:16.774 * monotonic clock: POSIX clock_gettime
111:M 22 Dec 2021 20:36:16.775 * Running mode=standalone, port=6379.
111:M 22 Dec 2021 20:36:16.776 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
111:M 22 Dec 2021 20:36:16.776 # Server initialized
111:M 22 Dec 2021 20:36:16.777 * Loading RDB produced by version 6.2.1
111:M 22 Dec 2021 20:36:16.777 * RDB age 22 seconds
111:M 22 Dec 2021 20:36:16.777 * RDB memory usage when created 19.89 Mb
111:M 22 Dec 2021 20:36:16.823 # Done loading RDB, keys loaded: 4451, keys expired: 9.
111:M 22 Dec 2021 20:36:16.823 * DB loaded from disk: 0.047 seconds
111:M 22 Dec 2021 20:36:16.823 * Ready to accept connections
slightly later I see
I, [2021-12-22T20:36:27.079511 #1] INFO -- : > cd /var/www/discourse && git reset --hard
139:C 22 Dec 2021 20:36:27.084 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
139:C 22 Dec 2021 20:36:27.086 # Redis version=6.2.6, bits=64, commit=00000000, modified=0, pid=139, just started
139:C 22 Dec 2021 20:36:27.086 # Configuration loaded
139:M 22 Dec 2021 20:36:27.087 * monotonic clock: POSIX clock_gettime
139:M 22 Dec 2021 20:36:27.087 # Warning: Could not create server TCP listening socket *:6379: bind: Address already in use
139:M 22 Dec 2021 20:36:27.087 # Failed listening on port 6379 (TCP), aborting.
Checking out files: 100% (26188/26188), done.
I, [2021-12-22T20:36:30.153161 #1] INFO -- : HEAD is now at eb82849c FIX: none and all categories filter (#14999)
My log shows a clutch of entries every 5 mins. Perhaps this is normal, and your forum troubles are not related to redis??
$ less /var/discourse/shared/standalone/log/var-log/redis/current
23478:C 23 Dec 2021 18:47:35.233 * DB saved on disk
23478:C 23 Dec 2021 18:47:35.234 * RDB: 1 MB of memory used by copy-on-write
52:M 23 Dec 2021 18:47:35.290 * Background saving terminated with success
52:M 23 Dec 2021 18:52:36.033 * 100 changes in 300 seconds. Saving...
52:M 23 Dec 2021 18:52:36.034 * Background saving started by pid 23804
23804:C 23 Dec 2021 18:52:36.177 * DB saved on disk
23804:C 23 Dec 2021 18:52:36.178 * RDB: 1 MB of memory used by copy-on-write
52:M 23 Dec 2021 18:52:36.235 * Background saving terminated with success
52:M 23 Dec 2021 18:57:37.085 * 100 changes in 300 seconds. Saving...
52:M 23 Dec 2021 18:57:37.086 * Background saving started by pid 24137
24137:C 23 Dec 2021 18:57:37.221 * DB saved on disk
24137:C 23 Dec 2021 18:57:37.222 * RDB: 1 MB of memory used by copy-on-write
52:M 23 Dec 2021 18:57:37.287 * Background saving terminated with success
52:M 23 Dec 2021 19:02:38.033 * 100 changes in 300 seconds. Saving...
52:M 23 Dec 2021 19:02:38.034 * Background saving started by pid 24466