Redis 问题?(升级后论坛损坏)

我今天被提示升级,然后需要通过启动器进行构建。从那时起,Discourse 就基本损坏了。已登录用户可以看到主页并从中选择消息进行显示,但几乎所有的用户和管理员功能都会挂起,并显示旋转的指示器。生产日志为空。我当然尝试过停止和重新启动。甚至不知道从哪里开始查找。有什么想法吗?谢谢!

**更新:**正如我在下面的各种回复中所指出的,问题根源似乎是 rdis 未接受连接,并且下面的 rdis 日志摘录显示 rdis 似乎在循环,每次都以以下内容开始:

300 秒内 100 次更改。正在保存…

并且从未达到“等待连接”状态。非常感谢关于如何解决此问题的任何建议。谢谢。

1 个赞

之前的日志显示了各种块,例如:

/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/message_bus-3.3.8/lib/message_bus/backends/redis.rb:274:in `block in global_subscribe'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/message_bus-3.3.8/lib/message_bus.rb:734:in `block in new_subscriber_thread'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis/client.rb:117:in `block in connect'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis/client.rb:269:in `block in process'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/rack-mini-profiler-2.3.3/lib/mini_profiler/profiling_methods.rb:85:in `block in profile_method'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis.rb:967:in `block in get'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis.rb:72:in `block in synchronize'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/message_bus-3.3.8/lib/message_bus/backends/redis.rb:274:in `block in global_subscribe'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/message_bus-3.3.8/lib/message_bus.rb:734:in `block in new_subscriber_thread'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis/client.rb:117:in `block in connect'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis/client.rb:269:in `block in process'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/rack-mini-profiler-2.3.3/lib/mini_profiler/profiling_methods.rb:85:in `block in profile_method'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis.rb:967:in `block in get'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis.rb:72:in `block in synchronize'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/message_bus-3.3.8/lib/message_bus/backends/redis.rb:274:in `block in global_subscribe'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/message_bus-3.3.8/lib/message_bus.rb:734:in `block in new_subscriber_thread'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis/client.rb:117:in `block in connect'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis/client.rb:269:in `block in process'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/rack-mini-profiler-2.3.3/lib/mini_profiler/profiling_methods.rb:85:in `block in profile_method'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis.rb:967:in `block in get'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis.rb:72:in `block in synchronize'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/message_bus-3.3.8/lib/message_bus/backends/redis.rb:274:in `block in global_subscribe'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/message_bus-3.3.8/lib/message_bus.rb:734:in `block in new_subscriber_thread'
1 个赞

完全重启系统未能解决问题。完全重启后情况完全相同。

1 个赞

unicorn.stderr.log 显示许多与连接到 Redis 相关的错误,例如:

连接 Redis 时出错:连接到 localhost:6379 上的 Redis 时出错 (Errno::EADDRNOTAVAIL)

1 个赞

这是当前的 redis 日志:

3112:C 23 Dec 2021 01:56:54.912 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
3112:C 23 Dec 2021 01:56:54.912 # Redis version=6.2.6, bits=64, commit=00000000, modified=0, pid=3112, just started
3112:C 23 Dec 2021 01:56:54.912 # Configuration loaded
3112:M 23 Dec 2021 01:56:54.914 * monotonic clock: POSIX clock_gettime
3112:M 23 Dec 2021 01:56:55.105 * Running mode=standalone, port=6379.
3112:M 23 Dec 2021 01:56:55.105 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
3112:M 23 Dec 2021 01:56:55.105 # Server initialized
3112:M 23 Dec 2021 01:56:55.105 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
3112:M 23 Dec 2021 01:56:55.401 * Loading RDB produced by version 6.2.6
3112:M 23 Dec 2021 01:56:55.401 * RDB age 296 seconds
3112:M 23 Dec 2021 01:56:55.401 * RDB memory usage when created 4.86 Mb
3112:M 23 Dec 2021 01:56:55.634 # Done loading RDB, keys loaded: 2751, keys expired: 2.
3112:M 23 Dec 2021 01:56:55.634 * DB loaded from disk: 0.528 seconds
3112:M 23 Dec 2021 01:56:55.634 * Ready to accept connections
3112:M 23 Dec 2021 02:01:55.037 * 100 changes in 300 seconds. Saving...
3112:M 23 Dec 2021 02:01:55.038 * Background saving started by pid 3523
3523:C 23 Dec 2021 02:01:55.140 * DB saved on disk
3523:C 23 Dec 2021 02:01:55.142 * RDB: 0 MB of memory used by copy-on-write
3112:M 23 Dec 2021 02:01:55.239 * Background saving terminated with success
3112:M 23 Dec 2021 02:06:56.065 * 100 changes in 300 seconds. Saving...
3112:M 23 Dec 2021 02:06:56.065 * Background saving started by pid 3831
3831:C 23 Dec 2021 02:06:56.134 * DB saved on disk
3831:C 23 Dec 2021 02:06:56.136 * RDB: 0 MB of memory used by copy-on-write
3112:M 23 Dec 2021 02:06:56.166 * Background saving terminated with success
3112:M 23 Dec 2021 02:11:57.071 * 100 changes in 300 seconds. Saving...
3112:M 23 Dec 2021 02:11:57.071 * Background saving started by pid 4138
4138:C 23 Dec 2021 02:11:57.135 * DB saved on disk
4138:C 23 Dec 2021 02:11:57.137 * RDB: 0 MB of memory used by copy-on-write
3112:M 23 Dec 2021 02:11:57.172 * Background saving terminated with success
3112:M 23 Dec 2021 02:16:58.000 * 100 changes in 300 seconds. Saving...
3112:M 23 Dec 2021 02:16:58.001 * Background saving started by pid 4448
4448:C 23 Dec 2021 02:16:58.064 * DB saved on disk
4448:C 23 Dec 2021 02:16:58.066 * RDB: 0 MB of memory used by copy-on-write
3112:M 23 Dec 2021 02:16:58.101 * Background saving terminated with success
1 个赞

Rdis 似乎在重复循环,从未达到等待连接的状态:

3112:M 23 Dec 2021 03:02:07.032 * 100 changes in 300 seconds. Saving...
3112:M 23 Dec 2021 03:02:07.033 * Background saving started by pid 7243
7243:C 23 Dec 2021 03:02:07.104 * DB saved on disk
7243:C 23 Dec 2021 03:02:07.105 * RDB: 0 MB of memory used by copy-on-write
3112:M 23 Dec 2021 03:02:07.134 * Background saving terminated with success
2 个赞

也许值得检查一下内存:

free
vmstat 5 5 
uptime

顺便说一下,也许也该检查一下磁盘空间
df -h /

而且,如果你得不到任何具体的建议,或许可以再试一次重建。请务必做好备份,并将其下载好,以防万一。

但说实话,感觉你需要更具体的建议。

2 个赞

您好。我本地有大约一周的 Discourse 备份,但目前我不知道如何使用它们。内存充足(至少 1.5G)且磁盘空间也足够。我现在将尝试重新构建。谢谢。

1 个赞

重新构建。底线是它仍然以相同的方式损坏,尽管在此过程中我看到了一些变化。在构建过程中,我注意到:

139:C 23 Dec 2021 16:40:32.625 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
139:C 23 Dec 2021 16:40:32.625 # Redis version=6.2.6, bits=64, commit=00000000, modified=0, pid=139, just started
139:C 23 Dec 2021 16:40:32.626 # Configuration loaded
139:M 23 Dec 2021 16:40:32.627 * monotonic clock: POSIX clock_gettime
139:M 23 Dec 2021 16:40:32.627 # Warning: Could not create server TCP listening socket *:6379: bind: Address already in use
139:M 23 Dec 2021 16:40:32.628 # Failed listening on port 6379 (TCP), aborting.

稍后我看到这个,它与我之前看到的循环相似。但是,它说写时复制使用了 1MB 内存,而 redis 日志循环说的是 0MB。

112:M 23 Dec 2021 16:46:08.703 * Background saving started by pid 715
715:C 23 Dec 2021 16:46:08.822 * DB saved on disk
715:C 23 Dec 2021 16:46:08.824 * RDB: 1 MB of memory used by copy-on-write
112:M 23 Dec 2021 16:46:08.905 * Background saving terminated with success

当前的 redis 日志:

4161:M 23 Dec 2021 16:55:11.867 * Loading RDB produced by version 6.2.6
4161:M 23 Dec 2021 16:55:11.867 * RDB age 281 seconds
4161:M 23 Dec 2021 16:55:11.867 * RDB memory usage when created 4.61 Mb
4161:M 23 Dec 2021 16:55:12.086 # Done loading RDB, keys loaded: 2743, keys expired: 0.
4161:M 23 Dec 2021 16:55:12.086 * DB loaded from disk: 0.265 seconds
4161:M 23 Dec 2021 16:55:12.086 * Ready to accept connections
4161:M 23 Dec 2021 17:00:12.088 * 100 changes in 300 seconds. Saving...
4161:M 23 Dec 2021 17:00:12.098 * Background saving started by pid 4612
4612:C 23 Dec 2021 17:00:12.189 * DB saved on disk
4612:C 23 Dec 2021 17:00:12.191 * RDB: 0 MB of memory used by copy-on-write
4161:M 23 Dec 2021 17:00:12.199 * Background saving terminated with success

我再次看到那个 0MB 写时复制通知,但我不知道这是否正常。我无法在该机器上通过 telnet 连接到 localhost 6379,我收到错误:
无法连接到远程主机:无法分配请求的地址

生产日志显示大量阻塞,与之前一样:

/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis/client.rb:344:in `with_reconnect'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis/client.rb:116:in `connect'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis/client.rb:417:in `ensure_connected'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis/client.rb:269:in `block in process'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis/client.rb:356:in `logging'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis/client.rb:268:in `process'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis/client.rb:162:in `call'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/rack-mini-profiler-2.3.3/lib/mini_profiler/profiling_methods.rb:85:in `block in profile_method'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis.rb:967:in `block in get'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis.rb:72:in `block in synchronize'
/usr/local/lib/ruby/2.7.0/monitor.rb:202:in `synchronize'
/usr/local/lib/ruby/2.7.0/monitor.rb:202:in `mon_synchronize'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis.rb:72:in `synchronize'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/redis-4.5.1/lib/redis.rb:966:in `get'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/message_bus-3.3.8/lib/message_bus/backends/redis.rb:366:in `process_global_backlog'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/message_bus-3.3.8/lib/message_bus/backends/redis.rb:274:in `block in global_subscribe'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/message_bus-3.3.8/lib/message_bus/backends/redis.rb:287:in `global_subscribe'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/message_bus-3.3.8/lib/message_bus.rb:786:in `global_subscribe_thread'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/message_bus-3.3.8/lib/message_bus.rb:734:in `block in new_subscriber_thread'
Job exception: no address for meta.discourse.org (Resolv::ResolvError)
1 个赞

该 Redis 输出是正常的,并不一定表示 Redis 没有在侦听。这是每隔几分钟就会发生的后台保存。它并不意味着它在“循环”而不是在运行。我认为它显示此输出实际上是一个好兆头。所以一定还有别的问题。也许它在不同的端口上,也许由于其他原因无法访问。

2 个赞

非常感谢。对于所有这些块被记录下来以及命令挂起,还有其他可能性吗?

1 个赞

看起来已经有其他东西在使用该端口了。请考虑重启。

1 个赞

要查看哪个进程正在监听端口,请尝试
sudo lsof -n -i -P | egrep LISTEN

1 个赞

从 ps 看来,redis 正在运行,并且想要监听 6379 端口

/usr/bin/redis-server *:6379

但 lsof 没有显示任何实际监听该端口的进程。

1 个赞

作为参考,这是我最近一次重建过程中 Redis 启动时的日志:

111:C 22 Dec 2021 20:36:16.773 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
111:C 22 Dec 2021 20:36:16.773 # Redis version=6.2.6, bits=64, commit=00000000, modified=0, pid=111, just started
111:C 22 Dec 2021 20:36:16.773 # Configuration loaded
111:M 22 Dec 2021 20:36:16.774 * monotonic clock: POSIX clock_gettime
111:M 22 Dec 2021 20:36:16.775 * Running mode=standalone, port=6379.
111:M 22 Dec 2021 20:36:16.776 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
111:M 22 Dec 2021 20:36:16.776 # Server initialized
111:M 22 Dec 2021 20:36:16.777 * Loading RDB produced by version 6.2.1
111:M 22 Dec 2021 20:36:16.777 * RDB age 22 seconds
111:M 22 Dec 2021 20:36:16.777 * RDB memory usage when created 19.89 Mb
111:M 22 Dec 2021 20:36:16.823 # Done loading RDB, keys loaded: 4451, keys expired: 9.
111:M 22 Dec 2021 20:36:16.823 * DB loaded from disk: 0.047 seconds
111:M 22 Dec 2021 20:36:16.823 * Ready to accept connections

稍后我看到:

I, [2021-12-22T20:36:27.079511 #1]  INFO -- : cd /var/www/discourse & git reset --hard
139:C 22 Dec 2021 20:36:27.084 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
139:C 22 Dec 2021 20:36:27.086 # Redis version=6.2.6, bits=64, commit=00000000, modified=0, pid=139, just started
139:C 22 Dec 2021 20:36:27.086 # Configuration loaded
139:M 22 Dec 2021 20:36:27.087 * monotonic clock: POSIX clock_gettime
139:M 22 Dec 2021 20:36:27.087 # Warning: Could not create server TCP listening socket *:6379: bind: Address already in use
139:M 22 Dec 2021 20:36:27.087 # Failed listening on port 6379 (TCP), aborting.
Checking out files: 100% (26188/26188), done.
I, [2021-12-22T20:36:30.153161 #1]  INFO -- : HEAD is now at eb82849c FIX: none and all categories filter (#14999)
1 个赞

是的,这和我今天早上看到的构建差不多。但那个端口上实际上没有任何东西在监听。真是奇怪。

1 个赞

奇怪的是,我的设置运行正常,我有

root     26269  0.0  0.0   2160     0 ?        Ss   Dec22   0:00 runsv redis
root     26277  0.0  0.0   2304    24 ?        S    Dec22   0:00 svlogd /var/log/redis
uuidd    26278  0.2  0.4  83816  4088 ?        Sl   Dec22   3:31 /usr/bin/redis-server *:6379

然而,我也没有看到(从 lsof,容器内或外部)任何东西在监听该端口。

1 个赞

这有点令人鼓舞。但又有点不是……

1 个赞

我的日志显示每 5 分钟有一批条目。这也许是正常的,您的论坛问题与 redis 无关??

$ less /var/discourse/shared/standalone/log/var-log/redis/current
23478:C 23 Dec 2021 18:47:35.233 * DB 已保存到磁盘
23478:C 23 Dec 2021 18:47:35.234 * RDB:copy-on-write 使用了 1 MB 内存
52:M 23 Dec 2021 18:47:35.290 * 后台保存成功终止
52:M 23 Dec 2021 18:52:36.033 * 300 秒内有 100 次更改。正在保存...
52:M 23 Dec 2021 18:52:36.034 * 后台保存由进程 ID 23804 启动
23804:C 23 Dec 2021 18:52:36.177 * DB 已保存到磁盘
23804:C 23 Dec 2021 18:52:36.178 * RDB:copy-on-write 使用了 1 MB 内存
52:M 23 Dec 2021 18:52:36.235 * 后台保存成功终止
52:M 23 Dec 2021 18:57:37.085 * 300 秒内有 100 次更改。正在保存...
52:M 23 Dec 2021 18:57:37.086 * 后台保存由进程 ID 24137 启动
24137:C 23 Dec 2021 18:57:37.221 * DB 已保存到磁盘
24137:C 23 Dec 2021 18:57:37.222 * RDB:copy-on-write 使用了 1 MB 内存
52:M 23 Dec 2021 18:57:37.287 * 后台保存成功终止
52:M 23 Dec 2021 19:02:38.033 * 300 秒内有 100 次更改。正在保存...
52:M 23 Dec 2021 19:02:38.034 * 后台保存由进程 ID 24466 启动
1 个赞

谢谢。不过,这似乎意味着我回到了原点,论坛也坏了……哎哟。

1 个赞