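As soon as I saw that 3.3.0-beta1 had been released, I updated my Discourse instance through the web interface.
During the update, however, the log in the web interface stalled for more than ten minutes with no further output (I think the last thing it printed was a growing string of dots, but I'm not certain). About two hours later I checked the server from the cloud console, suspected it had frozen, and performed a soft reboot from there.
After the reboot, I immediately ran a Discourse backup from the command line, downloaded the backup and app.yml, and then reinstalled Discourse from scratch (the latest version, of course). I then uploaded the backup and started the restore from the command line.
The restore succeeded, but my Discourse now has serious performance problems. Previously, CPU usage stayed below 10% in normal use; now it climbs to around 30% even off-peak, and disk reads are relatively high as well. Worse, the server sometimes collapses for no obvious reason: disk reads hit 1,900 operations per second (the limit for my cloud server) and CPU time spent in I/O wait exceeds 40%. Pages fail to load and report a connection timeout. I had vmstat and top running at the time, but unfortunately I did not keep the output. I recall that swap I/O was almost zero, which suggests the load was purely disk reads. The number of blocked processes exceeded 100.
I suspect the failed update damaged the data in the backup rather than the software itself. Is there some way to (I'm not sure what exactly) flush or delete a cache, or something along those lines? Or... run the update again? (After all, Discourse is updated quite frequently, so an update is available almost any time.)
As a stopgap I installed a software watchdog that reboots the server automatically under high load. That is not a long-term solution, though, and I could not find a similar issue reported here, so this is evidently not a problem with the Discourse software itself. I would like to know how to fix it.
If you need me to run commands on the server while it is under high load to check its state, just ask. I will do my best to keep my SSH session alive and collect the data without rebooting.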
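What does your Sidekiq queue look like? It may simply be rebaking all the posts, in which case the load will gradually return to normal once the backlog is processed. In other words, the situation should be temporary.
I have run into this a few times recently too. Each time I rebooted manually and then ran a command-line rebuild. That is clearly not a long-term answer, and it is risky.
This is a sign that your machine is running out of memory. Consider adding swap to absorb the peaks.
I now find that I need a 4GB machine plus 3GB of swap to run an online update. How much memory do you have?
Hmm... actually, it did not hit extremely high load shortly after the restore, which I think is normal. In my case, though, after the load had dropped to about 2% following the restore, I made a few edits in the web interface, and the page suddenly stopped loading after one of them. When I checked the cloud console, CPU usage and disk reads were both very high. I don't think this is a matter of rebaking all the posts.
My server is in a normal state at the moment, so here is the Sidekiq dashboard: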
Robert:
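How much memory do you have?
I have 2GB of physical memory and 2GB of swap. It looks like the swap was enabled automatically during the Discourse installation? In any case, I did not configure it by hand.
I have now manually increased swap to 6GB in total.
OK, that's your problem.
In my recent experience, the web compilation step has become more resource-hungry. You could try a larger swap space or move to a bigger server.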
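Before adding swap or resizing the server, it can help to confirm how much RAM and swap the kernel actually sees. The following is a minimal sketch rather than anything posted in this thread: the function name `mem_swap_mib` is made up for illustration, and it relies only on the standard `MemTotal` and `SwapTotal` fields of `/proc/meminfo`.

```shell
#!/usr/bin/env bash
# Print total RAM and total swap in MiB from meminfo-formatted input.
# Reads the file given as $1, defaulting to /proc/meminfo.
# mem_swap_mib is a hypothetical helper name, not a Discourse tool.
mem_swap_mib() {
    awk '
        $1 == "MemTotal:"  { mem  = int($2 / 1024) }   # values are in kB
        $1 == "SwapTotal:" { swap = int($2 / 1024) }
        END { printf "ram_mib=%d swap_mib=%d\n", mem, swap }
    ' "${1:-/proc/meminfo}"
}
```

Calling `mem_swap_mib` with no argument reads the live system. Whether the totals are sufficient is a separate question; the 4GB plus 3GB figure mentioned above is one poster's experience, not a documented requirement.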
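By the way, when I used to run backups through the web interface, the log would sometimes hang. I now do all backups and restores from the command line (except the automatic backups, which have never failed and work fine). Is this a common problem? Is there a command for performing updates? Should I keep doing updates from the command line in future? Would that improve the success rate?
In everyday use I have also noticed that page loads are significantly slower than before, which is most likely a performance problem rather than a network one. Here is the performance data for loading the home page:
The server froze again. (I stopped the watchdog manually to capture a full log without a reboot.) Here is the top and vmstat output: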
Note that the top output here hasn’t updated much since the server freeze, unlike vmstat, so it may not reflect the latest data.
top - 21:53:16 up 2:19, 3 users, load average: 19.20, 4.89, 1.90
Tasks: 164 total, 1 running, 163 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.7 us, 7.2 sy, 0.0 ni, 0.0 id, 91.8 wa, 0.0 hi, 0.4 si, 0.0 st
MiB Mem : 1668.6 total, 75.9 free, 1473.1 used, 119.6 buff/cache
MiB Swap: 2048.0 total, 2048.0 free, 0.0 used. 36.8 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
92 root 20 0 0 0 0 S 2.8 0.0 0:08.01 kswapd0
2504 root 10 -10 123496 6932 0 D 1.6 0.4 2:59.44 AliYunDunMonito
90 root 0 -20 0 0 0 I 1.0 0.0 0:01.10 kworker/1:1H-kblockd
1320 root 20 0 1236124 2116 0 S 0.8 0.1 0:00.96 containerd-shim
2493 root 10 -10 87652 1744 0 S 0.6 0.1 0:44.69 AliYunDun
1008 root 20 0 1276372 2160 0 S 0.5 0.1 0:43.62 argusagent
725 root 20 0 689584 3860 0 S 0.3 0.2 0:08.16 aliyun-service
1964 message+ 20 0 74800 12512 0 D 0.3 0.7 0:19.97 redis-server
1973 www-data 20 0 56840 4872 472 D 0.3 0.3 0:01.36 nginx
2081 admin 20 0 519788 263284 4 S 0.3 15.4 0:07.88 ruby
2214 admin 25 5 5338856 495356 0 S 0.3 29.0 0:31.95 ruby
2227 admin 20 0 5246956 484488 4 S 0.3 28.4 0:33.31 ruby
2467 systemd+ 20 0 215196 15052 11348 D 0.3 0.9 0:00.45 postmaster
785 root 20 0 42256 748 0 S 0.2 0.0 0:07.58 AliYunDunUpdate
805 root 20 0 1798676 11832 0 S 0.2 0.7 0:02.24 containerd
1027 root 20 0 2281464 29296 0 S 0.2 1.7 0:01.96 dockerd
2056 systemd+ 20 0 67872 2192 0 D 0.2 0.1 0:00.39 postmaster
2243 root 20 0 17204 5068 2592 S 0.2 0.3 0:00.34 sshd
2555 root 20 0 10496 2736 1916 R 0.2 0.2 0:11.41 top
9998 admin 20 0 5057740 313028 4 S 0.2 18.3 0:04.83 ruby
11658 admin 20 0 10944 252 0 D 0.2 0.0 0:00.09 sleep
1 root 20 0 166776 6568 2808 D 0.1 0.4 0:01.42 systemd
22 root 20 0 0 0 0 S 0.1 0.0 0:00.31 ksoftirqd/1
149 root 0 -20 0 0 0 I 0.1 0.0 0:00.17 kworker/0:1H-kblockd
707 systemd+ 20 0 16260 2296 1212 D 0.1 0.1 0:00.16 systemd-network
723 root 20 0 19424 984 0 S 0.1 0.1 0:01.58 assist_daemon
1965 root 20 0 6680 280 0 D 0.1 0.0 0:00.05 cron
1974 www-data 20 0 55912 3996 472 D 0.1 0.2 0:00.98 nginx
2054 systemd+ 20 0 213244 6344 4384 S 0.1 0.4 0:02.02 postmaster
2055 systemd+ 20 0 213780 3208 924 D 0.1 0.2 0:00.12 postmaster
2270 root 20 0 17204 3880 1404 D 0.1 0.2 0:00.67 sshd
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 0 71456 624 134980 0 0 11302 4 2209 3766 1 1 96 2 0
0 1 0 72756 1452 132412 0 0 15250 12 2418 3969 1 2 94 4 0
0 0 0 67724 1244 137444 0 0 4485 2 2083 3700 1 1 98 1 0
0 0 0 80388 928 124660 0 0 26388 10 2650 4018 1 2 92 4 0
0 0 0 78256 696 126228 0 0 14752 6 2286 3792 1 1 95 3 0
0 0 0 72252 1680 130784 0 0 40334 41 2938 4212 1 2 89 7 0
0 0 0 74812 712 128588 0 0 15904 34 2309 3839 1 1 95 3 0
2 0 0 71120 856 132224 0 0 54282 5 3292 4382 1 3 85 10 0
0 0 0 73668 600 128232 0 0 30135 35 2795 4212 1 2 92 5 0
0 0 0 65160 1524 135636 0 0 30239 6 2842 4247 1 2 90 7 0
0 0 0 74480 1416 126484 0 0 11761 8 2254 3793 1 2 95 2 0
1 0 0 80096 1188 120752 0 0 37110 2 2907 4163 1 4 88 7 0
0 0 0 77880 688 122928 0 0 8439 5 2151 3719 1 1 97 2 0
0 8 0 76744 1256 122536 0 0 110986 14 4410 4860 1 6 34 59 0
0 13 0 71776 244 128496 0 0 126111 21 4386 4699 1 7 0 91 0
1 30 0 71860 356 127916 0 0 125040 21 4634 7331 1 7 0 91 0
3 31 0 77448 204 122688 0 0 125980 10 4167 8370 1 5 0 94 0
3 44 0 68184 168 132192 0 0 150011 0 4296 9015 0 7 0 93 0
1 44 0 74812 216 124700 0 0 204040 0 4949 10108 1 6 0 93 0
6 35 0 66588 244 133160 0 0 131782 2 6238 12786 1 7 0 93 0
1 39 0 68156 556 130996 0 0 144440 11 4608 9622 1 8 0 92 0
0 71 0 72256 212 126520 0 0 169879 12 4780 10242 1 10 0 89 0
0 91 0 66752 112 132476 0 0 542955 18 7422 17564 0 52 0 48 0
6 89 0 70324 172 129516 0 0 404754 21 6033 14379 0 52 0 48 0
3 87 0 60468 156 138060 0 0 499604 1 10640 26065 0 56 0 44 0
5 91 0 63456 152 135588 0 0 725747 11 10827 25806 0 55 0 45 0
4 101 0 69596 168 129132 0 0 558872 8 7755 18745 0 56 0 44 0
6 93 0 66516 156 132772 0 0 394003 3 12549 30622 0 54 0 46 0
4 94 0 62872 152 135976 0 0 656057 0 7790 18800 0 52 0 48 0
2 95 0 57072 156 141552 0 0 347776 0 10390 25943 0 56 0 44 0
4 97 0 66308 164 132964 0 0 641920 0 9963 24307 0 55 0 45 0
13 90 0 75620 152 123836 0 0 658883 0 7870 19260 0 55 0 45 0
2 96 0 66244 152 131496 0 0 764233 0 16078 40498 0 55 0 45 0
1 102 0 60156 224 137440 0 0 542241 0 10243 24560 0 54 0 46 0
13 90 0 62488 164 135548 0 0 671301 0 13771 34230 0 55 0 45 0
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
3 101 0 61812 100 136048 0 0 548615 0 10328 26305 0 55 0 45 0
2 102 0 61412 92 135928 0 0 1003845 0 17283 42751 0 56 0 44 0
8 98 0 60408 152 136896 0 0 662925 0 11369 29130 0 55 0 45 0
2 104 0 68292 152 129552 0 0 628286 0 13772 34122 0 56 0 44 0
2 99 0 65352 152 132436 0 0 494863 5 8766 21559 0 54 0 46 0
10 94 0 58344 96 138800 0 0 669390 0 9294 22968 0 56 0 44 0
3 95 0 61180 100 135444 0 0 598622 0 13342 32430 0 57 0 43 0
8 96 0 59444 160 137584 0 0 720255 0 10812 25735 0 57 0 43 0
14 89 0 66844 160 130072 0 0 665390 0 17496 43187 0 56 0 44 0
1 98 0 64668 152 132756 0 0 498328 0 7262 18622 0 55 0 45 0
1 98 0 61848 172 135060 0 0 438272 0 7693 19947 0 56 0 44 0
3 102 0 60636 152 136548 0 0 1020531 0 17162 43981 0 55 0 45 0
2 93 0 55096 164 141908 0 0 595678 0 14379 37088 0 55 0 45 0
9 100 0 61704 152 134692 0 0 368380 0 10621 27539 0 55 0 45 0
2 104 0 63784 148 132620 0 0 767905 8 11090 28866 0 56 0 44 0
2 110 0 60124 232 135636 0 0 882479 0 14778 38194 0 55 0 45 0
3 109 0 64632 156 131448 0 0 489138 0 11521 30264 0 56 0 44 0
2 107 0 65852 36 130592 0 0 1214409 0 17572 45932 0 55 0 45 0
2 97 0 60260 96 135668 0 0 559811 0 14592 37198 0 56 0 44 0
1 103 0 56064 104 139668 0 0 514522 0 11059 26834 0 57 0 43 0
7 97 0 62624 160 133492 0 0 646888 4 7819 19552 0 57 0 43 0
8 102 0 65404 152 130172 0 0 526268 0 11623 29396 0 56 0 44 0
I'll leave the server in its frozen state without rebooting. If you need me to execute some commands on the server to check its status during high loads, feel free to ask. I'll do my best to maintain my SSH connection and obtain all data without rebooting.
Other data from the cloud server console:
Ed_S (Ed S), March 29, 2024, 16:10, post #9
tumbleweed:
AliYunDunMonito
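This looks suspicious; it may or may not be the cause. Something is (probably) doing a lot of disk reads. AliYunDun appears to be a tool from the cloud provider. AliYunDunMonitor is (possibly) stuck in disk wait, is running at elevated priority, and has already consumed a lot of CPU time.
I saw that someone somewhere has written a script to kill these processes. The tone was angry. (I cannot tell whether the script is safe, or whether it is a good idea.)
But it is also possible that some Discourse process is doing all the disk reads.
Maybe try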
apt-get install iotop
iotop -b -d 15 -P -o
and then watch it until the freeze happens.
OK, here is the iotop output during a freeze:
Total DISK READ: 7.92 M/s | Total DISK WRITE: 11.70 K/s
Current DISK READ: 12.14 M/s | Current DISK WRITE: 17.29 K/s
PID PRIO USER DISK READ DISK WRITE SWAPIN IO COMMAND
1 be/4 root 6.12 K/s 0.00 B/s ?unavailable? init noibrs
300 be/3 root 0.00 B/s 2.93 K/s ?unavailable? [jbd2/vda3-8]
374 be/3 root 10.64 K/s 0.00 B/s ?unavailable? systemd-journald
709 be/4 systemd- 3.99 K/s 0.00 B/s ?unavailable? systemd-resolved
723 be/4 root 51.07 K/s 0.00 B/s ?unavailable? assist_daemon
725 be/4 root 856.83 K/s 272.40 B/s ?unavailable? aliyun-service
770 be/4 root 63.31 K/s 0.00 B/s ?unavailable? python3 -Es /usr/sbin/tuned -l -P
785 be/4 root 12.77 K/s 0.00 B/s ?unavailable? AliYunDunUpdate
805 be/4 root 389.98 K/s 0.00 B/s ?unavailable? containerd
1008 be/4 root 242.60 K/s 544.80 B/s ?unavailable? argusagent
1027 be/4 root 690.31 K/s 0.00 B/s ?unavailable? dockerd -H fd:// --containerd=/run/containerd/containerd.sock
1320 be/4 root 241.01 K/s 0.00 B/s ?unavailable? containerd-shim-runc-v2 -namespace moby -id 6e9880833995c5e2a295ed6571129387036824d304e09f43d5e1333ebd7fdbd5 -address /run/containerd/containerd.sock
1953 be/4 root 1906.79 B/s 0.00 B/s ?unavailable? runsvdir -P /etc/service
1962 be/4 admin 85.12 K/s 0.00 B/s ?unavailable? bash config/unicorn_launcher -E production -c config/unicorn.conf.rb
1964 be/4 messageb 611.83 K/s 0.00 B/s ?unavailable? redis-server *:6379
1973 be/4 www-data 530.17 K/s 272.40 B/s ?unavailable? nginx: worker process
2054 be/4 systemd- 8.78 K/s 7.45 K/s ?unavailable? postgres: 13/main: walwriter
2056 be/4 systemd- 102.42 K/s 0.00 B/s ?unavailable? postgres: 13/main: stats collector
2081 be/4 admin 817.20 B/s 0.00 B/s ?unavailable? unicorn master -E production -c config/unicorn.conf.rb
2493 ?dif root 132.21 K/s 0.00 B/s ?unavailable? AliYunDun
2504 be/2 root 740.32 K/s 0.00 B/s ?unavailable? AliYunDunMonitor
45686 be/4 root 47.62 K/s 0.00 B/s ?unavailable? sshd: root@pts/0
45816 be/4 root 36.98 K/s 0.00 B/s ?unavailable? top
46146 be/4 root 357.52 K/s 0.00 B/s ?unavailable? python3 /usr/sbin/iotop -b -d 15 -P -o
46160 be/4 root 267.34 K/s 0.00 B/s ?unavailable? sshd: root@pts/2
46232 be/4 root 174.24 K/s 0.00 B/s ?unavailable? vmstat 5
51903 be/4 admin 1569.22 K/s 272.40 B/s ?unavailable? unicorn worker[0] -E production -c config/unicorn.conf.rb
51972 be/4 admin 175.84 K/s 0.00 B/s ?unavailable? unicorn worker[1] -E production -c config/unicorn.conf.rb
60365 be/4 root 272.40 B/s 0.00 B/s ?unavailable? [kworker/u4:0-writeback]
63081 be/4 root 272.40 B/s 0.00 B/s ?unavailable? [kworker/u4:4-events_unbound]
63825 be/4 systemd- 673.02 K/s 0.00 B/s ?unavailable? postgres: 13/main: discourse discourse [local] idle
64191 be/4 root 23.14 K/s 0.00 B/s ?unavailable? -bash
In fact, I managed to crash the site by refreshing the home page with F5 a few times... so I do think this is most likely a Discourse problem.
The top output now:
top - 13:36:47 up 18:03, 4 users, load average: 86.10, 47.87, 19.63
Tasks: 174 total, 4 running, 170 sleeping, 0 stopped, 0 zombie
%Cpu(s): 2.2 us, 21.1 sy, 0.0 ni, 0.0 id, 76.3 wa, 0.0 hi, 0.4 si, 0.0 st
MiB Mem : 1668.6 total, 69.6 free, 1413.6 used, 185.4 buff/cache
MiB Swap: 2048.0 total, 2048.0 free, 0.0 used. 36.9 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
92 root 20 0 0 0 0 S 37.8 0.0 100:17.66 kswapd0
1320 root 20 0 1236124 2320 0 S 4.0 0.1 4:18.47 containerd-shim
1027 root 20 0 2281464 29160 0 S 2.6 1.7 2:25.38 dockerd
805 root 20 0 1798676 11844 0 S 2.5 0.7 2:28.22 containerd
2504 root 10 -10 129972 11660 0 D 2.1 0.7 27:34.39 AliYunDunMonito
90 root 0 -20 0 0 0 R 1.5 0.0 1:58.75 kworker/1:1H-kblockd
2493 root 10 -10 88144 2244 0 D 1.1 0.1 10:49.21 AliYunDun
51989 root 20 0 1318960 8980 0 S 0.8 0.5 0:46.15 snapd
1008 root 20 0 1276372 2584 0 S 0.7 0.2 11:16.56 argusagent
46146 root 20 0 22320 9516 1908 D 0.5 0.6 0:23.57 iotop
46160 root 20 0 17200 4968 2496 D 0.3 0.3 0:03.41 sshd
45686 root 20 0 17200 5188 2716 D 0.3 0.3 0:04.21 sshd
785 root 20 0 42324 876 0 S 0.2 0.1 3:41.24 AliYunDunUpdate
45816 root 20 0 10500 3352 2544 R 0.2 0.2 0:26.01 top
51903 admin 20 0 845708 458296 4 S 0.2 26.8 0:32.36 ruby
53938 admin 25 5 5285240 378040 0 S 0.2 22.1 0:33.50 ruby
723 root 20 0 19424 996 0 S 0.2 0.1 1:57.31 assist_daemon
725 root 20 0 689584 3800 0 S 0.2 0.2 3:06.25 aliyun-service
1964 message+ 20 0 74800 13944 0 D 0.2 0.8 2:51.80 redis-server
1974 www-data 20 0 57696 6136 772 R 0.2 0.4 0:36.37 nginx
1975 www-data 20 0 53508 1520 72 S 0.2 0.1 0:45.76 nginx
2053 systemd+ 20 0 213244 6952 4988 D 0.2 0.4 0:27.47 postmaster
51972 admin 20 0 5139692 329104 4 S 0.2 19.3 0:08.21 ruby
64309 systemd+ 20 0 213244 1956 0 D 0.2 0.1 0:00.15 postmaster
22 root 20 0 0 0 0 S 0.1 0.0 0:09.66 ksoftirqd/1
149 root 0 -20 0 0 0 I 0.1 0.0 0:06.26 kworker/0:1H-kblockd
1973 www-data 20 0 56576 5088 768 D 0.1 0.3 0:35.23 nginx
2052 systemd+ 20 0 213468 34408 32340 D 0.1 2.0 0:30.59 postmaster
2055 systemd+ 20 0 213780 3244 932 D 0.1 0.2 0:29.32 postmaster
2056 systemd+ 20 0 68132 2256 0 D 0.1 0.1 0:15.54 postmaster
2081 admin 20 0 585324 280488 4 S 0.1 16.4 0:57.44 ruby
And here is the vmstat output:
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 0 80348 884 230624 0 0 538 6 1980 3661 1 1 99 0 0
0 0 0 78232 2780 226576 0 0 17644 20 2510 4063 3 1 93 2 0
3 0 0 93580 176 186012 0 0 34986 19 3329 4751 13 4 78 6 0
0 29 0 82988 192 177628 0 0 127805 10 4930 6456 5 7 34 54 0
0 30 0 85520 204 175124 0 0 120935 6 3768 6887 1 5 0 94 0
3 37 0 89408 232 170836 0 0 122472 35 3876 7157 1 7 0 92 0
10 80 0 74004 220 186372 0 0 476674 36 4532 10212 0 35 0 64 0
11 89 0 71564 160 190284 0 0 388222 0 11098 27886 0 53 0 47 0
0 100 0 73508 160 188484 0 0 209539 4 4726 11652 0 32 0 68 0
I'm no Linux expert, but isn't the kswapd0 process responsible for managing memory and swap? Why is its sy CPU usage so high when swpd, si, and so in vmstat are all 0? I'm completely baffled.
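The pattern in the vmstat samples above (the `b` and `wa` columns jump while `si` and `so` stay at 0) can be pulled out of a long log mechanically. kswapd0 reclaims memory in general, including page-cache pages, so it can be busy even when nothing is written to swap. The sketch below is only an illustration: the function name `vmstat_stalls` and both thresholds are assumptions, and the column positions assume the default 17-column vmstat layout shown in this thread.

```shell
#!/usr/bin/env bash
# Filter vmstat output (stdin) for samples that look like an I/O stall.
# Column positions follow the default vmstat layout shown above:
#   $2 = b (blocked processes), $7 = si, $8 = so, $9 = bi, $16 = wa.
# The default thresholds are arbitrary assumptions for illustration.
vmstat_stalls() {
    local min_blocked="${1:-10}" min_wa="${2:-40}"
    awk -v mb="$min_blocked" -v mw="$min_wa" '
        $1 ~ /^[0-9]+$/ && NF >= 17 {      # skips the two header lines
            if ($2 >= mb && $16 >= mw)
                printf "b=%d wa=%d%% bi=%d si=%d so=%d\n", $2, $16, $9, $7, $8
        }
    '
}
```

It could be used as `vmstat 5 | vmstat_stalls`, or fed a saved log on stdin. Stalled samples with large `bi` and zero `si`/`so` point at reads from disk rather than swap traffic, which matches what was reported here.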
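Ed_S (Ed S), March 30, 2024, 08:36, post #11
You have 2G of swap but no swap in use and no swap activity, which is very suspicious. I suspect your kernel settings. Here are mine. Please show all of yours; please don't cherry-pick!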
head /proc/sys/vm/*
==> /proc/sys/vm/admin_reserve_kbytes <==
8192
==> /proc/sys/vm/block_dump <==
0
head: cannot open '/proc/sys/vm/compact_memory' for reading: Permission denied
==> /proc/sys/vm/compact_unevictable_allowed <==
1
==> /proc/sys/vm/dirty_background_bytes <==
0
==> /proc/sys/vm/dirty_background_ratio <==
10
==> /proc/sys/vm/dirty_bytes <==
0
==> /proc/sys/vm/dirty_expire_centisecs <==
3000
==> /proc/sys/vm/dirty_ratio <==
20
==> /proc/sys/vm/dirty_writeback_centisecs <==
500
==> /proc/sys/vm/dirtytime_expire_seconds <==
43200
==> /proc/sys/vm/drop_caches <==
0
==> /proc/sys/vm/extfrag_threshold <==
500
==> /proc/sys/vm/hugepages_treat_as_movable <==
0
==> /proc/sys/vm/hugetlb_shm_group <==
0
==> /proc/sys/vm/laptop_mode <==
0
==> /proc/sys/vm/legacy_va_layout <==
0
==> /proc/sys/vm/lowmem_reserve_ratio <==
256 256 32 1
==> /proc/sys/vm/max_map_count <==
65530
==> /proc/sys/vm/memory_failure_early_kill <==
0
==> /proc/sys/vm/memory_failure_recovery <==
1
==> /proc/sys/vm/min_free_kbytes <==
45056
==> /proc/sys/vm/min_slab_ratio <==
5
==> /proc/sys/vm/min_unmapped_ratio <==
1
==> /proc/sys/vm/mmap_min_addr <==
65536
==> /proc/sys/vm/mmap_rnd_bits <==
28
==> /proc/sys/vm/mmap_rnd_compat_bits <==
8
==> /proc/sys/vm/nr_hugepages <==
0
==> /proc/sys/vm/nr_hugepages_mempolicy <==
0
==> /proc/sys/vm/nr_overcommit_hugepages <==
0
==> /proc/sys/vm/numa_stat <==
1
==> /proc/sys/vm/numa_zonelist_order <==
Node
==> /proc/sys/vm/oom_dump_tasks <==
1
==> /proc/sys/vm/oom_kill_allocating_task <==
0
==> /proc/sys/vm/overcommit_kbytes <==
0
==> /proc/sys/vm/overcommit_memory <==
1
==> /proc/sys/vm/overcommit_ratio <==
50
==> /proc/sys/vm/page-cluster <==
3
==> /proc/sys/vm/panic_on_oom <==
0
==> /proc/sys/vm/percpu_pagelist_fraction <==
0
==> /proc/sys/vm/stat_interval <==
1
==> /proc/sys/vm/stat_refresh <==
==> /proc/sys/vm/swappiness <==
60
==> /proc/sys/vm/user_reserve_kbytes <==
29305
==> /proc/sys/vm/vfs_cache_pressure <==
100
==> /proc/sys/vm/watermark_scale_factor <==
10
==> /proc/sys/vm/zone_reclaim_mode <==
0
Edit: please also run these and post the output
swapon
free
uname -a
df -T
root@iZj6cgi365ov99veqodfgnZ:~# head /proc/sys/vm/*
==> /proc/sys/vm/admin_reserve_kbytes <==
8192
==> /proc/sys/vm/compaction_proactiveness <==
20
head: cannot open '/proc/sys/vm/compact_memory' for reading: Permission denied
==> /proc/sys/vm/compact_unevictable_allowed <==
1
==> /proc/sys/vm/dirty_background_bytes <==
0
==> /proc/sys/vm/dirty_background_ratio <==
10
==> /proc/sys/vm/dirty_bytes <==
0
==> /proc/sys/vm/dirty_expire_centisecs <==
3000
==> /proc/sys/vm/dirty_ratio <==
30
==> /proc/sys/vm/dirtytime_expire_seconds <==
43200
==> /proc/sys/vm/dirty_writeback_centisecs <==
500
head: cannot open '/proc/sys/vm/drop_caches' for reading: Permission denied
==> /proc/sys/vm/extfrag_threshold <==
500
==> /proc/sys/vm/hugetlb_shm_group <==
0
==> /proc/sys/vm/laptop_mode <==
0
==> /proc/sys/vm/legacy_va_layout <==
0
==> /proc/sys/vm/lowmem_reserve_ratio <==
256 256 32 0 0
==> /proc/sys/vm/max_map_count <==
65530
==> /proc/sys/vm/memory_failure_early_kill <==
0
==> /proc/sys/vm/memory_failure_recovery <==
1
==> /proc/sys/vm/min_free_kbytes <==
45056
==> /proc/sys/vm/min_slab_ratio <==
5
==> /proc/sys/vm/min_unmapped_ratio <==
1
==> /proc/sys/vm/mmap_min_addr <==
65536
==> /proc/sys/vm/mmap_rnd_bits <==
28
==> /proc/sys/vm/mmap_rnd_compat_bits <==
8
==> /proc/sys/vm/nr_hugepages <==
0
==> /proc/sys/vm/nr_hugepages_mempolicy <==
0
==> /proc/sys/vm/nr_overcommit_hugepages <==
0
==> /proc/sys/vm/numa_stat <==
1
==> /proc/sys/vm/numa_zonelist_order <==
Node
==> /proc/sys/vm/oom_dump_tasks <==
1
==> /proc/sys/vm/oom_kill_allocating_task <==
0
==> /proc/sys/vm/overcommit_kbytes <==
0
==> /proc/sys/vm/overcommit_memory <==
0
==> /proc/sys/vm/overcommit_ratio <==
50
==> /proc/sys/vm/page-cluster <==
3
==> /proc/sys/vm/page_lock_unfairness <==
5
==> /proc/sys/vm/panic_on_oom <==
0
==> /proc/sys/vm/percpu_pagelist_high_fraction <==
0
==> /proc/sys/vm/stat_interval <==
1
==> /proc/sys/vm/stat_refresh <==
==> /proc/sys/vm/swappiness <==
0
==> /proc/sys/vm/unprivileged_userfaultfd <==
0
==> /proc/sys/vm/user_reserve_kbytes <==
50778
==> /proc/sys/vm/vfs_cache_pressure <==
100
==> /proc/sys/vm/watermark_boost_factor <==
15000
==> /proc/sys/vm/watermark_scale_factor <==
10
==> /proc/sys/vm/zone_reclaim_mode <==
0
root@iZj6cgi365ov99veqodfgnZ:~# swapon
NAME TYPE SIZE USED PRIO
/swapfile file 2G 0B -2
root@iZj6cgi365ov99veqodfgnZ:~# free
total used free shared buff/cache available
Mem: 1708636 984352 79376 38344 644908 490004
Swap: 2097148 0 2097148
root@iZj6cgi365ov99veqodfgnZ:~# uname -a
Linux iZj6cgi365ov99veqodfgnZ 5.15.0-86-generic #96-Ubuntu SMP Wed Sep 20 08:23:49 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
root@iZj6cgi365ov99veqodfgnZ:~# df -T
Filesystem Type 1K-blocks Used Available Use% Mounted on
tmpfs tmpfs 170864 1164 169700 1% /run
/dev/vda3 ext4 61540412 19791160 39020044 34% /
tmpfs tmpfs 854316 0 854316 0% /dev/shm
tmpfs tmpfs 5120 0 5120 0% /run/lock
/dev/vda2 vfat 201615 6186 195429 4% /boot/efi
tmpfs tmpfs 170860 4 170856 1% /run/user/0
overlay overlay 61540412 19791160 39020044 34% /var/lib/docker/overlay2/7754829d0ad68c8acc8b50ed96ae87d8d882a996b87fe0b6821827e527487b62/merged
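Ed_S (Ed S), March 30, 2024, 09:50, post #13
Thanks, that's great. Somehow your swappiness is set to zero, which is almost certainly most of your problem.
But there are three settings worth getting right: one is swappiness, and the other two are mentioned in MKJ's opinionated Discourse deployment configuration.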
/proc/sys/vm/overcommit_memory
/sys/kernel/mm/transparent_hugepage/enabled
/proc/sys/vm/swappiness
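You can set these values once by hand, but you should also make sure they are set at boot. See MKJ's post, and check these files before overwriting them.
I suspect something at boot time is setting swappiness to zero. It may be your hosting provider trying to avoid writes to an SSD. But you must have usable swap, or else you need to upgrade to at least 4G of RAM.
Maybe try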
egrep -r swappiness /etc/
to find where it is being set.
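A slightly narrower variant of that search is sketched here, assuming the setting comes from one of the usual sysctl configuration locations. The helper name `find_sysctl_key` is hypothetical, and a provider could just as well set the value from an init script that this would miss. It uses plain `grep` because recent GNU grep releases mark `egrep` as obsolescent.

```shell
#!/usr/bin/env bash
# List every non-comment line that mentions a sysctl key in the given
# files or directories. find_sysctl_key is a hypothetical helper name.
find_sysctl_key() {
    local key="$1"; shift
    # -r: recurse, -n: line numbers, -s: ignore missing paths,
    # -H: always print the file name.
    # The second grep drops lines commented out with # or ;.
    grep -rnsH -- "$key" "$@" | grep -Ev '^[^:]+:[0-9]+:[[:space:]]*[#;]'
}
```

For example: `find_sysctl_key swappiness /etc/sysctl.conf /etc/sysctl.d /usr/lib/sysctl.d /run/sysctl.d`. Each hit is printed as `file:line:content`, which shows both where the value is set and what it is set to.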
Elsewhere on this forum there are suggestions to set swappiness to 10; I prefer to leave it at the default of 60.
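Sorry... I'm not very good with Linux. Could you walk me through exactly how to set these?
I have never changed any system settings by hand. However, I think the hosting provider may have restricted swap because this server appears to run on HDD (judging by the 2k IOPS ceiling).
Will the following permanently change the settings you mentioned?
Run: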
echo 'sys.kernel.mm.transparent_hugepage.enabled=never' > /etc/sysctl.d/10-huge-pages.conf
echo 'vm.overcommit_memory=1' > /etc/sysctl.d/90-vm_overcommit_memory.conf
and change vm.swappiness = 60 in /etc/sysctl.conf
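One caveat about the commands above: sysctl keys map onto files under /proc/sys, while the transparent hugepage switch lives under /sys/kernel/mm, so a line in /etc/sysctl.d cannot change it. That is consistent with the `[madvise]` value still shown after the reboot later in this thread. For the two vm.* settings, a drop-in file does work. The sketch below is an assumption-laden illustration: `write_vm_dropin` and the file name `90-discourse-vm.conf` are made up, and the directory is a parameter so it can be tried in a scratch location first.

```shell
#!/usr/bin/env bash
# Write the two vm.* settings discussed above to a sysctl drop-in file.
# On a real system the directory would be /etc/sysctl.d (root required).
# The name 90-discourse-vm.conf is an arbitrary choice for illustration.
write_vm_dropin() {
    local dir="$1"
    local file="$dir/90-discourse-vm.conf"
    printf '%s\n' \
        'vm.swappiness = 60' \
        'vm.overcommit_memory = 1' > "$file"
    echo "$file"    # report where the settings were written
}
```

On the server this would be `write_vm_dropin /etc/sysctl.d` followed by `sysctl --system` to load it without a reboot. A file read later can override an earlier one, so it is worth checking for conflicting entries before relying on it.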
Now: (I rebooted after changing all the settings. The commands below were run after the reboot.)
root@iZj6cgi365ov99veqodfgnZ:~# head /proc/sys/vm/*
==> /proc/sys/vm/admin_reserve_kbytes <==
8192
==> /proc/sys/vm/compaction_proactiveness <==
20
head: cannot open '/proc/sys/vm/compact_memory' for reading: Permission denied
==> /proc/sys/vm/compact_unevictable_allowed <==
1
==> /proc/sys/vm/dirty_background_bytes <==
0
==> /proc/sys/vm/dirty_background_ratio <==
10
==> /proc/sys/vm/dirty_bytes <==
0
==> /proc/sys/vm/dirty_expire_centisecs <==
3000
==> /proc/sys/vm/dirty_ratio <==
30
==> /proc/sys/vm/dirtytime_expire_seconds <==
43200
==> /proc/sys/vm/dirty_writeback_centisecs <==
500
head: cannot open '/proc/sys/vm/drop_caches' for reading: Permission denied
==> /proc/sys/vm/extfrag_threshold <==
500
==> /proc/sys/vm/hugetlb_shm_group <==
0
==> /proc/sys/vm/laptop_mode <==
0
==> /proc/sys/vm/legacy_va_layout <==
0
==> /proc/sys/vm/lowmem_reserve_ratio <==
256 256 32 0 0
==> /proc/sys/vm/max_map_count <==
65530
==> /proc/sys/vm/memory_failure_early_kill <==
0
==> /proc/sys/vm/memory_failure_recovery <==
1
==> /proc/sys/vm/min_free_kbytes <==
45056
==> /proc/sys/vm/min_slab_ratio <==
5
==> /proc/sys/vm/min_unmapped_ratio <==
1
==> /proc/sys/vm/mmap_min_addr <==
65536
==> /proc/sys/vm/mmap_rnd_bits <==
28
==> /proc/sys/vm/mmap_rnd_compat_bits <==
8
==> /proc/sys/vm/nr_hugepages <==
0
==> /proc/sys/vm/nr_hugepages_mempolicy <==
0
==> /proc/sys/vm/nr_overcommit_hugepages <==
0
==> /proc/sys/vm/numa_stat <==
1
==> /proc/sys/vm/numa_zonelist_order <==
Node
==> /proc/sys/vm/oom_dump_tasks <==
1
==> /proc/sys/vm/oom_kill_allocating_task <==
0
==> /proc/sys/vm/overcommit_kbytes <==
0
==> /proc/sys/vm/overcommit_memory <==
1
==> /proc/sys/vm/overcommit_ratio <==
50
==> /proc/sys/vm/page-cluster <==
3
==> /proc/sys/vm/page_lock_unfairness <==
5
==> /proc/sys/vm/panic_on_oom <==
0
==> /proc/sys/vm/percpu_pagelist_high_fraction <==
0
==> /proc/sys/vm/stat_interval <==
1
==> /proc/sys/vm/stat_refresh <==
==> /proc/sys/vm/swappiness <==
60
==> /proc/sys/vm/unprivileged_userfaultfd <==
0
==> /proc/sys/vm/user_reserve_kbytes <==
50778
==> /proc/sys/vm/vfs_cache_pressure <==
100
==> /proc/sys/vm/watermark_boost_factor <==
15000
==> /proc/sys/vm/watermark_scale_factor <==
10
==> /proc/sys/vm/zone_reclaim_mode <==
0
root@iZj6cgi365ov99veqodfgnZ:~# cat /sys/kernel/mm/transparent_hugepage/enabled
always [madvise] never
It looks like it's running well! Thank you all very much for your help!
Ed_S (Ed S), March 30, 2024, 12:38, post #15
That's good progress! I recommend
# cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
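To reach the recommended state, the value has to be written to the sysfs file itself. The sketch below only reads the current mode; `thp_active_mode` is a hypothetical helper name, and the write command is left as a comment because it needs root and changes the running system.

```shell
#!/usr/bin/env bash
# Print the active transparent hugepage mode, i.e. the bracketed word.
# Reads the file given as $1, defaulting to the real sysfs path.
thp_active_mode() {
    sed -n 's/.*\[\([a-z]*\)].*/\1/p' \
        "${1:-/sys/kernel/mm/transparent_hugepage/enabled}"
}

# To switch the running system to "never" (root required, not persistent):
#   echo never > /sys/kernel/mm/transparent_hugepage/enabled
```

The runtime write is lost on reboot. Common ways to make it stick are a boot-time script or the `transparent_hugepage=never` kernel command-line parameter; which one fits depends on the host, and neither was confirmed in this thread.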