Спасибо!
Признаюсь, что PostgreSQL для меня — целый новый мир, и я совершенно не знаком с ним.
Можете подтвердить, что логи хранятся в /var/discourse/shared/standalone/log/var-log/postgres, или мне стоит искать конкретный лог ошибок где-то ещё?
Я посмотрел файл current в этой папке, но не нашёл ничего подозрительного, особенно в то время, которое показывает интерфейс логов Discourse.
Хотя вполне возможно, что я просто не знаю, что именно искать.
редактирование: нашёл это в файле current:
2022-03-12 05:27:51.651 UTC [450818] discourse@discourse LOG: duration: 258.470 ms parse <unnamed>: SELECT 1 AS one FROM "permalinks" WHERE "permalinks"."url" = 'images/emoji/twitter/roll_eyes.png' LIMIT 1
2022-03-12 05:27:52.287 UTC [450818] discourse@discourse LOG: duration: 557.420 ms bind <unnamed>: SELECT 1 AS one FROM "permalinks" WHERE "permalinks"."url" = 'images/emoji/twitter/roll_eyes.png' LIMIT 1
2022-03-12 05:27:52.500 UTC [450818] discourse@discourse LOG: duration: 158.213 ms execute <unnamed>: SELECT 1 AS one FROM "permalinks" WHERE "permalinks"."url" = 'images/emoji/twitter/roll_eyes.png' LIMIT 1
2022-03-12 05:27:53.047 UTC [450818] discourse@discourse LOG: duration: 301.290 ms parse <unnamed>: SELECT 1 AS one FROM "permalinks" WHERE "permalinks"."url" = 'images/emoji/twitter/roll_eyes.png' LIMIT 1
2022-03-12 05:27:56.647 UTC [560] LOG: checkpointer process (PID 180496) was terminated by signal 9: Killed
2022-03-12 05:27:56.650 UTC [560] LOG: terminating any other active server processes
2022-03-12 05:27:56.652 UTC [455579] discourse@discourse WARNING: terminating connection because of crash of another server process
2022-03-12 05:27:56.652 UTC [455579] discourse@discourse DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-03-12 05:27:56.652 UTC [455579] discourse@discourse HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-03-12 05:27:56.652 UTC [455580] discourse@discourse WARNING: terminating connection because of crash of another server process
2022-03-12 05:27:56.652 UTC [455580] discourse@discourse DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-03-12 05:27:56.652 UTC [455580] discourse@discourse HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-03-12 05:27:56.653 UTC [455573] discourse@discourse WARNING: terminating connection because of crash of another server process
2022-03-12 05:27:56.653 UTC [455573] discourse@discourse DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-03-12 05:27:56.653 UTC [455573] discourse@discourse HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-03-12 05:27:56.653 UTC [455560] discourse@discourse WARNING: terminating connection because of crash of another server process
2022-03-12 05:27:56.653 UTC [455560] discourse@discourse DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-03-12 05:27:56.653 UTC [455560] discourse@discourse HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-03-12 05:27:56.653 UTC [455476] discourse@discourse WARNING: terminating connection because of crash of another server process
2022-03-12 05:27:56.653 UTC [455476] discourse@discourse DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-03-12 05:27:56.653 UTC [455476] discourse@discourse HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-03-12 05:27:56.653 UTC [180506] discourse@discourse WARNING: terminating connection because of crash of another server process
2022-03-12 05:27:56.653 UTC [180506] discourse@discourse DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-03-12 05:27:56.653 UTC [180506] discourse@discourse HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-03-12 05:27:56.653 UTC [455477] discourse@discourse WARNING: terminating connection because of crash of another server process
2022-03-12 05:27:56.653 UTC [455477] discourse@discourse DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-03-12 05:27:56.653 UTC [455477] discourse@discourse HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-03-12 05:27:56.653 UTC [180499] WARNING: terminating connection because of crash of another server process
2022-03-12 05:27:56.653 UTC [180499] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-03-12 05:27:56.653 UTC [180499] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-03-12 05:27:56.653 UTC [455581] discourse@discourse WARNING: terminating connection because of crash of another server process
2022-03-12 05:27:56.653 UTC [455581] discourse@discourse DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-03-12 05:27:56.653 UTC [455581] discourse@discourse HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-03-12 05:27:56.653 UTC [455574] discourse@discourse WARNING: terminating connection because of crash of another server process
2022-03-12 05:27:56.653 UTC [455574] discourse@discourse DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-03-12 05:27:56.653 UTC [455574] discourse@discourse HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-03-12 05:27:56.654 UTC [427436] discourse@discourse WARNING: terminating connection because of crash of another server process
2022-03-12 05:27:56.654 UTC [427436] discourse@discourse DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-03-12 05:27:56.654 UTC [427436] discourse@discourse HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-03-12 05:27:56.657 UTC [455341] discourse@discourse WARNING: terminating connection because of crash of another server process
2022-03-12 05:27:56.657 UTC [455341] discourse@discourse DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-03-12 05:27:56.657 UTC [455341] discourse@discourse HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-03-12 05:27:56.661 UTC [455559] discourse@discourse WARNING: terminating connection because of crash of another server process
2022-03-12 05:27:56.661 UTC [455559] discourse@discourse DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-03-12 05:27:56.661 UTC [455559] discourse@discourse HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-03-12 05:27:56.698 UTC [450818] discourse@discourse WARNING: terminating connection because of crash of another server process
2022-03-12 05:27:56.698 UTC [450818] discourse@discourse DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-03-12 05:27:56.698 UTC [450818] discourse@discourse HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-03-12 05:27:56.773 UTC [560] LOG: all server processes terminated; reinitializing
2022-03-12 05:27:56.957 UTC [455632] LOG: database system was interrupted; last known up at 2022-03-12 05:25:19 UTC
2022-03-12 05:27:56.957 UTC [455634] discourse@discourse FATAL: the database system is in recovery mode
2022-03-12 05:27:56.957 UTC [455633] discourse@discourse FATAL: the database system is in recovery mode
2022-03-12 05:27:56.958 UTC [455636] discourse@discourse FATAL: the database system is in recovery mode
2022-03-12 05:27:56.958 UTC [455635] discourse@discourse FATAL: the database system is in recovery mode
2022-03-12 05:27:56.966 UTC [455637] discourse@discourse FATAL: the database system is in recovery mode
2022-03-12 05:27:56.966 UTC [455638] discourse@discourse FATAL: the database system is in recovery mode
2022-03-12 05:27:56.966 UTC [455639] discourse@discourse FATAL: the database system is in recovery mode
2022-03-12 05:27:56.969 UTC [455640] discourse@discourse FATAL: the database system is in recovery mode
2022-03-12 05:27:56.977 UTC [455641] discourse@discourse FATAL: the database system is in recovery mode
2022-03-12 05:27:57.331 UTC [455632] LOG: database system was not properly shut down; automatic recovery in progress
2022-03-12 05:27:57.334 UTC [455632] LOG: redo starts at E/B99AC028
2022-03-12 05:27:57.371 UTC [455632] LOG: invalid record length at E/BA16D680: wanted 24, got 0
2022-03-12 05:27:57.371 UTC [455632] LOG: redo done at E/BA16D640
2022-03-12 05:27:57.467 UTC [560] LOG: database system is ready to accept connections
Я предполагаю, что самые важные строки это:
2022-03-12 05:27:56.647 UTC [560] LOG: checkpointer process (PID 180496) was terminated by signal 9: Killed
2022-03-12 05:27:56.650 UTC [560] LOG: terminating any other active server processes
Активность на форуме:
300 000 постов на перепечку (но ошибка может возникнуть после 150 000 или после 1000 перепечённых постов, это довольно случайно.
Характеристики сервера:
CPX21 от Hetzner: https://www.hetzner.com/cloud-fr
- 3 vCPU
- 4 ГБ ОЗУ
- 80 ГБ диска (50% свободно)
Также я обнаружил, что команда dmesg может показать дополнительную информацию, вот конец её вывода:
[590461.105649] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=2f5e8cd98980f017bab02228d6af1cfbbd0068935aedea9ab3680470befa2030,mems_allowed=0,global_oom,task_memcg=/docker/2f5e8cd98980f017bab02228d6af1cfbbd0068935aedea9ab3680470befa2030,task=postmaster,pid=1135915,uid=105
[590461.105673] Out of memory: Killed process 1135915 (postmaster) total-vm:1172132kB, anon-rss:7244kB, file-rss:0kB, shmem-rss:839588kB, UID:105 pgtables:2104kB oom_score_adj:0
[590461.109673] oom_reaper: reaped process 1135915 (postmaster), now anon-rss:0kB, file-rss:0kB, shmem-rss:839588kB
[592120.454582] Process accounting resumed
[592121.145664] Process accounting resumed
Это хоть чем-то поможет?