Thanks!
I admit that this PostgreSQL is a whole new universe to me and I’m not familiar with it at all.
Can you confirm that logs are stored in /var/discourse/shared/standalone/log/var-log/postgres
or should I look for a specific error log somewhere?
I looked at the current
file in this folder, but couldn’t find anything suspicious, especially at the time shown by Discourse’s interface logs.
But it also is very possible that I don’t know what to look for.
edit : found this in the current
file:
2022-03-12 05:27:51.651 UTC [450818] discourse@discourse LOG: duration: 258.470 ms parse <unnamed>: SELECT 1 AS one FROM "permalinks" WHERE "permalinks"."url" = 'images/emoji/twitter/roll_eyes.png' LIMIT 1
2022-03-12 05:27:52.287 UTC [450818] discourse@discourse LOG: duration: 557.420 ms bind <unnamed>: SELECT 1 AS one FROM "permalinks" WHERE "permalinks"."url" = 'images/emoji/twitter/roll_eyes.png' LIMIT 1
2022-03-12 05:27:52.500 UTC [450818] discourse@discourse LOG: duration: 158.213 ms execute <unnamed>: SELECT 1 AS one FROM "permalinks" WHERE "permalinks"."url" = 'images/emoji/twitter/roll_eyes.png' LIMIT 1
2022-03-12 05:27:53.047 UTC [450818] discourse@discourse LOG: duration: 301.290 ms parse <unnamed>: SELECT 1 AS one FROM "permalinks" WHERE "permalinks"."url" = 'images/emoji/twitter/roll_eyes.png' LIMIT 1
2022-03-12 05:27:56.647 UTC [560] LOG: checkpointer process (PID 180496) was terminated by signal 9: Killed
2022-03-12 05:27:56.650 UTC [560] LOG: terminating any other active server processes
2022-03-12 05:27:56.652 UTC [455579] discourse@discourse WARNING: terminating connection because of crash of another server process
2022-03-12 05:27:56.652 UTC [455579] discourse@discourse DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-03-12 05:27:56.652 UTC [455579] discourse@discourse HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-03-12 05:27:56.652 UTC [455580] discourse@discourse WARNING: terminating connection because of crash of another server process
2022-03-12 05:27:56.652 UTC [455580] discourse@discourse DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-03-12 05:27:56.652 UTC [455580] discourse@discourse HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-03-12 05:27:56.653 UTC [455573] discourse@discourse WARNING: terminating connection because of crash of another server process
2022-03-12 05:27:56.653 UTC [455573] discourse@discourse DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-03-12 05:27:56.653 UTC [455573] discourse@discourse HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-03-12 05:27:56.653 UTC [455560] discourse@discourse WARNING: terminating connection because of crash of another server process
2022-03-12 05:27:56.653 UTC [455560] discourse@discourse DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-03-12 05:27:56.653 UTC [455560] discourse@discourse HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-03-12 05:27:56.653 UTC [455476] discourse@discourse WARNING: terminating connection because of crash of another server process
2022-03-12 05:27:56.653 UTC [455476] discourse@discourse DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-03-12 05:27:56.653 UTC [455476] discourse@discourse HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-03-12 05:27:56.653 UTC [180506] discourse@discourse WARNING: terminating connection because of crash of another server process
2022-03-12 05:27:56.653 UTC [180506] discourse@discourse DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-03-12 05:27:56.653 UTC [180506] discourse@discourse HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-03-12 05:27:56.653 UTC [455477] discourse@discourse WARNING: terminating connection because of crash of another server process
2022-03-12 05:27:56.653 UTC [455477] discourse@discourse DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-03-12 05:27:56.653 UTC [455477] discourse@discourse HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-03-12 05:27:56.653 UTC [180499] WARNING: terminating connection because of crash of another server process
2022-03-12 05:27:56.653 UTC [180499] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-03-12 05:27:56.653 UTC [180499] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-03-12 05:27:56.653 UTC [455581] discourse@discourse WARNING: terminating connection because of crash of another server process
2022-03-12 05:27:56.653 UTC [455581] discourse@discourse DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-03-12 05:27:56.653 UTC [455581] discourse@discourse HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-03-12 05:27:56.653 UTC [455574] discourse@discourse WARNING: terminating connection because of crash of another server process
2022-03-12 05:27:56.653 UTC [455574] discourse@discourse DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-03-12 05:27:56.653 UTC [455574] discourse@discourse HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-03-12 05:27:56.654 UTC [427436] discourse@discourse WARNING: terminating connection because of crash of another server process
2022-03-12 05:27:56.654 UTC [427436] discourse@discourse DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-03-12 05:27:56.654 UTC [427436] discourse@discourse HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-03-12 05:27:56.657 UTC [455341] discourse@discourse WARNING: terminating connection because of crash of another server process
2022-03-12 05:27:56.657 UTC [455341] discourse@discourse DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-03-12 05:27:56.657 UTC [455341] discourse@discourse HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-03-12 05:27:56.661 UTC [455559] discourse@discourse WARNING: terminating connection because of crash of another server process
2022-03-12 05:27:56.661 UTC [455559] discourse@discourse DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-03-12 05:27:56.661 UTC [455559] discourse@discourse HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-03-12 05:27:56.698 UTC [450818] discourse@discourse WARNING: terminating connection because of crash of another server process
2022-03-12 05:27:56.698 UTC [450818] discourse@discourse DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-03-12 05:27:56.698 UTC [450818] discourse@discourse HINT: In a moment you should be able to reconnect to the database and repeat your command.
2022-03-12 05:27:56.773 UTC [560] LOG: all server processes terminated; reinitializing
2022-03-12 05:27:56.957 UTC [455632] LOG: database system was interrupted; last known up at 2022-03-12 05:25:19 UTC
2022-03-12 05:27:56.957 UTC [455634] discourse@discourse FATAL: the database system is in recovery mode
2022-03-12 05:27:56.957 UTC [455633] discourse@discourse FATAL: the database system is in recovery mode
2022-03-12 05:27:56.958 UTC [455636] discourse@discourse FATAL: the database system is in recovery mode
2022-03-12 05:27:56.958 UTC [455635] discourse@discourse FATAL: the database system is in recovery mode
2022-03-12 05:27:56.966 UTC [455637] discourse@discourse FATAL: the database system is in recovery mode
2022-03-12 05:27:56.966 UTC [455638] discourse@discourse FATAL: the database system is in recovery mode
2022-03-12 05:27:56.966 UTC [455639] discourse@discourse FATAL: the database system is in recovery mode
2022-03-12 05:27:56.969 UTC [455640] discourse@discourse FATAL: the database system is in recovery mode
2022-03-12 05:27:56.977 UTC [455641] discourse@discourse FATAL: the database system is in recovery mode
2022-03-12 05:27:57.331 UTC [455632] LOG: database system was not properly shut down; automatic recovery in progress
2022-03-12 05:27:57.334 UTC [455632] LOG: redo starts at E/B99AC028
2022-03-12 05:27:57.371 UTC [455632] LOG: invalid record length at E/BA16D680: wanted 24, got 0
2022-03-12 05:27:57.371 UTC [455632] LOG: redo done at E/BA16D640
2022-03-12 05:27:57.467 UTC [560] LOG: database system is ready to accept connections
I supposed the most important lines are:
2022-03-12 05:27:56.647 UTC [560] LOG: checkpointer process (PID 180496) was terminated by signal 9: Killed
2022-03-12 05:27:56.650 UTC [560] LOG: terminating any other active server processes
Forum activity:
300000 posts to rebake (but the error can happens after 150000 or after 1000 rebaked posts, it’s kind of random.
Server’s specs:
CPX21 from Hetzner: https://www.hetzner.com/cloud-fr
- 3 vCPU
- 4 GB RAM
- 80 GB disk (50% free)
Also, I found that dmesg
command can shows additional info, here’s the end of this command:
[590461.105649] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=2f5e8cd98980f017bab02228d6af1cfbbd0068935aedea9ab3680470befa2030,mems_allowed=0,global_oom,task_memcg=/docker/2f5e8cd98980f017bab02228d6af1cfbbd0068935aedea9ab3680470befa2030,task=postmaster,pid=1135915,uid=105
[590461.105673] Out of memory: Killed process 1135915 (postmaster) total-vm:1172132kB, anon-rss:7244kB, file-rss:0kB, shmem-rss:839588kB, UID:105 pgtables:2104kB oom_score_adj:0
[590461.109673] oom_reaper: reaped process 1135915 (postmaster), now anon-rss:0kB, file-rss:0kB, shmem-rss:839588kB
[592120.454582] Process accounting resumed
[592121.145664] Process accounting resumed
Is this of any help?