Discourse Randomly Does Not Run or Rebuild

Out of nowhere, Discourse no longer runs and does not even rebuild with ./launcher rebuild app. I commented out all plugins too.

Here are the logs from when I try to start it: https://codefile.io/f/8XUuOqyEDd

Here are the logs when I use ./launcher rebuild app. I see something about “failed listening on port 6379 (TCP) aborting” but I have nothing running on that port!

https://codefile.io/f/zxCBRzEOA9

I don’t think the port 6379 message is related to your issue. This warning often (always?) appears during a rebuild.

error: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: Connection refused

I think your issue more likely comes from this.

This might give clues:

Would this be what keeps it from working when I just run it (without doing any ./launcher rebuild app)?

I stopped all other services on my server and updated to the latest Ubuntu LTS and it still shows this:

PG::ConnectionBad: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: Connection refused (PG::ConnectionBad)
        Is the server running locally and accepting connections on that socket?

which is what I would think the error is.

Swapping the templates to 13 and even 15, as shown in the referenced post, did not solve the issue.

Caused by:
PG::ConnectionBad: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: No such file or directory (PG::ConnectionBad)
Is the server running locally and accepting connections on that socket?

timeout: down: postgres: 1s, normally up, want up

Seems like the database isn’t starting up correctly. The logs show it appears to occasionally start up properly, but only for a short time, so that could be a red herring.

ok: run: postgres: (pid 315501) 0s

The postgres logs could have some hint of the problem, especially when trying to start the app container.

tail -f shared/standalone/log/var-log/postgres/current
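If the log is long, it can help to collapse repeated errors before reading it line by line. The following is only a sketch: `pg_fatal_summary` is a hypothetical helper name, and it assumes the log lines look like the ones Postgres normally writes (timestamp, PID in brackets, then `FATAL:` and the message).

```shell
#!/bin/sh
# Hypothetical helper: count each distinct FATAL message in a Postgres log.
# Assumed line layout: DATE TIME TZ [PID] FATAL: message
pg_fatal_summary() {
    log_file="$1"
    # Strip everything up to and including "FATAL:" and the spaces after it,
    # then count how often each remaining message occurs, most frequent first.
    grep 'FATAL:' "$log_file" \
        | sed 's/^.*FATAL:[[:space:]]*//' \
        | sort | uniq -c | sort -rn
}

# Usage, run from /var/discourse (path taken from the tail command above):
# pg_fatal_summary shared/standalone/log/var-log/postgres/current
```

A single message repeated every second usually means Postgres is being restarted in a loop and failing the same way each time.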

Did you do the PostgreSQL 15 update?

I too think it’s about an unclean shutdown. If you’ve got a backup, I would spin up a new VM and restore it there. You can follow “Move a Discourse site to another VPS with rsync” and exclude postgres_*.

The alternative, which is your only option if you don’t have a backup, will be to figure out a bunch of stuff about postgres that you don’t want to learn about.
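For the rsync route mentioned above, the copy could look roughly like this. It is a sketch, not the command from the linked guide: `sync_shared_without_db` is a hypothetical wrapper, the destination is whatever new host you set up, and only the `postgres_*` exclusion comes from this thread.

```shell
#!/bin/sh
# Sketch: copy the Discourse shared directory but skip every postgres_* entry,
# so the possibly damaged database files are left behind.
sync_shared_without_db() {
    src="$1"     # e.g. /var/discourse/shared/standalone
    dest="$2"    # hypothetical, e.g. root@new-host:/var/discourse/shared/standalone/
    # -a keeps permissions and timestamps (and ownership when run as root).
    # The trailing slash on the source copies its contents, not the directory itself.
    rsync -a --exclude 'postgres_*' "${src%/}/" "$dest"
}

# Usage (destination host is an assumption):
# sync_shared_without_db /var/discourse/shared/standalone root@new-host:/var/discourse/shared/standalone/
```

The backups directory is copied along with everything else, so the restore on the new machine can start from the newest backup file.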

How can I access my backups if my forum is down (as in, I cannot go to the admin settings and download a backup)?

I also did not try to migrate anything. I have been using it as normal and updating via the web UI. Why would the database have had an unclean shutdown?

I will provide the Postgres logs, one second

2025-03-22 00:30:44.110 UTC [4922] FATAL: lock file "postmaster.pid" is empty
2025-03-22 00:30:44.110 UTC [4922] HINT: Either another server is starting, or the lock file is the remnant of a previous server startup crash.
2025-03-22 00:30:45.127 UTC [4964] FATAL: lock file "postmaster.pid" is empty
2025-03-22 00:30:45.127 UTC [4964] HINT: Either another server is starting, or the lock file is the remnant of a previous server startup crash.
2025-03-22 00:30:46.151 UTC [4966] FATAL: lock file "postmaster.pid" is empty
2025-03-22 00:30:46.151 UTC [4966] HINT: Either another server is starting, or the lock file is the remnant of a previous server startup crash.
2025-03-22 00:30:47.168 UTC [4970] FATAL: lock file "postmaster.pid" is empty
2025-03-22 00:30:47.168 UTC [4970] HINT: Either another server is starting, or the lock file is the remnant of a previous server startup crash.
2025-03-22 00:30:48.192 UTC [4977] FATAL: lock file "postmaster.pid" is empty
2025-03-22 00:30:48.192 UTC [4977] HINT: Either another server is starting, or the lock file is the remnant of a previous server startup crash.

-rw------- 1 syslog kvm 0 Mar 18 19:48 /var/discourse/shared/standalone/postgres_data/postmaster.pid

This is where my lockfile is

They are in /var/discourse/shared/standalone/backups/default

If you follow the rsync instructions I linked earlier, you’ll get them.
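Since the backups are plain files on the host, they can be listed from a shell even while the forum itself is down. A small sketch, where `latest_backup` is a hypothetical helper name and the directory is the one given above:

```shell
#!/bin/sh
# Hypothetical helper: print the name of the most recently modified file
# in a backup directory, or nothing if the directory is missing or empty.
latest_backup() {
    backup_dir="$1"
    # ls -t sorts by modification time, newest first; head keeps the first entry.
    ls -t "$backup_dir" 2>/dev/null | head -n 1
}

# Usage on the host, outside the container:
# latest_backup /var/discourse/shared/standalone/backups/default
```

If this prints nothing, there is no backup in that directory and the restore route is not available.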

It crashed, or the server rebooted, or something. Things happen.

The database is “migrated” from one set of tables (tables get added and changed) to another on most upgrades.

You might try to stop the container and delete that lock file.

And look in PG_VERSION to see what version you have, since I think you tried changing the template.
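Both checks can be done without changing anything. The sketch below only reads the data directory; `inspect_pg_data` is a hypothetical helper name, and the path in the usage line is the one from the `ls` output earlier in the thread.

```shell
#!/bin/sh
# Hypothetical read-only check of a Postgres data directory:
# reports the stored major version and the state of the lock file.
inspect_pg_data() {
    data_dir="$1"
    if [ -f "$data_dir/PG_VERSION" ]; then
        echo "PG_VERSION: $(cat "$data_dir/PG_VERSION")"
    else
        echo "PG_VERSION: not found"
    fi
    if [ ! -e "$data_dir/postmaster.pid" ]; then
        echo "postmaster.pid: absent"
    elif [ -s "$data_dir/postmaster.pid" ]; then
        # -s is true only for files larger than zero bytes.
        echo "postmaster.pid: present, non-empty"
    else
        echo "postmaster.pid: present, empty"
    fi
}

# Usage:
# inspect_pg_data /var/discourse/shared/standalone/postgres_data
```

The version printed should match the Postgres template the container is configured with; if the template was swapped while troubleshooting, switch it back to the matching one before rebuilding.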

Yes, I did try to change after I saw the error.

So, would I run rm /var/discourse/shared/standalone/postgres_data/postmaster.pid to delete the lockfile and then try to rebuild?

Also thank you for helping me with this


I would do this command to delete the lockfile?

rm /var/discourse/shared/standalone/postgres_data/postmaster.pid was the solution, thank you!
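For anyone who lands here with the same "lock file is empty" error, a slightly more cautious version of that fix is sketched below. `remove_empty_pid_lock` is a hypothetical helper, not part of Discourse or Postgres. It deletes the lock file only when it is zero bytes, as in this thread, and should only be run after the container has been stopped, because a non-empty postmaster.pid can belong to a running server.

```shell
#!/bin/sh
# Hypothetical helper: delete postmaster.pid only when it is a zero-byte file.
# Run it only after the container is stopped.
remove_empty_pid_lock() {
    pid_file="$1/postmaster.pid"
    if [ ! -e "$pid_file" ]; then
        echo "no lock file at $pid_file"
        return 0
    fi
    if [ -s "$pid_file" ]; then
        # A non-empty lock file may belong to a live server; do not touch it.
        echo "lock file is not empty; leaving it alone" >&2
        return 1
    fi
    rm -- "$pid_file"
    echo "removed $pid_file"
}

# Sequence discussed in this thread, run from /var/discourse:
# ./launcher stop app
# remove_empty_pid_lock /var/discourse/shared/standalone/postgres_data
# ./launcher rebuild app
```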
