FATAL: lock file "postmaster.pid" is empty

This is the first time I have seen the following error when trying to rebuild:

2022-10-04 14:39:49.780 UTC [1700] FATAL:  lock file "postmaster.pid" is empty
2022-10-04 14:39:49.780 UTC [1700] HINT:  Either another server is starting, or the lock file is the remnant of a previous server startup crash.

I can obviously read the hint, but I'm not sure how to proceed. Can anyone offer insight?

When is this happening? Is this a standard install?

It's a standard install, and the error appears after executing ./launcher rebuild app.

Maybe try a

 ./launcher start app

Did this work before?

Is it an error or a warning? Did you try opening the site in your browser?

What does

 docker ps

say?

Response to ./launcher start app:

57c2a0746e93
Nothing to do, your container has already started!

And then in the browser I get 502 Bad Gateway.

docker ps output

CONTAINER ID   IMAGE                 COMMAND        CREATED        STATUS          PORTS                                                                                                                 NAMES
57c2a0746e93   local_discourse/app   "/sbin/boot"   6 months ago   Up 16 minutes   0.0.0.0:80->80/tcp, :::80->80/tcp, 0.0.0.0:443->443/tcp, :::443->443/tcp, 0.0.0.0:5432->5432/tcp, :::5432->5432/tcp   app

That’s odd. I think I might reboot and rebuild again.

Or maybe

   ./launcher stop app; ./launcher rebuild app

You’re running an old container, not one that you just built (created 6 months ago).

And maybe there were some other errors in the rebuild that you didn’t notice.

Same result:

2022-10-04 15:26:43.452 UTC [1699] FATAL:  lock file "postmaster.pid" is empty
2022-10-04 15:26:43.452 UTC [1699] HINT:  Either another server is starting, or the lock file is the remnant of a previous server startup crash.

There is not enough data here to debug.

This is happening because the build process thinks PG is already running, so maybe something in the PG upgrade process has gone astray. Can you include the full launcher logs (with passwords scrubbed out) so we can see what is going on?
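In case it helps with collecting those logs, here is a rough sketch of one way to capture the rebuild output and blank out password values before posting. The scrub function and the log file names are my own invention, not part of launcher, and the pattern only catches upper-case keys containing PASSWORD, so please still read through the result by hand before sharing it.

```shell
#!/bin/sh
# scrub is a hypothetical helper, not part of launcher. It replaces the
# value after any key containing "PASSWORD" (KEY=value or KEY: value)
# with the word REDACTED. Values containing spaces are only partly covered.
scrub() {
    sed -E 's/(PASSWORD[A-Za-z_]*[=:] *)[^ ]+/\1REDACTED/g'
}

# Usage, run from /var/discourse (file names are just examples):
#   ./launcher rebuild app 2>&1 | tee rebuild-full.log
#   scrub < rebuild-full.log > rebuild-scrubbed.log
```

Other secrets (API keys, SMTP user names and so on) are not touched by this, so those would need checking separately.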

Maybe it will help to look at a system which is working correctly. I see my lockfile here:

# ls -l /var/discourse/shared/standalone/postgres_data/postmaster.pid
-rw------- 1 systemd-resolve input 92 Nov 15 16:20 /var/discourse/shared/standalone/postgres_data/postmaster.pid

and Nov 15 is the date I last started the app. If I enter the app I can see the postgres processes:

# cd /var/discourse/
# ./launcher enter app
x86_64 arch detected.
# ps auxfc|egrep -1 postm
root        45  0.0  0.0   2332     0 ?        S    Nov15   0:00      \_ svlogd
postgres    48  0.0  0.1 213160  1784 ?        S    Nov15   0:27      \_ postmaster
postgres    67  0.0  2.6 213380 26924 ?        Ss   Nov15   0:34          \_ postmaster
postgres    68  0.0  0.4 213292  4236 ?        Ss   Nov15   0:15          \_ postmaster
postgres    69  0.0  0.1 213160  1068 ?        Ss   Nov15   3:44          \_ postmaster
postgres    70  0.0  0.1 213840  1520 ?        Ss   Nov15   0:16          \_ postmaster
postgres    71  0.0  0.0  68184   380 ?        Ss   Nov15   0:56          \_ postmaster
postgres    72  0.0  0.0 213716   468 ?        Ss   Nov15   0:00          \_ postmaster
postgres    92  0.0  0.0 225364   324 ?        Ss   Nov15   0:01          \_ postmaster
postgres   176  0.0  0.1 217944  1484 ?        Ss   Nov15   0:01          \_ postmaster
postgres  9126  0.0  0.7 215052  7336 ?        Ss   Nov16   0:19          \_ postmaster
postgres  1574  0.0  5.7 223540 58300 ?        Ss   17:28   0:00          \_ postmaster
postgres  1973  0.0  3.3 221032 33960 ?        Ss   17:34   0:00          \_ postmaster
postgres  2320  0.1  3.5 218080 36120 ?        Ss   17:39   0:00          \_ postmaster
postgres  2321  0.1  2.9 218068 29928 ?        Ss   17:39   0:00          \_ postmaster
postgres  2336  0.0  1.4 215052 14340 ?        Ss   17:40   0:00          \_ postmaster
# exit

If I stopped the app, I’d expect to see no lockfile at that location, and no postgres processes running. (I would, of course, need to run the ps command directly on the host, because the container would no longer be running.)

In your situation, I think that’s what I’d do first: stop the app and check that no postgres processes are running. It seems possible you have two instances running which are colliding with one another.

It’s unlikely, but it’s also possible that the disk has filled up and that’s why the lockfile is empty. Or perhaps there’s a permissions problem somehow.
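The checks above could be bundled into something like the following, run as root on the host after ./launcher stop app. This is only a sketch: lockfile_state is a made-up helper, and the default path is simply the standalone location from my system, so adjust PIDFILE if yours differs.

```shell
#!/bin/sh
# Sketch only: run as root on the host after "./launcher stop app".
# PIDFILE defaults to the standalone path shown earlier in this thread;
# override it in the environment if your install differs.
PIDFILE="${PIDFILE:-/var/discourse/shared/standalone/postgres_data/postmaster.pid}"

# lockfile_state is a hypothetical helper, not part of launcher.
# Prints "missing", "empty", or "present" for the given path.
lockfile_state() {
    if [ ! -e "$1" ]; then
        echo "missing"
    elif [ ! -s "$1" ]; then
        echo "empty"
    else
        echo "present"
    fi
}

echo "lock file: $(lockfile_state "$PIDFILE")"

# Any leftover postgres processes? (matches on user or command name)
ps -eo pid,user,comm | grep -E 'postmaster|postgres' || echo "no postgres processes found"

# Free space on the filesystem holding the data directory
# (falls back to / if that directory does not exist).
df -h "$(dirname "$PIDFILE")" 2>/dev/null || df -h /
```

With the app stopped I would expect "missing" and no postgres processes. An "empty" result alongside a full disk, or processes that are still running, would each point at a different cause.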

Edit: inside the container, the lockfile has a different location and ownership:

# ./launcher enter app
x86_64 arch detected.
# ls -l /shared/postgres_data/postmaster.pid
-rw------- 1 postgres postgres 92 Nov 15 16:20 /shared/postgres_data/postmaster.pid
# exit
logout
# 

As Sam notes, we’d need to see more information.
