Cannot rebuild following site failure: "postgres already running"


(Dave Shaw) #1

My Discourse instance crashed, so I tried rebuilding it, and I am getting the following error message. I was on 1.2.0beta4 before it went down.

I’ve rebooted the VM twice and I am still getting the same problem.

Any ideas?

The full output of a rebuild now is:

/var/discourse$ sudo ./launcher rebuild app
WARNING: No swap limit support
Updating discourse docker
Already up-to-date.
Stopping old container
Error response from daemon: No such container: 8fc10a0a1c682060ef7ef3ff06945abc1bc9cd7e88336299eb8975e652bd9890
2015/01/15 23:15:53 Error: failed to stop one or more containers
Calculated ENV: -e LANG=en_US.UTF-8 -e RAILS_ENV=production -e UNICORN_WORKERS=2 -e UNICORN_SIDEKIQS=1 -e RUBY_GC_MALLOC_LIMIT=40000000 -e RUBY_HEAP_MIN_SLOTS=800000 -e DISCOURSE_DB_SOCKET=/var/run/postgresql -e DISCOURSE_DB_HOST= -e DISCOURSE_DB_PORT= -e HOME=/root -e DISCOURSE_DEVELOPER_EMAILS=hidden@example.org -e DISCOURSE_HOSTNAME=tolguild.co.uk -e DISCOURSE_SMTP_ADDRESS=smtp.mandrillapp.com -e DISCOURSE_SMTP_PORT=587 -e DISCOURSE_SMTP_USER_NAME=hidden@example.org -e DISCOURSE_SMTP_PASSWORD=hidden
cd /pups && git pull && /pups/bin/pups --stdin
Already up-to-date.
I, [2015-01-15T23:15:59.292757 #42]  INFO -- : Loading --stdin
I, [2015-01-15T23:15:59.305246 #42]  INFO -- : > mkdir -p /shared/postgres_run
I, [2015-01-15T23:15:59.307805 #42]  INFO -- :
I, [2015-01-15T23:15:59.312513 #42]  INFO -- : > chown postgres:postgres /shared/postgres_run
I, [2015-01-15T23:15:59.314954 #42]  INFO -- :
I, [2015-01-15T23:15:59.315604 #42]  INFO -- : > chmod 775 /shared/postgres_run
I, [2015-01-15T23:15:59.317583 #42]  INFO -- :
I, [2015-01-15T23:15:59.322184 #42]  INFO -- : > rm -fr /var/run/postgresql
I, [2015-01-15T23:15:59.324303 #42]  INFO -- :
I, [2015-01-15T23:15:59.325098 #42]  INFO -- : > ln -s /shared/postgres_run /var/run/postgresql
I, [2015-01-15T23:15:59.326968 #42]  INFO -- :
I, [2015-01-15T23:15:59.327703 #42]  INFO -- : > socat /dev/null UNIX-CONNECT:/shared/postgres_run/.s.PGSQL.5432 || exit 0 && echo postgres already running stop container ; exit 1
I, [2015-01-15T23:15:59.336875 #42]  INFO -- : postgres already running stop container



FAILED
--------------------
RuntimeError: socat /dev/null UNIX-CONNECT:/shared/postgres_run/.s.PGSQL.5432 || exit 0 && echo postgres already running stop container ; exit 1 failed with return #
Location of failure: /pups/lib/pups/exec_command.rb:105:in `spawn'
exec failed with the params "socat /dev/null UNIX-CONNECT:/shared/postgres_run/.s.PGSQL.5432 || exit 0 && echo postgres already running stop container ; exit 1"
fe8f9c6922355ce819ce32e35b11f0f87cced5cb7ffaff6f115a2cfc9f0e4608
FAILED TO BOOTSTRAP

Cheers,
Dave


(Dave Shaw) #2

As per another topic, docker ps shows the following:

/$ docker ps
CONTAINER ID        IMAGE                        COMMAND             CREATED             STATUS              PORTS                                                            NAMES
570f65df6a38        local_discourse/app:latest   "/sbin/runit"       3 months ago        Up 25 hours         0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp, 0.0.0.0:2222->22/tcp   determined_pike

So I ran docker kill 570f65df6a38.

It now appears to be rebuilding…
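
For anyone landing here later, a rough sketch of the same step without copying the ID by hand. It assumes the container was built by the launcher, so its IMAGE column starts with local_discourse/ as in the listing above. The function name is made up for illustration.

```shell
# Hypothetical helper: read `docker ps` output on stdin and print the ID of
# every container whose IMAGE column starts with "local_discourse/".
# Assumes the default column layout (container ID first, image second).
discourse_container_ids() {
  awk 'NR > 1 && $2 ~ /^local_discourse\// { print $1 }'
}

# Tried against the listing above; this prints 570f65df6a38.
discourse_container_ids <<'EOF'
CONTAINER ID        IMAGE                        COMMAND             CREATED             STATUS              PORTS                                                            NAMES
570f65df6a38        local_discourse/app:latest   "/sbin/runit"       3 months ago        Up 25 hours         0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp, 0.0.0.0:2222->22/tcp   determined_pike
EOF
```

On a live host the input would come from Docker itself, for example docker ps | discourse_container_ids | xargs -r docker stop (the -r flag is a GNU xargs extension). docker stop asks the container to shut down before killing it, which is gentler on Postgres than docker kill.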


(Jeff Atwood) #3

Yes, the relevant line is

2015/01/15 23:15:53 Error: failed to stop one or more containers

It looks like Docker was having trouble stopping your container.

Definitely update Docker to latest if it is not already at latest.
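
In case it helps to see why a container that did not stop shows up as "postgres already running": the bootstrap step in the log tries to connect to the Postgres socket under /shared/postgres_run, and as far as I can tell that directory is shared with the old container, so a Postgres still running there answers the probe. Below is a minimal sketch of the control flow only, with the socat probe swapped for an arbitrary command so it can be tried without Docker. The function name is invented for the example.

```shell
# Hypothetical stand-in for the bootstrap check. "$@" plays the role of
#   socat /dev/null UNIX-CONNECT:/shared/postgres_run/.s.PGSQL.5432
# which succeeds only when something is listening on the socket.
# The body is a subshell so that `exit` ends the check, not the caller.
postgres_guard() (
  # Probe fails    -> nothing is listening -> exit 0, bootstrap may continue.
  # Probe succeeds -> a server is up       -> print the message, exit 1.
  "$@" || exit 0 && echo "postgres already running stop container"; exit 1
)

# `false` simulates "no Postgres on the socket", `true` simulates one running.
postgres_guard false && echo "probe failed: guard passes"
postgres_guard true  || echo "probe succeeded: guard fails with status $?"
```

The second call prints the same message as the failing rebuild and returns 1, which is what makes the bootstrap abort.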


(Dave Shaw) #4

Thanks Jeff.

After killing the docker process, the rebuild still failed, so I ran the following:

sudo apt-get update && sudo apt-get dist-upgrade

I got that from this thread here, and this time the rebuild succeeded.

Thanks again for your help. This Linux / Docker / Ruby / Postgres setup is well out of my comfort zone, but you guys are really helpful, so I keep struggling on and learning all the way :smile:.


(Jeff Atwood) #5

No, you need to update Docker itself. There is another command to force Docker to update to latest. I believe it is apt-get lxc-docker or similar. There is at least one howto here on updating Docker.


(Dave Shaw) #6

Ah, I didn’t know :blush:.

git pull says everything is up to date, so I guess whatever I’ve done has updated everything.


(Jeff Atwood) #7

What does docker --version say? Should be 1.4.1 or later I believe.
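
A small sketch for doing that check from a script. It assumes GNU sort (for -V) and output of the form "Docker version X.Y.Z, build ..."; both function names are made up here.

```shell
# Hypothetical helper: extract "X.Y.Z" from `docker --version` output on stdin.
docker_version_number() {
  sed -n 's/^Docker version \([0-9][0-9.]*\).*/\1/p'
}

# Succeeds when version $1 is at least version $2.
# Relies on GNU `sort -V`, which orders 1.4.1 before 1.10.0.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Only runs the comparison when a docker binary is actually on the PATH.
if command -v docker >/dev/null 2>&1; then
  have=$(docker --version | docker_version_number)
  if version_at_least "$have" 1.4.1; then
    echo "Docker $have is recent enough"
  else
    echo "Docker $have is older than 1.4.1"
  fi
fi
```

Comparing with sort -V rather than plain string comparison avoids treating 1.10.0 as older than 1.4.1.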


(Dave Shaw) #8

Mine reports:

Docker version 1.2.0, build fa7b24f

I used this post by Sam for the “how to update”.

I’ve found a few threads (here and here) on the net, but if you could provide the best approach it would be appreciated.


(Jeff Atwood) #9

As mentioned in a number of topics here

apt-get update
apt-get dist-upgrade lxc-docker

Your Docker is out of date.

Then rebuild.


(Dave Shaw) #10

OK that’s done it:

Docker version 1.4.1, build 5bc2ff8

Cheers Jeff.

I guess I need to keep more up to date than I have been.


(oblio) #11

I’m hitting this with Docker 1.7.1 and the latest version of Discourse. I want to rebuild because I get this in the performance reports thread:

Report is only available in latest image, please run:
cd /var/discourse
./launcher rebuild app

The error is:


I, [2015-09-16T13:19:11.708621 #39]  INFO -- : > socat /dev/null UNIX-CONNECT:/shared/postgres_run/.s.PGSQL.5432 || exit 0 && echo postgres already running stop container ; exit 1
I, [2015-09-16T13:19:11.721860 #39]  INFO -- : postgres already running stop container



FAILED
--------------------
RuntimeError: socat /dev/null UNIX-CONNECT:/shared/postgres_run/.s.PGSQL.5432 || exit 0 && echo postgres already running stop container ; exit 1 failed with return #<Process::Status: pid 46 exit 1>
Location of failure: /pups/lib/pups/exec_command.rb:105:in `spawn'
exec failed with the params "socat /dev/null UNIX-CONNECT:/shared/postgres_run/.s.PGSQL.5432 || exit 0 && echo postgres already running stop container ; exit 1"
323124ae8ceb4b35d3e926876f605fa71d5e5fe308a1ab18b9d1f8072b3fff6b
** FAILED TO BOOTSTRAP ** please scroll up and look for earlier error messages, there may be more than one

The forum itself is working but I’d like to have the performance reports as well…


(Sam Saffron) #12

What does

docker ps

return?


(oblio) #13

CONTAINER ID        IMAGE                 COMMAND             CREATED             STATUS              PORTS                                        NAMES
c05bf6a1b8fa        local_discourse/app   "/sbin/boot"        5 weeks ago         Up 22 hours         0.0.0.0:443->443/tcp, 0.0.0.0:2222->22/tcp   app

(Sam Saffron) #14

Stop that container and then rebuild


(oblio) #15

I found the root cause. I had renamed the container to “standalone”, but apparently something went wrong, and even though Discourse worked, some bits and pieces were still looking for “app”.

After a Docker restart and an OS restart, the forum was running as “app”, even though the container file was “standalone.yml”.

I have no idea how this could have happened. Isn’t the container name just the yml file name?

Can I force the execution of the performance report or do I have to wait for cron to trigger it?
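
On the naming question, here is a rough way to check for that kind of mismatch. It rests on an assumption that is not confirmed anywhere in this thread: that ./launcher rebuild NAME reads containers/NAME.yml and acts on a container called NAME. The function names are invented for the example.

```shell
# Hypothetical helpers; the names are illustrative only.

# Name the launcher is ASSUMED to give the container for a config file.
config_name() {
  basename "$1" .yml
}

# Read `docker ps` output on stdin; succeed if some running container has
# $1 in its NAMES column (the last column of the default layout).
is_running_as() {
  awk -v want="$1" 'NR > 1 && $NF == want { found = 1 } END { exit !found }'
}

# Listing copied from post #13.
listing='CONTAINER ID        IMAGE                 COMMAND             CREATED             STATUS              PORTS                                        NAMES
c05bf6a1b8fa        local_discourse/app   "/sbin/boot"        5 weeks ago         Up 22 hours         0.0.0.0:443->443/tcp, 0.0.0.0:2222->22/tcp   app'

for cfg in containers/app.yml containers/standalone.yml; do
  name=$(config_name "$cfg")
  if printf '%s\n' "$listing" | is_running_as "$name"; then
    echo "$name: running"
  else
    echo "$name: no running container with this name"
  fi
done
```

With that listing, app is reported as running and standalone is not. If the rebuild was run against a name other than app, that would be consistent with the old container never being stopped and still holding the Postgres socket.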