Postgres Errors on Rebuild


(Thomas Purchas) #1

When attempting to rebuild a discourse instance it always fails with a FAILED TO BOOTSTRAP error that seems to be linked to Postgres. Looking at the logs, and playing Postgres in the containers it seems that Postgres is taking forever to startup and shutdown, and thus the SQL upgrade commands are executed before Postgres has had time to start properly.

I can’t figure out how to get Postgres to behave, I’ve tried manually shutting it down before the stopping the docker container to ensure that I shuts down properly, but that still doesn’t help :frowning:

Ensuring discourse docker is up to date
Fetching origin
Discourse Docker is up-to-date
Stopping old container
+ /usr/bin/docker stop -t 10 app
app
cd /pups && git pull && /pups/bin/pups --stdin
Already up-to-date.
I, [2015-09-16T19:45:54.861281 #37]  INFO -- : Loading --stdin
I, [2015-09-16T19:45:54.889322 #37]  INFO -- : > mkdir -p /shared/postgres_run
I, [2015-09-16T19:45:54.909013 #37]  INFO -- : 
I, [2015-09-16T19:45:54.910480 #37]  INFO -- : > chown postgres:postgres /shared/postgres_run
I, [2015-09-16T19:45:54.926713 #37]  INFO -- : 
I, [2015-09-16T19:45:54.928089 #37]  INFO -- : > chmod 775 /shared/postgres_run
I, [2015-09-16T19:45:54.948182 #37]  INFO -- : 
I, [2015-09-16T19:45:54.949350 #37]  INFO -- : > rm -fr /var/run/postgresql
I, [2015-09-16T19:45:54.960375 #37]  INFO -- : 
I, [2015-09-16T19:45:54.961578 #37]  INFO -- : > ln -s /shared/postgres_run /var/run/postgresql
I, [2015-09-16T19:45:54.974970 #37]  INFO -- : 
I, [2015-09-16T19:45:54.976072 #37]  INFO -- : > socat /dev/null UNIX-CONNECT:/shared/postgres_run/.s.PGSQL.5432 || exit 0 && echo postgres already running stop container ; exit 1
2015/09/16 19:45:55 socat[45] E connect(4, AF=1 "/shared/postgres_run/.s.PGSQL.5432", 36): Connection refused
I, [2015-09-16T19:45:55.072442 #37]  INFO -- : 
I, [2015-09-16T19:45:55.073452 #37]  INFO -- : > rm -fr /shared/postgres_run/.s*
I, [2015-09-16T19:45:55.079187 #37]  INFO -- : 
I, [2015-09-16T19:45:55.080058 #37]  INFO -- : > rm -fr /shared/postgres_run/*.pid
I, [2015-09-16T19:45:55.085358 #37]  INFO -- : 
I, [2015-09-16T19:45:55.086623 #37]  INFO -- : > mkdir -p /shared/postgres_run/9.3-main.pg_stat_tmp
I, [2015-09-16T19:45:55.090479 #37]  INFO -- : 
I, [2015-09-16T19:45:55.091570 #37]  INFO -- : > chown postgres:postgres /shared/postgres_run/9.3-main.pg_stat_tmp
I, [2015-09-16T19:45:55.095324 #37]  INFO -- : 
I, [2015-09-16T19:45:55.104215 #37]  INFO -- : File > /etc/service/postgres/run  chmod: +x
I, [2015-09-16T19:45:55.113364 #37]  INFO -- : File > /etc/runit/3.d/99-postgres  chmod: +x
I, [2015-09-16T19:45:55.122626 #37]  INFO -- : File > /root/upgrade_postgres  chmod: +x
I, [2015-09-16T19:45:55.125918 #37]  INFO -- : > chown -R root /var/lib/postgresql/9.3/main
I, [2015-09-16T19:45:55.151941 #37]  INFO -- : 
I, [2015-09-16T19:45:55.152848 #37]  INFO -- : > [ ! -e /shared/postgres_data ] && install -d -m 0755 -o postgres -g postgres /shared/postgres_data && sudo -E -u postgres /usr/lib/postgresql/9.3/bin/initdb -D /shared/postgres_data || exit 0
I, [2015-09-16T19:45:55.157108 #37]  INFO -- : 
I, [2015-09-16T19:45:55.157601 #37]  INFO -- : > chown -R postgres:postgres /shared/postgres_data
I, [2015-09-16T19:45:55.192969 #37]  INFO -- : 
I, [2015-09-16T19:45:55.193638 #37]  INFO -- : > chown -R postgres:postgres /var/run/postgresql
I, [2015-09-16T19:45:55.198057 #37]  INFO -- : 
I, [2015-09-16T19:45:55.199366 #37]  INFO -- : > /root/upgrade_postgres
I, [2015-09-16T19:45:55.208457 #37]  INFO -- : 
I, [2015-09-16T19:45:55.209707 #37]  INFO -- : > rm /root/upgrade_postgres
I, [2015-09-16T19:45:55.213369 #37]  INFO -- : 
I, [2015-09-16T19:45:55.224295 #37]  INFO -- : Replacing data_directory = '/var/lib/postgresql/9.3/main' with data_directory = '/shared/postgres_data' in /etc/postgresql/9.3/main/postgresql.conf
I, [2015-09-16T19:45:55.225981 #37]  INFO -- : Replacing (?-mix:#?listen_addresses *=.*) with listen_addresses = '*' in /etc/postgresql/9.3/main/postgresql.conf
I, [2015-09-16T19:45:55.231532 #37]  INFO -- : Replacing (?-mix:#?synchronous_commit *=.*) with synchronous_commit = $db_synchronous_commit in /etc/postgresql/9.3/main/postgresql.conf
I, [2015-09-16T19:45:55.233514 #37]  INFO -- : Replacing (?-mix:#?shared_buffers *=.*) with shared_buffers = $db_shared_buffers in /etc/postgresql/9.3/main/postgresql.conf
I, [2015-09-16T19:45:55.236103 #37]  INFO -- : Replacing (?-mix:#?work_mem *=.*) with work_mem = $db_work_mem in /etc/postgresql/9.3/main/postgresql.conf
I, [2015-09-16T19:45:55.238310 #37]  INFO -- : Replacing (?-mix:#?default_text_search_config *=.*) with default_text_search_config = '$db_default_text_search_config' in /etc/postgresql/9.3/main/postgresql.conf
I, [2015-09-16T19:45:55.241234 #37]  INFO -- : > install -d -m 0755 -o postgres -g postgres /shared/postgres_backup
I, [2015-09-16T19:45:55.345916 #37]  INFO -- : 
I, [2015-09-16T19:45:55.346424 #37]  INFO -- : Replacing (?-mix:#?max_wal_senders *=.*) with max_wal_senders = $db_max_wal_senders in /etc/postgresql/9.3/main/postgresql.conf
I, [2015-09-16T19:45:55.347808 #37]  INFO -- : Replacing (?-mix:#?wal_level *=.*) with wal_level = $db_wal_level in /etc/postgresql/9.3/main/postgresql.conf
I, [2015-09-16T19:45:55.349858 #37]  INFO -- : Replacing (?-mix:#?checkpoint_segments *=.*) with checkpoint_segments = $db_checkpoint_segments in /etc/postgresql/9.3/main/postgresql.conf
I, [2015-09-16T19:45:55.371186 #37]  INFO -- : Replacing (?-mix:#?logging_collector *=.*) with logging_collector = $db_logging_collector in /etc/postgresql/9.3/main/postgresql.conf
I, [2015-09-16T19:45:55.373703 #37]  INFO -- : Replacing (?-mix:#?log_min_duration_statement *=.*) with log_min_duration_statement = $db_log_min_duration_statement in /etc/postgresql/9.3/main/postgresql.conf
I, [2015-09-16T19:45:55.401660 #37]  INFO -- : Replacing (?-mix:^#local +replication +postgres +peer$) with local replication postgres  peer in /etc/postgresql/9.3/main/pg_hba.conf
I, [2015-09-16T19:45:55.402882 #37]  INFO -- : Replacing (?-mix:^host.*all.*all.*127.*$) with host all all 0.0.0.0/0 md5 in /etc/postgresql/9.3/main/pg_hba.conf
I, [2015-09-16T19:45:55.412008 #37]  INFO -- : > HOME=/var/lib/postgresql USER=postgres exec chpst -u postgres:postgres:ssl-cert -U postgres:postgres:ssl-cert /usr/lib/postgresql/9.3/bin/postmaster -D /etc/postgresql/9.3/main
I, [2015-09-16T19:45:55.435160 #37]  INFO -- : > sleep 5
2015-09-16 19:45:55 UTC [69-1] LOG:  database system was interrupted; last known up at 2015-09-16 19:35:16 UTC
I, [2015-09-16T19:46:00.518385 #37]  INFO -- : 
I, [2015-09-16T19:46:00.520790 #37]  INFO -- : > su postgres -c 'createdb discourse' || true
2015-09-16 19:46:01 UTC [73-1] postgres@postgres FATAL:  the database system is starting up
2015-09-16 19:46:01 UTC [74-1] postgres@template1 FATAL:  the database system is starting up
createdb: could not connect to database template1: FATAL:  the database system is starting up
I, [2015-09-16T19:46:01.467807 #37]  INFO -- : 
I, [2015-09-16T19:46:01.469540 #37]  INFO -- : > su postgres -c 'psql discourse -c "create user discourse;"' || true
2015-09-16 19:46:01 UTC [85-1] postgres@discourse FATAL:  the database system is starting up
psql: FATAL:  the database system is starting up
I, [2015-09-16T19:46:01.783704 #37]  INFO -- : 
I, [2015-09-16T19:46:01.785473 #37]  INFO -- : > su postgres -c 'psql discourse -c "grant all privileges on database discourse to discourse;"' || true
2015-09-16 19:46:01 UTC [96-1] postgres@discourse FATAL:  the database system is starting up
psql: FATAL:  the database system is starting up
I, [2015-09-16T19:46:01.894015 #37]  INFO -- : 
I, [2015-09-16T19:46:01.895739 #37]  INFO -- : > su postgres -c 'psql discourse -c "alter schema public owner to discourse;"'
2015-09-16 19:46:01 UTC [107-1] postgres@discourse FATAL:  the database system is starting up
psql: FATAL:  the database system is starting up
I, [2015-09-16T19:46:02.007073 #37]  INFO -- : 
I, [2015-09-16T19:46:02.008035 #37]  INFO -- : Terminating async processes
I, [2015-09-16T19:46:02.008605 #37]  INFO -- : Sending INT to HOME=/var/lib/postgresql USER=postgres exec chpst -u postgres:postgres:ssl-cert -U postgres:postgres:ssl-cert /usr/lib/postgresql/9.3/bin/postmaster -D /etc/postgresql/9.3/main pid: 66
2015-09-16 19:46:02 UTC [66-1] LOG:  received fast shutdown request
I, [2015-09-16T19:46:12.010101 #37]  INFO -- : HOME=/var/lib/postgresql USER=postgres exec chpst -u postgres:postgres:ssl-cert -U postgres:postgres:ssl-cert /usr/lib/postgresql/9.3/bin/postmaster -D /etc/postgresql/9.3/main pid:66 did not terminate cleanly, forcing termination!


FAILED
--------------------
RuntimeError: su postgres -c 'psql discourse -c "alter schema public owner to discourse;"' failed with return #<Process::Status: pid 97 exit 2>
Location of failure: /pups/lib/pups/exec_command.rb:105:in `spawn'
exec failed with the params "su postgres -c 'psql $db_name -c \"alter schema public owner to $db_user;\"'"
b3196e0243c355adc35816476f63c39ac641d85ab51a9898e94a42715a884d78
** FAILED TO BOOTSTRAP ** please scroll up and look for earlier error messages, there may be more than one

(Sam Saffron) #2

This is going to be tricky… recommend you start a container manually mounting shared and try to run the commands in the container by hand.


(Thomas Purchas) #3

I just re-ran the rebuild, and it worked :confused:. All I did was manually start and stop the container, and this time it worked (I had tried this multiple time before without it working).

This is also not the first time that this has happened on this install. I suspect that I may be related to the Hypervisor that this VM is running in.

Thanks for your help all the same @sam.


(Régis Hanol) #4