Upgrade failed

I run Discourse on Debian 11 with Docker as a single container.

I tried to update it using ./launcher rebuild app.

It fails with this message:

I, [2023-01-04T20:53:09.920876 #1]  INFO -- : > cd /var/www/discourse && su discourse -c 'bundle exec rake db:migrate'
rake aborted!

I cannot find a way to get it up and running again.

Any ideas?

I see that the owner is wrong:

drwxr-xr-x 15 sshd             netdev          4096 Jan  4 21:43 .
drwxr-xr-x  3 root             root            4096 Jan  3  2018 ..
drwxr-xr-x  3             1000 www-data        4096 Jan  3  2018 backups
drwxr-xr-x  8 sshd             netdev          4096 Feb  2  2021 letsencrypt
drwxr-xr-x  4 sshd             netdev          4096 Jan  3  2018 log
drwxr-xr-x  2 systemd-timesync systemd-resolve 4096 Jan  3  2018 postgres_backup
drwx------ 19 systemd-timesync systemd-resolve 4096 Jan  4 21:53 postgres_data
drwx------ 19 sshd             netdev          4096 Jan  4 20:49 postgres_data_new
drwxrwsr-x  6 systemd-timesync systemd-resolve 4096 Jan  4 21:53 postgres_run
drwxr-xr-x  2 systemd-resolve  kvm             4096 Jan  4 21:53 redis_data
drwxr-xr-x  2 sshd             netdev          4096 Jan 22  2021 ssl
drwxr-xr-x  2 sshd             netdev          4096 Jan 21  2021 ssl_old
drwxr-xr-x  4 sshd             netdev          4096 Jan  3  2018 state
drwxr-xr-x  4             1000 www-data        4096 Jan  4 21:28 tmp
drwxr-xr-x  4             1000 www-data        4096 Jan  5  2018 uploads
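A note on reading a listing like this one: ls -l resolves owner names from the /etc/passwd of the system it runs on, so the same numeric ID can show up under different names on the host and inside the container. Comparing numeric IDs avoids that ambiguity. The following is a minimal sketch, assuming GNU stat and find (as shipped with Debian); the function name list_numeric_owners is made up for illustration, and the directory to inspect is whatever shared folder you are looking at.

```shell
# Print "uid:gid path" for every entry directly under the given directory.
# Numeric IDs are identical on the host and in the container; names are not.
list_numeric_owners() {
    dir="$1"
    find "$dir" -mindepth 1 -maxdepth 1 -exec stat -c '%u:%g %n' {} + | sort -k2
}
```

Running it once on the host and once inside the container against the same shared folder should show the same numbers, which makes it easier to tell whether ownership really changed or only the displayed names differ.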

I start the container using ./launcher start app. Then I enter the container: ./launcher enter app.

I reset the ownership with chown -R postgres:postgres /shared/

Afterwards it’s corrected. But when I rebuild the app again the owner is wrong again…

This isn’t the error. It will be higher up, so we need to see more of the log.

2023-01-04 20:48:05.355 UTC [41] LOG:  starting PostgreSQL 13.9 (Debian 13.9-1.pgdg110+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
2023-01-04 20:48:05.377 UTC [41] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2023-01-04 20:48:05.377 UTC [41] LOG:  listening on IPv6 address "::", port 5432
2023-01-04 20:48:05.566 UTC [41] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2023-01-04 20:48:05.734 UTC [44] LOG:  database system was shut down at 2023-01-04 20:46:17 UTC
2023-01-04 20:48:05.878 UTC [41] LOG:  database system is ready to accept connections
I, [2023-01-04T20:48:09.779985 #1]  INFO -- :
I, [2023-01-04T20:48:09.780390 #1]  INFO -- : > su postgres -c 'createdb discourse' || true
2023-01-04 20:48:10.014 UTC [54] postgres@postgres ERROR:  database "discourse" already exists
2023-01-04 20:48:10.014 UTC [54] postgres@postgres STATEMENT:  CREATE DATABASE discourse;
createdb: error: database creation failed: ERROR:  database "discourse" already exists
I, [2023-01-04T20:48:10.017003 #1]  INFO -- :
I, [2023-01-04T20:48:10.017425 #1]  INFO -- : > su postgres -c 'psql discourse -c "create user discourse;"' || true
2023-01-04 20:48:10.188 UTC [58] postgres@discourse ERROR:  role "discourse" already exists
2023-01-04 20:48:10.188 UTC [58] postgres@discourse STATEMENT:  create user discourse;
ERROR:  role "discourse" already exists
129:M 04 Jan 2023 20:48:21.224 # Failed listening on port 6379 (TCP), aborting.

I do not see other errors.

:man_shrugging:

Inside the container I tried to start the postgresql service and got an error:

root@server /var/discourse # ./launcher enter app
x86_64 arch detected.
root@discourse:/var/www/discourse# service postgresql start
[FAIL] Starting PostgreSQL 13 database server: main[....] Error: Config owner (postgres:105) and data owner (systemd-timesync:101) do not match, and config owner is not root ... failed!
 failed!
root@discourse:/var/www/discourse#

If you have changed owners of the files inside the shared folder you will break the install. One option is reinstalling and restoring a backup, while the other is manually fixing those owners.

@Falco: thank you!

I changed the owners after the upgrade failed. I found the chown hint somewhere in a post.

How can I create a backup in the current state?

How can I fix the owners manually?

Thanks again!

Inside the container I tried discourse backup. It reports that Redis is not running. In the “current” Redis log I found the following lines at the end:

10316:M 05 Jan 2023 08:05:27.314 # Server initialized
10316:M 05 Jan 2023 08:05:27.314 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
10316:M 05 Jan 2023 08:05:27.314 # Can't handle RDB format version 10
10316:M 05 Jan 2023 08:05:27.314 # Fatal error loading the DB: Invalid argument. Exiting.
10321:C 05 Jan 2023 08:05:28.345 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
10321:C 05 Jan 2023 08:05:28.345 # Redis version=6.2.3, bits=64, commit=00000000, modified=0, pid=10321, just started
10321:C 05 Jan 2023 08:05:28.345 # Configuration loaded
10321:M 05 Jan 2023 08:05:28.346 * monotonic clock: POSIX clock_gettime
10321:M 05 Jan 2023 08:05:28.347 * Running mode=standalone, port=6379.
10321:M 05 Jan 2023 08:05:28.347 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
10321:M 05 Jan 2023 08:05:28.347 # Server initialized
10321:M 05 Jan 2023 08:05:28.347 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
10321:M 05 Jan 2023 08:05:28.348 # Can't handle RDB format version 10
10321:M 05 Jan 2023 08:05:28.348 # Fatal error loading the DB: Invalid argument. Exiting.
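The "Can't handle RDB format version 10" line usually means the dump file was written by a newer Redis than the one trying to load it (6.2.3 here), plausibly by the newer image during the failed rebuild, though the log alone does not prove that. An RDB file starts with the ASCII string REDIS followed by a four-digit format version, so the version can be read directly from the file. This is a minimal sketch; the function name rdb_version is hypothetical, and the location of the dump file (for example a dump.rdb inside the redis_data folder) is an assumption to verify on your own system.

```shell
# Print the four-digit RDB format version from the header of a Redis dump file.
# The header is the 9 ASCII bytes "REDIS" + version, e.g. "REDIS0010".
rdb_version() {
    head -c 9 "$1" | cut -c 6-9
}
```

If the reported version is higher than what the running Redis supports, that Redis will refuse to start until it is upgraded or the dump file is moved aside.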

I fixed the permissions like this (inside the container):

Afterwards I restarted the container with ./launcher restart app. Now I can access Discourse, but it’s the old version 2.8.3, which I tried to upgrade to 3.0.0.beta16 yesterday.

I’m not sure how to proceed to upgrade Discourse.

I think my problem is related to this thread: Problem upgrading multi-site/multi-containers Discourse instance - #5 by jtraulle

I remember that I had upgrade problems before but never investigated them.

./launcher rebuild app

I was able to set the version to 2.9.0.beta2 (commit id: 88a8584348ed93a28286839bfc1c32b06bd50b3f) by setting the commit id as “version” in app.yml. This time the upgrade worked. After that I was able to upgrade to 3.0.0.beta16.
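For reference, in a standard discourse_docker setup the version setting lives in the params section of the container definition, commented out by default; the exact layout of your app.yml may differ. Before rebuilding, it can help to confirm which value is actually active. This is a minimal sketch, and the function name pinned_version is made up for illustration.

```shell
# Print the value of the first uncommented "version:" line in an app.yml-style
# file. Lines such as "#version: tests-passed" are ignored.
pinned_version() {
    grep -E '^[[:space:]]*version:' "$1" | head -n 1 \
        | sed -E 's/^[[:space:]]*version:[[:space:]]*//'
}
```

Called on the container definition file, it prints the pinned commit ID or branch, or nothing if no version is pinned. After the intermediate upgrade succeeds, the pin has to be changed or removed again before rebuilding to the latest version.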

Thanks to all.
