PostgreSQL 12 update

:warning: WARNING! If your database is very large, you will need a lot of extra disk space (2x database size) and should be very careful with this upgrade!

We just landed the long waited PostgreSQL major version upgrade. Any site admins rebuilding Discourse from the command line will be upgraded to PostgreSQL 12 from the old PostgreSQL 10.

We are running this new version for a while on Meta, and everything is working fine. PostgreSQL 12 brings lots of improvements that will be automatically leveraged by Discourse.

Updating

Official Install Guide (single container)

On you next rebuild, you will see this message at the end:

Upgrade Complete
----------------

To complete the upgrade, rebuild again using:

./launcher rebuild app
-------------------------------------------------------------------------------------

That means everything went well in the upgrade! You just need to issue a new rebuild to get your site back and running.

Data Container Install

If you are running a setup with a dedicated data container based in the sample supplied in our discourse_docker repository, you want to be sure you are shutting down PostgreSQL in a safe and clean way.

Nowadays, we have background jobs running queries spanning several minutes, so shutting down the web container will help the data container be shutdown safely.

./launcher stop web_only
./launcher stop data
./launcher rebuild data
./launcher rebuild data
./launcher rebuild web_only

Before issuing the first rebuild to the data container, you can tail the PostgreSQL log to see if it was shutdown properly.

Running a tail -f shared/data/log/var-log/postgres/current should give you the following log if it was clean:

2020-05-13 18:33:33.457 UTC [36] LOG:  received smart shutdown request
2020-05-13 18:33:33.464 UTC [36] LOG:  worker process: logical replication launcher (PID 52) exited with exit code 1
2020-05-13 18:33:33.465 UTC [47] LOG:  shutting down
2020-05-13 18:33:33.479 UTC [36] LOG:  database system is shut down

Doing a manual update / space constrained environments

:warning::warning::warning:
YOU MUST BACKUP THE POSTGRES_DATA BEFORE TRYING THIS
:warning::warning::warning:

If you are in a constrained space environment without any way to get more space you can try the following:

./launcher stop app #(or both web_only and data if that is your case)
mkdir -p /var/discourse/shared/standalone/postgres_data_new
docker run --rm \
	-v /var/discourse/shared/standalone/postgres_data:/var/lib/postgresql/10/data \
	-v /var/discourse/shared/standalone/postgres_data_new:/var/lib/postgresql/12/data \
	tianon/postgres-upgrade:10-to-12
mv /var/discourse/shared/standalone/postgres_data /var/discourse/shared/standalone/postgres_data_old
mv /var/discourse/shared/standalone/postgres_data_new /var/discourse/shared/standalone/postgres_data
./launcher rebuild app #(or first data and then web_only if that is your case)

On my tests this procedure requires less than 1x your current database size in free space.

Postponing the update

If you need to postpone the update during your next rebuild, you can swap the PostgreSQL template on your app.yml file by changing "templates/postgres.template.yml" to "templates/postgres.10.template.yml".

This is not recommended, as some site admins will forget to revert the change afterwards.

Post update optional tasks

Optimizing PostgreSQL statistics

After the update, the new PostgreSQL won’t have table statistics on hand. You can generate those using:

cd /var/discourse
./launcher enter app
su postgres
psql
\connect discourse
VACUUM VERBOSE ANALYZE;
\q
exit
exit

Or this one-line version of the above:

/var/discourse/launcher run app "echo 'vacuum verbose analyze;' | su postgres -c 'psql discourse'"

Cleaning up old data

You can delete the old data in PG10 format by removing the following folder:

cd /var/discourse
./launcher cleanup

FAQ

The source cluster was not shut down cleanly

If you get a upgrade failed with the above message, you can try a simpler approach to get it back into a better state.

Restart the old container with ./launcher start app. Wait a few minutes until it is back up.

Now shut it down again with ./launcher stop app. After that tail the logs to see if it was a clean one:

tail -f shared/data/log/var-log/postgres/current
2020-05-13 18:33:33.457 UTC [36] LOG:  received smart shutdown request
2020-05-13 18:33:33.464 UTC [36] LOG:  worker process: logical replication launcher (PID 52) exited with exit code 1
2020-05-13 18:33:33.465 UTC [47] LOG:  shutting down
2020-05-13 18:33:33.479 UTC [36] LOG:  database system is shut down

If the logs look like above, you can now try to upgrade again using ./launcher rebuild app.

lc_collate values for database “postgres” do not match

This errors happens if you are using non-default locales for your database. It was reported that you need 3 variables for it to succeed. Ensure that the env: section of you app.yml file has the 3 lines:

  LC_ALL: en_US.UTF-8
  LANG: en_US.UTF-8
  LANGUAGE: en_US.UTF-8

Changing en_US.UTF-8 to your locale.

Every rebuild does the upgrade again aka upgrade loop

When this happens your upgrade logs will contain

mv: cannot move '/shared/postgres_data' to '/shared/postgres_data_old/postgres_data': Directory not empty
mv: cannot move '/shared/postgres_data_new' to '/shared/postgres_data/postgres_data_new': Directory not empty

This means that there are still files from the last upgrade lingering around. Move those elsewhere before continuing.

62 Likes

I encountered a very similar rebuild loop to the one covered in the FAQ, except the error message (and cause) are slightly different.

mv: cannot move '/shared/postgres_data' to '/shared/postgres_data_old': Device or resource busy
mv: inter-device move failed: '/shared/postgres_data_new' to '/shared/postgres_data/postgres_data_new'; unable to remove target: Directory not empty

I suspect the first error is because of nested mounts. From our app.py:

volumes:
  - volume:
      host: /mnt/forum_backup_01/shared/standalone
      guest: /shared
  - volume:
      host: /mnt/forum_backup_01/shared/standalone/log/var-log
      guest: /var/log
  - volume:
      host: /var/postgres_data_discourse
      guest: /shared/postgres_data

If so, I plan to do the following, but am posting here to be sure it isn’t going to break anything:

./launcher stop app
cp -R /var/postgres_data_discourse/* /mnt/forum_backup_01/shared/standalone/postgres_data
vim /var/discourse/containers/app.yml
    # Remove the  /var/postgres_data_discourse docker volume
    # Remove the 10. from "templates/postgres.10.template.yml".
./launcher rebuild app

Is mounting postgres_data like I have in /shared non-standard? I don’t recall why our deployment is configured this way.

1 Like