PostgreSQL 13 update

:warning: WARNING! If your database is very large, you will need a lot of extra disk space (2x database size) and should be very careful with this upgrade!

We’ve just landed changes to upgrade our Docker image to PostgreSQL 13. Any site admins rebuilding Discourse from the command line will be upgraded to PostgreSQL 13 from the previous PostgreSQL 12. Note that if you held back from upgrading when the PostgreSQL 12 update happened back in May, you can skip that upgrade and go straight to PostgreSQL 13.

If you had held back the upgrade previously, change the PostgreSQL template in app.yml from templates/postgres.10.template.yml to templates/postgres.template.yml.

As with any upgrade, it is strongly advised to take a backup before doing anything.

Updating

Official Install Guide (single container)

On you next rebuild, you will see this message at the end:

-------------------------------------------------------------------------------------
UPGRADE OF POSTGRES COMPLETE

Old 12 database is stored at /shared/postgres_data_old

To complete the upgrade, rebuild again using:

./launcher rebuild app
-------------------------------------------------------------------------------------

That means everything went well in the upgrade! You just need to issue a new rebuild to get your site back and running.

Data Container Install

If you are running a setup with a dedicated data container based in the sample supplied in our discourse_docker repository, you want to be sure you are shutting down PostgreSQL in a safe and clean way.

Nowadays, we have background jobs running queries spanning several minutes, so shutting down the web container will help the data container be shutdown safely.

./launcher stop web_only
./launcher stop data
./launcher rebuild data
./launcher rebuild data
./launcher rebuild web_only

Before issuing the first rebuild to the data container, you can tail the PostgreSQL log to see if it was shutdown properly.

Running a tail -f shared/data/log/var-log/postgres/current should give you the following log if it was clean:

2020-05-13 18:33:33.457 UTC [36] LOG:  received smart shutdown request
2020-05-13 18:33:33.464 UTC [36] LOG:  worker process: logical replication launcher (PID 52) exited with exit code 1
2020-05-13 18:33:33.465 UTC [47] LOG:  shutting down
2020-05-13 18:33:33.479 UTC [36] LOG:  database system is shut down

Doing a manual update / space constrained environments

:warning::warning::warning:
YOU MUST BACKUP THE POSTGRES_DATA BEFORE TRYING THIS
:warning::warning::warning:

If you are in a constrained space environment without any way to get more space you can try the following:

./launcher stop app #(or both web_only and data if that is your case)
mkdir -p /var/discourse/shared/standalone/postgres_data_new
docker run --rm \
	-v /var/discourse/shared/standalone/postgres_data:/var/lib/postgresql/12/data \
	-v /var/discourse/shared/standalone/postgres_data_new:/var/lib/postgresql/13/data \
	tianon/postgres-upgrade:12-to-13
mv /var/discourse/shared/standalone/postgres_data /var/discourse/shared/standalone/postgres_data_old
mv /var/discourse/shared/standalone/postgres_data_new /var/discourse/shared/standalone/postgres_data
./launcher rebuild app #(or first data and then web_only if that is your case)

On my tests this procedure requires less than 1x your current database size in free space.

Postponing the update

If you need to postpone the update during your next rebuild, you can swap the PostgreSQL template on your app.yml file by changing "templates/postgres.template.yml" to "templates/postgres.12.template.yml".

This is not recommended, as some site admins will forget to revert the change afterwards.

Post update optional tasks

Optimizing PostgreSQL statistics

After the update, the new PostgreSQL won’t have table statistics on hand. You can generate those using:

cd /var/discourse
./launcher enter app
su postgres
psql
\connect discourse
VACUUM VERBOSE ANALYZE;
\q
exit
exit

Or this one-line version of the above:

/var/discourse/launcher run app "echo 'vacuum verbose analyze;' | su postgres -c 'psql discourse'"

Re-creating the indexes

This upgrade main feature is great file savings in our largest table in every instance, the post_timings table and it’s indexes. After doing a successful update you will need to run a command to re-build the indexes and reap the benefits.

cd /var/discourse
./launcher enter app
su postgres
psql
\connect discourse
REINDEX SCHEMA CONCURRENTLY public;
\q
exit
exit

If you can check the post_timings size before and after the REINDEX that would be a cool stat to share here!

You can use the below query to check the 20 largest data objects, run it before the reindex and after:

WITH RECURSIVE pg_inherit(inhrelid, inhparent) AS
    (select inhrelid, inhparent
    FROM pg_inherits
    UNION
    SELECT child.inhrelid, parent.inhparent
    FROM pg_inherit child, pg_inherits parent
    WHERE child.inhparent = parent.inhrelid),
pg_inherit_short AS (SELECT * FROM pg_inherit WHERE inhparent NOT IN (SELECT inhrelid FROM pg_inherit))
SELECT table_schema
    , TABLE_NAME
    , row_estimate
    , pg_size_pretty(total_bytes) AS total
    , pg_size_pretty(index_bytes) AS INDEX
    , pg_size_pretty(toast_bytes) AS toast
    , pg_size_pretty(table_bytes) AS TABLE
  FROM (
    SELECT *, total_bytes-index_bytes-COALESCE(toast_bytes,0) AS table_bytes
    FROM (
         SELECT c.oid
              , nspname AS table_schema
              , relname AS TABLE_NAME
              , SUM(c.reltuples) OVER (partition BY parent) AS row_estimate
              , SUM(pg_total_relation_size(c.oid)) OVER (partition BY parent) AS total_bytes
              , SUM(pg_indexes_size(c.oid)) OVER (partition BY parent) AS index_bytes
              , SUM(pg_total_relation_size(reltoastrelid)) OVER (partition BY parent) AS toast_bytes
              , parent
          FROM (
                SELECT pg_class.oid
                    , reltuples
                    , relname
                    , relnamespace
                    , pg_class.reltoastrelid
                    , COALESCE(inhparent, pg_class.oid) parent
                FROM pg_class
                    LEFT JOIN pg_inherit_short ON inhrelid = oid
                WHERE relkind IN ('r', 'p')
             ) c
             LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
  ) a
  WHERE oid = parent
) a
ORDER BY total_bytes DESC LIMIT 20;

Cleaning up old data

For a standard install, you can delete the old data in PG12 format with the following command:

cd /var/discourse
./launcher cleanup

If you have a separate data container, you’ll need to remove the backup copy like this:

rm -fr /var/discourse/shared/data/postgres_data_old/

FAQ

The source cluster was not shut down cleanly

If you get a upgrade failed with the above message, you can try a simpler approach to get it back into a better state.

Restart the old container with ./launcher start app. Wait a few minutes until it is back up.

Now shut it down again with ./launcher stop app. After that tail the logs to see if it was a clean one:

tail -f shared/data/log/var-log/postgres/current
2020-05-13 18:33:33.457 UTC [36] LOG:  received smart shutdown request
2020-05-13 18:33:33.464 UTC [36] LOG:  worker process: logical replication launcher (PID 52) exited with exit code 1
2020-05-13 18:33:33.465 UTC [47] LOG:  shutting down
2020-05-13 18:33:33.479 UTC [36] LOG:  database system is shut down

If the logs look like above, you can now try to upgrade again using ./launcher rebuild app.

lc_collate values for database “postgres” do not match

This errors happens if you are using non-default locales for your database. It was reported that you need 3 variables for it to succeed. Ensure that the env: section of you app.yml file has the 3 lines:

  LC_ALL: en_US.UTF-8
  LANG: en_US.UTF-8
  LANGUAGE: en_US.UTF-8

Changing en_US.UTF-8 to your locale.

Every rebuild does the upgrade again aka upgrade loop

When this happens your upgrade logs will contain

mv: cannot move '/shared/postgres_data' to '/shared/postgres_data_old/postgres_data': Directory not empty
mv: cannot move '/shared/postgres_data_new' to '/shared/postgres_data/postgres_data_new': Directory not empty

This means that there are still files from the last upgrade lingering around. Move those elsewhere before continuing.

Upgrade Complete suggestion scripts - do I need to do anything?

Once the upgrade completes, you’ll see output from the pg_upgrade message saying:

Upgrade Complete
----------------
Optimizer statistics are not transferred by pg_upgrade so,
once you start the new server, consider running:
    ./analyze_new_cluster.sh

Running this script will delete the old cluster's data files:
    ./delete_old_cluster.sh

You can safely ignore this message.

I did skip the PostgreSQL 12 update, what to do now?

You can follow the standard instructions at the top of this guide and they will upgrade from your version to 13 without issues.

If you are following the space constrained instructions, adapt the version numbers accordingly.

38 Likes