PG13 compatibility issue 🔥

What is this about?

This migration introduces NULL NOT DISTINCT on an index on the problem_check_trackers table. DEV: Fix problem check tracker unique index not respecting NULLs by Drenmi · Pull Request #29169 · discourse/discourse · GitHub

What is the problem?

By default, when checking uniqueness on a tuple for the purposes of enforcing a unique index, PostgreSQL considers NULLs to be distinct values. Because of this we could incorrectly have multiple entries with { identifier: "rails_env", target: nil } created due to race conditions. This would then cause errors at runtime.

How does this solve it?

Drop the existing index and recreate it with the NULLS NOT DISTINCT option.

Problem

However, NULLS NOT DISTINCT was introduced in Postgres 15 beta2. The current Postgres version on a standard install is Postgres 13 and that does not support this feature.

Consequences

  • this change will not have any effect on PG13 since the NULLS NOT DISTINCT will be ignored (source)
  • attempting to restore a backup from a PG15 server to a PG13 server will fail with the following error
ERROR:  syntax error at or near "NULLS"
LINE 1: ...m_check_trackers USING btree (identifier, target) NULLS NOT ...
^
EXCEPTION: psql failed:                                                              ^
/var/www/discourse/lib/backup_restore/database_restorer.rb:92:in `restore_dump'

(full line: CREATE UNIQUE INDEX index_problem_check_trackers_on_identifier_and_target ON public.problem_check_trackers USING btree (identifier, target) NULLS NOT DISTINCT;)

@tgxworld @drenmi

9 Likes

@drenmi Looks like we have to revert the migration and reconsider another solution. This is probably causing self hosted installs to error out.

8 Likes

I have a PR up here that should work for PG13 as well. :pray:

We’re still considering how realistic this case is in the wild, and how much we should potentially invest in working around it. :pray:

1 Like

Well, I can’t speak for others and I don’t know whether I am truly representative (probably not) but I came across it twice within 3 days after that change was made…

So a workaround (and revert !) would be much appreciated.

1 Like

Now that we have the workaround, which should work on PG15 as well, we should be able to remove the NULLS NOT DISTINCT.

Out of curiosity, what were you doing that necessitated restoring a PG15 backup on PG13? :slightly_smiling_face: (It won’t affect the task above, just trying to understand what’s happening “in the wild” as much as possible.)

1 Like

We had one client who was attempting to restore a backup (I think they were self hosted and had been trying things beyond their knowledge level :upside_down_face:) , and we had another client who asked us to set up a staging site for custom plugin development purposes and we took a backup from CDCK hosting.

in general the built in versioning mechanism in the backup metadata works really well in being able to proactively determine when something is going to blow up or not, but these kind of situations* are like land mines :slight_smile:

(* Actually, the only other thing I can think of that is not covered by migration versioning is when a migration with an older date stamp is injected into main, but I digress)

3 Likes

Thanks for the info @RGJ! :pray:

The PR for removing the NULLS NOT DISTINCT OPTION is up:

3 Likes

I can see the issue is already resolved but to add an in-the-wild experience: I had this issue trying to restore a backup created on a Discourse hosted instance to a dev container I had set up locally using Docker as part of setting up a dev environment. Seems that Discourse hosting runs PG 15 but the dev environment is 13?

2 Likes

Yeah this is the root of the issue, we need to update our open source container to 15. we will get to it over the next few months.

4 Likes