Upgrade Issues: Failed Upgrade due to Duplicate Key, Failed Snapshot Restore

kirupa · March 10, 2021, 7:55am

I seem to have run into an issue when running the latest upgrade where it fails due to this:

PG::UniqueViolation: ERROR:  duplicate key value violates unique constraint "index_users_on_username_lower"
DETAIL:  Key (username_lower)=(scottie) already exists.

I tried to recover by attempting to delete this user by accessing either Rails console or the Postgres database directly, and I keep getting an error that the server isn’t running.

The following is me attempting to run rails after starting the app:

FAILED

--------------------

Pups::ExecError: cd /var/www/discourse && su discourse -c 'bundle exec rake db:migrate' failed with return #<Process::Status: pid 3743 exit 1>

Location of failure: /pups/lib/pups/exec_command.rb:112:in `spawn'

exec failed with the params {"cd"=>"$home", "hook"=>"db_migrate", "cmd"=>["su discourse -c 'bundle exec rake db:migrate'"]}

c744729efb7a5813699e2fe8fa46ab551352d45ada8dcb9204121508364f8438

** FAILED TO BOOTSTRAP ** please scroll up and look for earlier error messages, there may be more than one.

./discourse-doctor may help diagnose the problem.

root@kForum:/var/discourse# ./launcher start app

starting up existing container

+ /usr/bin/docker start app

app

root@kForum:/var/discourse# ./launcher enter app

root@kForum-app:/var/www/discourse# rails c

bundler: failed to load command: pry (/var/www/discourse/vendor/bundle/ruby/2.7.0/bin/pry)

PG::ConnectionBad: could not connect to server: No such file or directory

Is the server running locally and accepting

connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?

The following is me attempting to run postgres:

root@kForum:/var/discourse# ./launcher enter app

root@kForum-app:/var/www/discourse# su discourse

discourse@kForum-app:/var/www/discourse$ psql discourse

psql: error: could not connect to server: No such file or directory

Is the server running locally and accepting

connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?

discourse@kForum-app:/var/www/discourse$

I am restoring an earlier snapshot in DigitalOcean to get back to a working state, but the strange (scary?) part is that even restoring the snapshot didn’t seem to get the site back and up and running. This snapshot was taken over 24 hours ago when I know the forums were working fine, so it should be unrelated to the upgrade I attempted.

In my restored snapshot, this is the error that I see when I try to stop/start/restart the app:

root@kForum:/var/discourse# ./launcher restart app

+ /usr/bin/docker stop -t 10 app

app

starting up existing container

+ /usr/bin/docker start app

Error response from daemon: container "a1023d05a7b4de25ded1aa69ad49caed9fa59d15fa8e8130d32db82934139e6b": already exists

Error: failed to start containers: app

root@kForum:/var/discourse#

Lastly, I ran discourse-doctor, and this seems like a relevant part of the failure?

Checking cluster versions                                   ok

The source cluster was not shut down cleanly.
Failure, exiting
-------------------------------------------------------------------------------------
UPGRADE OF POSTGRES FAILED

Please visit https://meta.discourse.org/t/postgresql-13-update/172563 for support.

You can run ./launcher start app to restart your app in the meanwhile




FAILED
--------------------
Pups::ExecError: /root/upgrade_postgres failed with return #<Process::Status: pid 46 exit 1>
Location of failure: /pups/lib/pups/exec_command.rb:112:in `spawn'
exec failed with the params "/root/upgrade_postgres"
c98d1b076faed9b0c728cd944b3f2436afc9e266910b82a644c71fd80b42f073
** FAILED TO BOOTSTRAP ** please scroll up and look for earlier error messages, there may be more than one.
./discourse-doctor may help diagnose the problem.
==================== END REBUILD LOG ====================
Failed to rebuild app.

Checking your domain name . . .

Connection to forum.kirupa.com succeeded.
You should probably remove any non-standard plugins and rebuild.
Attempting to restart existing container. . . 

starting up existing container
+ /usr/bin/docker start app
Error response from daemon: driver failed programming external connectivity on endpoint app (78a35e0f12e5af560bd1f991ead900fefd0c81570cc4582fe99aaa5e56b238e8): Error starting userland proxy: listen tcp 0.0.0.0:443: bind: address already in use
Error: failed to start containers: app
Failed to restart the container.


==================== PLUGINS ====================
          - git clone https://github.com/discourse/docker_manager.git
          - git clone https://github.com/discourse/discourse-solved.git
          - git clone https://github.com/discourse/discourse-sitemap.git

No non-official plugins detected.

See https://github.com/discourse/discourse/blob/master/lib/plugin/metadata.rb for the official list.

========================================
Discourse version at forum.kirupa.com: NOT FOUND
Discourse version at localhost: NOT FOUND

Thoughts on what is going on here? Getting the forums back is my first goal. Fixing the upgrade failures is my secondary goal.

Thanks,
Kirupa

GreXXL · March 10, 2021, 9:15am

You should look here - this seems to be the same issue: Help! Upgrade led to complete fail

pfaffman · March 10, 2021, 11:30am

I had what u think was a similar case yesterday (see Help! Upgrade led to complete fail - #8 by Judith). In broad strokes, I think you need to do something like

rename the postgres_data _old back to postgres_data
restart he old container
fix the corrupt index problem by renaming the duplicate users (or perhaps users that are duplicate because they have different capitalization
build the indexes on the user table to make sure they are all fixed
change to the postgres 12 template as described in PostgreSQL 13 update
rebuild to upgrade
when that works, you can change the template back and finish the upgrade.
note that you need to update not only the username but also the username_lower

The one that I fixed yesterday had a duplicate username as old as February 2020, so you likely don’t want to try to fix this by restoring an older database.

To fix it you need to understand how to modify records with the postgres command line , rebuild databases, and deal with complex, but well documented, issues in the postgres 13 upgrade.

If you have a $500 budget you can contact me. My contact info is in my profile.

kirupa · March 10, 2021, 12:54pm

@pfaffman - just pinged you via your contact form

pfaffman · March 10, 2021, 1:42pm

I replied. If you didn’t get it then maybe you put the wrong address?

kirupa · March 11, 2021, 5:17am

To close the loop on this, @pfaffman is a wizard. He solved the problem with the forum upgrade (and a host of related issues) very quickly

steelmaiden · April 25, 2023, 5:40pm

I am having the same issue, but not going to pay $500 for a solution. I have SFTP access to the server, so would it not be possible to just open the database in a notepad and rename the conflicted username?

Or if anyone can fix it for like $50, please contact me.

pfaffman · April 25, 2023, 6:18pm

You can check out PG::UniqueViolation Problem during 3.1.0.beta4 upgrade - #3 by ahmedeldeep where someone described pretty well what they did.

You’ll need an SSH client, not SFTP. If you’re using Digital Ocean you can use the console on their web site.

Topic		Replies	Views
Upgrade failed with "duplicate key value violates unique constraint" error Installation	18	2388	March 30, 2020
2.7.0.beta2 upgrade failed with ERROR: duplicate key Installation	25	2157	February 25, 2021
Urgent, upgraded build failed UniqueViolation Installation	24	1554	February 20, 2021
Upgrade Failed from 2.7.0.beta1 to 2.7.0.beta3 Installation	13	1888	February 15, 2021
Error during upgrading from Beta 3.1.x to latest Installation	3	806	January 20, 2023

Upgrade Issues: Failed Upgrade due to Duplicate Key, Failed Snapshot Restore

Related topics