Discourse upgrade error "FAILED TO BOOTSTRAP"

Hi All-

First post here, thanks in advance for having me. Running a routine albeit overdue set of upgrades on our Discourse-powered forum ( https://forum.troygrady.com ), and we’ve hit a point where the upgrade fails after doing the command line “git pull” and “rebuild” steps as advised by the on-screen instructions.

See below for the output of the “./launcher rebuild app” command. I also ran the “discourse doctor” script, and can post or forward a link to that output as well, provided you think that wouldn’t pose too much of a security concern.

I should note that while I’m a reasonably competent PHP / SQL developer with occasional Linux application admin experience, I’m not at all technical with Discourse, and I’m not the one who set up our initial install. Your favorite, I know!

I’m just following the on-screen instructions here, which began with clicking the blue “upgrade” buttons in the UI for the docker. Once that was complete, I saw the on-screen instruction to log in via the command line and run the git pull and launcher rebuild. That’s how I’ve arrived at this point.

I’d also add that our forum was running absolutely perfectly prior to this, with no issues at all, if that helps diagnose. The only reason we’re performing this upgrade is simply to stay current with stuff you guys are releasing so as not to get too far out of date. This is the central conflict of my “ain’t broke, don’t fix it” mentality, fearing that upgrading will cause some error that’s beyond my ability to fix. And indeed, here we are.

As of this typing, the forum is completely offline, and given that this is a core component of our business, I’d love to get it operational as soon as possible.

Any insight greatly appreciated!

FAILED

Pups::ExecError: cd /var/www/discourse && su discourse -c 'bundle exec rake db:migrate' failed with return #<Process::Status: pid 3972 exit 1>
Location of failure: /pups/lib/pups/exec_command.rb:112:in `spawn'
exec failed with the params {"cd"=>"$home", "hook"=>"db_migrate", "cmd"=>["su discourse -c 'bundle exec rake db:migrate'"]}
f89318158c2c276c69a60d600def8a838ae4ad4bc7bafbe665fb1cd77c130ad1
** FAILED TO BOOTSTRAP ** please scroll up and look for earlier error messages, there may be more than one.
./discourse-doctor may help diagnose the problem.

Hey, welcome to the Discourse community.

Where are you installing, and on what OS? Are you following our official guide?

I think you might have earlier error messages indeed, it looks like your database (server) might be inaccessible.

Maybe you should consider managed hosting once you have your forum up and running again…


Hi Gavin! Thanks for the speedy response. We’re on a Droplet via Digital Ocean, and when I log in it looks like:

Welcome to Ubuntu 16.04.6 LTS (GNU/Linux 4.4.0-210-generic x86_64)

As far as using your official guide, that I can’t say. This is an installation that we’ve been running for maybe 3-4 years without issue, though I was not the individual that set it up initially. It has typically only required in-browser upgrades, and the occasional command-line rebuild, which have all worked with essentially no other input from us until now.

I have the entire terminal output of the rebuild saved to a file and can scan that. But we’re in a virtual droplet environment and have not changed anything about it since we set it up. In fact, we rarely log in since Discourse hums along with the in-browser upgrades. So I’m not sure what would have changed to suddenly make the db inaccessible.

Sure. This is a case of things working fine for years on end, so there’s not much twisting our arm to change it. But we’d be happy to hire someone to take a look occasionally at what we have installed and make sure things are up to date, rather than me doing it. Is there a resource or directory for locating Discourse pros who might be open to this sort of thing?

Thanks!

You would need to look at updating that. It's old. Very old.

But let's first get you up and running. Can you post the error log? We need to see what the actual error is.

Ok, grepping for warnings and errors in the rebuild output, here’s what I’ve got (below).

227:initdb: warning: enabling "trust" authentication for local connections
294:update-alternatives: warning: forcing reinstallation of alternative /usr/share/postgresql/13/man/man1/psql.1.gz because link group psql.1.gz is broken
324:update-alternatives: warning: forcing reinstallation of alternative /usr/share/postgresql/13/man/man1/postmaster.1.gz because link group postmaster.1.gz is broken

1684:createdb: error: database creation failed: ERROR: database "discourse" already exists
1811:I, [2021-08-29T20:18:40.246150 #1] INFO -- : > cd /var/www/discourse && bash -c "touch -a /shared/log/rails/{production,production_errors,unicorn.stdout,unicorn.stderr,sidekiq}.log"
1813:I, [2021-08-29T20:18:40.253584 #1] INFO -- : > cd /var/www/discourse && bash -c "ln -s /shared/log/rails/{production,production_errors,unicorn.stdout,unicorn.stderr,sidekiq}.log /var/www/discourse/log"
2563:StandardError: An error has occurred, this and all later migrations canceled:
2698:-- add_column(:groups, :imap_last_error, :text)
2961:** FAILED TO BOOTSTRAP ** please scroll up and look for earlier error messages, there may be more than one.
3118:createdb: error: database creation failed: ERROR: database "discourse" already exists
3245:I, [2021-08-29T20:22:40.262592 #1] INFO -- : > cd /var/www/discourse && bash -c "touch -a /shared/log/rails/{production,production_errors,unicorn.stdout,unicorn.stderr,sidekiq}.log"
3247:I, [2021-08-29T20:22:40.274767 #1] INFO -- : > cd /var/www/discourse && bash -c "ln -s /shared/log/rails/{production,production_errors,unicorn.stdout,unicorn.stderr,sidekiq}.log /var/www/discourse/log"
3960:StandardError: An error has occurred, this and all later migrations canceled:
4087:** FAILED TO BOOTSTRAP ** please scroll up and look for earlier error messages, there may be more than one.
4224:/error -- search for the word 'error'
4358:createdb: error: database creation failed: ERROR: database "discourse" already exists
4485:I, [2021-08-29T20:26:59.373901 #1] INFO -- : > cd /var/www/discourse && bash -c "touch -a /shared/log/rails/{production,production_errors,unicorn.stdout,unicorn.stderr,sidekiq}.log"
4487:I, [2021-08-29T20:26:59.381142 #1] INFO -- : > cd /var/www/discourse && bash -c "ln -s /shared/log/rails/{production,production_errors,unicorn.stdout,unicorn.stderr,sidekiq}.log /var/www/discourse/log"
5200:StandardError: An error has occurred, this and all later migrations canceled:
5327:** FAILED TO BOOTSTRAP ** please scroll up and look for earlier error messages, there may be more than one.

Thanks again for taking a look at this Gavin, and sorry for the neglected OS upgrades.

Above is just what I get when grepping for words “error” and “warning” in the script output, the entirety of which I have saved here. If there’s anything else I should be looking for in that output, just let me know, happy to post.
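For anyone wanting to reproduce that kind of scan, here is a minimal sketch of the grep involved. It assumes the rebuild output was saved to a file; the name rebuild.log and both function names are made up for illustration and are not part of Discourse or the launcher.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for scanning a saved rebuild log.
# Assumes the output was captured with something like:
#   ./launcher rebuild app 2>&1 | tee rebuild.log

# List every line mentioning "error" or "warning", prefixed with its line number.
scan_log() {
  # -n: show line numbers, -i: ignore case, -E: extended regex (alternation)
  grep -n -i -E 'error|warning' "$1"
}

# Show the lines surrounding the point where the migration aborted.
show_failure() {
  # -B 5 / -A 10: five lines of context before and ten after each match
  grep -n -B 5 -A 10 'rake aborted!' "$1"
}

# Usage:
#   scan_log rebuild.log
#   show_failure rebuild.log
```

The case-insensitive match also picks up lines such as "StandardError", which is why the scan above is noisy; the context view is usually the quicker way to find the actual cause.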

Ah, sorry, I realize that output is somewhat less than explanatory. Here’s the detail around the “StandardError” message. Looks like an INSERT query failed because of a duplicate key. Query guts redacted for readability; I can include those if necessary.

I, [2021-08-29T20:23:37.257772 #1] INFO -- : > cd /var/www/discourse && su discourse -c 'bundle exec rake db:migrate'
2021-08-29 20:23:42.937 UTC [3996] discourse@discourse ERROR: duplicate key value violates unique constraint "data_explorer_queries_pkey"
2021-08-29 20:23:42.937 UTC [3996] discourse@discourse DETAIL: Key (id)=(-2) already exists.
2021-08-29 20:23:42.937 UTC [3996] discourse@discourse STATEMENT: INSERT INTO
[…]
FROM plugin_store_rows
WHERE plugin_name = 'discourse-data-explorer' AND type_name = 'JSON'

rake aborted!
StandardError: An error has occurred, this and all later migrations canceled:

ERROR: duplicate key value violates unique constraint "data_explorer_queries_pkey"
DETAIL: Key (id)=(-2) already exists.

try this


Yes, that works (as you have seen).

Something went wrong during the upgrade itself.

Just post your needs in #marketplace or look through it to see who regularly responds there.

Once Discourse is up and running again, all you need to do is:

  • back up Discourse
  • download the backup
  • update the droplet
  • rebuild the app

Then you should be good to go for another few years.
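As a rough sketch of what those four steps can look like on the command line: the script below only prints a checklist and does not touch the server. It assumes a standard install in /var/discourse with a container named "app" and backups in the default standalone location; the host name your-droplet and the local folder name are placeholders, so adjust everything to your own setup before running any of it.

```shell
#!/usr/bin/env bash
# Prints a maintenance checklist; nothing here is executed against the server.
# Assumptions (not from the thread): install in /var/discourse, container "app",
# backups under shared/standalone/backups/default, placeholder host "your-droplet".
print_plan() {
  cat <<'EOF'
# 1. Back up Discourse (from inside the container)
cd /var/discourse
./launcher enter app
discourse backup
exit

# 2. Download the backup (run this from your own machine)
scp -r root@your-droplet:/var/discourse/shared/standalone/backups/default ./discourse-backups

# 3. Update the droplet (release upgrades go one LTS at a time: 16.04 -> 18.04 -> 20.04)
sudo apt update && sudo apt full-upgrade
sudo do-release-upgrade

# 4. Rebuild the app
cd /var/discourse
git pull
./launcher rebuild app
EOF
}

print_plan
```

Taking a droplet snapshot before step 3 is worth considering too, since an OS release upgrade is the riskiest part of the list.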

Awesome, thank you.

If I’m understanding this correctly, it looks like there’s a database table used by a plugin called “data explorer” which has a duplicate row in it, and deleting that duplicate row allows the rebuild script to proceed. From that thread, it also sounds like this error, or similar ones, has happened before, and there may have been updates to Discourse to prevent this from occurring in the future. Meaning, future upgrades we run may not encounter this.

Let me know if you think I’m reading that correctly!

You’re referring to the Ubuntu upgrade? If so, gotcha.

Correct

Correct again :slight_smile:

yes :smiley:

Ok! I have my marching orders. Thanks much for the speedy feedback.


Just to put a cap on this, for anyone who encounters this error, it looks like we had a couple things go wrong.

As far as the upgrade itself, the duplicate row in the “data_explorer” plugin was causing an issue and needed to be removed.

However as part of this we also ran an OS upgrade from Ubuntu 16 to 20, and this caused a networking error, rendering the Digital Ocean droplet unreachable on reboot. Specifically, it looks like as the OS moved to the new “Netplan” setup, something in the config was preventing the network interface boot script from completing at the point when it ran. So the Droplet came up, just not the network. Going in afterward, via the in-browser recovery console, and running the network interface script once more brought up the interface and allowed us to reach it from the outside again. So long as we don’t have to reboot the droplet, we’ll be good until we can find some time to power down and test a fix for the netplan configuration.

I know this is probably an edge case, but I remember reading somewhere that OS upgrades generally don’t cause issues. And while that is probably true most of the time, it wasn’t the case here, and it kept us offline for a good half a day until the time zones aligned to where our technical hired muscle could get us back up again.

So, all resolved for now. Thanks for the speedy input everyone.
