Problem upgrading Discourse

Hi,

I’m upgrading Discourse from v2.3.0.beta8 +212 to 2.4.0.beta1.

First I upgraded the docker manager from the Web UI. Then the web UI told me I needed to upgrade in the command line, so I did that.

I’ve had repeated errors when upgrading. I run

cd /var/discourse
sudo ./launcher rebuild app

It runs for a few minutes, then fails on the database migration. I rebooted my server, which brought Discourse back up (but not upgraded), and tried again. Same error.

Any suggestions for proceeding?

Here’s the last set of lines when I run the rebuild.

Optimizing site icons...
I, [2019-07-09T01:22:18.589503 #13]  INFO -- : Terminating async processes
I, [2019-07-09T01:22:18.589624 #13]  INFO -- : Sending INT to HOME=/var/lib/postgresql USER=postgres exec chpst -u postgres:postgres:ssl-cert -U postgres:postgres:ssl-cert /usr/lib/postgresql/10/bin/postmaster -D /etc/postgresql/10/main pid: 67
I, [2019-07-09T01:22:18.589816 #13]  INFO -- : Sending TERM to exec chpst -u redis -U redis /usr/bin/redis-server /etc/redis/redis.conf pid: 183
2019-07-09 01:22:18.589 UTC [67] LOG:  received fast shutdown request
183:signal-handler (1562635338) Received SIGTERM scheduling shutdown...
2019-07-09 01:22:18.593 UTC [67] LOG:  aborting any active transactions
2019-07-09 01:22:18.599 UTC [67] LOG:  worker process: logical replication launcher (PID 76) exited with exit code 1
2019-07-09 01:22:18.599 UTC [71] LOG:  shutting down
2019-07-09 01:22:18.629 UTC [67] LOG:  database system is shut down
183:M 09 Jul 2019 01:22:18.645 # User requested shutdown...
183:M 09 Jul 2019 01:22:18.645 * Saving the final RDB snapshot before exiting.
183:M 09 Jul 2019 01:22:18.672 * DB saved on disk
183:M 09 Jul 2019 01:22:18.672 # Redis is now ready to exit, bye bye...


FAILED
--------------------
Pups::ExecError: cd /var/www/discourse && su discourse -c 'bundle exec rake db:migrate' failed with return #<Process::Status: pid 366 exit 1>
Location of failure: /pups/lib/pups/exec_command.rb:112:in `spawn'
exec failed with the params {"cd"=>"$home", "hook"=>"db_migrate", "cmd"=>["su discourse -c 'bundle exec rake db:migrate'"]}
cbaaf74d12f5c22faf7f054d391f3570b5e7d8dd3b8bce421c57ef17c4b43c55
** FAILED TO BOOTSTRAP ** please scroll up and look for earlier error messages, there may be more than one

Edit: The only errors in the full log are these

I, [2019-07-09T01:21:35.162142 #13]  INFO -- : > su postgres -c 'createdb discourse' || true
2019-07-09 01:21:35.330 UTC [80] postgres@postgres ERROR:  database "discourse" already exists
2019-07-09 01:21:35.330 UTC [80] postgres@postgres STATEMENT:  CREATE DATABASE discourse;
createdb: database creation failed: ERROR:  database "discourse" already exists
I, [2019-07-09T01:21:35.332706 #13]  INFO -- :
I, [2019-07-09T01:21:35.333101 #13]  INFO -- : > su postgres -c 'psql discourse -c "create user discourse;"' || true
2019-07-09 01:21:35.444 UTC [91] postgres@discourse ERROR:  role "discourse" already exists
2019-07-09 01:21:35.444 UTC [91] postgres@discourse STATEMENT:  create user discourse;
ERROR:  role "discourse" already exists

I notice it quits right after “Optimizing site icons...” in the output above. Maybe the problem is there?
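One way to narrow down a long rebuild log is to filter out the "already exists" notices, since the commands that produce them are run with `|| true` and so do not stop the bootstrap. This is a minimal sketch, assuming the rebuild output was first saved to a file; the file name `rebuild.log` and the function name `show_rebuild_errors` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list lines in a saved rebuild log that mention an
# error or failure, skipping the "already exists" notices that show up on
# every rebuild of an existing install.
#
# Assumption: the output was captured beforehand, for example with
#   ./launcher rebuild app 2>&1 | tee rebuild.log
show_rebuild_errors() {
    log_file=$1
    # -n prefixes each match with its line number in the log,
    # -i ignores case, -E enables the "a|b" alternation.
    grep -n -i -E 'error|failed' "$log_file" | grep -v 'already exists'
}
```

Calling `show_rebuild_errors rebuild.log` prints the remaining matches with their line numbers, which makes it easier to read the lines just before each failure. In this case the filter would only confirm that `db:migrate` failed; it cannot surface an error that was never logged.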

You might try searching for “error role discourse already exists”

I searched for that term and didn’t find anything that helped. Here is what I tried:

  • Some posts mentioned plugins, so I disabled the plugins in app.yml.
  • I had a message about a deprecated version of Docker, so I upgraded it.
  • I ran discourse-doctor.

Same errors.

Attached is the output of launcher rebuild.

Any suggestions for where to go next?

rebuild script.txt (140.9 KB)

It’s worth noting I have a relative URL root in my app.yml. Could that be messing up the upgrade?

env:
  DISCOURSE_RELATIVE_URL_ROOT: /epicenter/support

run:
  - exec: echo "Beginning of custom commands"

  - exec:
        cd: $home
        cmd:
          - mkdir -p public/epicenter/support
          - cd public/epicenter/support && ln -s ../../uploads && ln -s ../../backups
          - rm public/uploads
          - rm public/backups
  - replace:
       global: true
       filename: /etc/nginx/conf.d/discourse.conf
       from: proxy_pass http://discourse;
       to: |
          rewrite ^/(.*)$ /epicenter/support/$1 break;
          proxy_pass http://discourse;
  - replace:
       filename: /etc/nginx/conf.d/discourse.conf
       from: etag off;
       to: |
          etag off;
          location /epicenter/support {
             rewrite ^/epicenter/support/?(.*)$ /$1;
          }
  - replace:
         filename: /etc/nginx/conf.d/discourse.conf
         from: $proxy_add_x_forwarded_for
         to: $http_fastly_client_ip
         global: true

I finally got this working on my third or fourth work session. The issue appeared to be images missing from the “uploads” folder. The solution was to make a new installation, use the same “app.yml” file, and restore from backup with dummy files in place of the missing images.

In parallel to the original problem, I had noticed that various icons and images disappeared after a previous upgrade. When I tried to rebuild, the logs showed that it quit after “Optimizing site icons”. I think it must have gotten stuck on a missing image and quit without logging that specific error. (There was no indication that missing images were the problem, or which image files were missing.)

In the end, I made a new Discourse install with the latest version. I restored from backup following directions here. It took me three tries.

First, the restore script errored out looking for uploaded files, so I copied in the uploads/default folder from my previously backed-up files.
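The copy step can be sketched as a small function. This assumes the old files are reachable on the same machine; the function name `copy_old_uploads` and both paths in the usage note are hypothetical, and the destination depends on where the new install keeps its uploads.

```shell
#!/bin/sh
# Hypothetical helper: copy an old uploads/default folder into the uploads
# directory of the new install, keeping timestamps and permissions.
#   $1 = path to the old "default" folder
#   $2 = path to the new install's uploads directory (assumption: it is
#        writable by the current user)
copy_old_uploads() {
    old_default=$1
    new_uploads=$2
    mkdir -p "$new_uploads" || return 1
    # -R copies the whole tree, -p preserves modes and timestamps.
    cp -R -p "$old_default" "$new_uploads/"
}
```

For example, `copy_old_uploads /path/to/old/uploads/default /path/to/new/uploads` leaves the files under `/path/to/new/uploads/default`.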

I ran the restore script again. This time it gave an error saying it couldn’t find a specific image file. I made a fake image file, gave it the same name, and put it in the specified location.
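Creating the stand-in file can be sketched like this. The function name `make_placeholder` is hypothetical. Whether an empty file is enough, or a valid image has to be copied in, is an assumption to check against the restore output; the optional second argument covers the second case.

```shell
#!/bin/sh
# Hypothetical helper: put a stand-in file at the path the restore reports
# as missing. An existing file is never overwritten.
#   $1 = path of the missing file, exactly as named in the error
#   $2 = optional image to copy in; if omitted, an empty file is created
make_placeholder() {
    missing_path=$1
    stand_in=${2:-}
    if [ -e "$missing_path" ]; then
        return 0    # a real file is already there, leave it alone
    fi
    mkdir -p "$(dirname "$missing_path")" || return 1
    if [ -n "$stand_in" ]; then
        cp "$stand_in" "$missing_path"
    else
        : > "$missing_path"    # create an empty file
    fi
}
```

The function is called once per missing file, with the path copied from the error message, and the restore is then run again.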

Ran the restore script a third time. Voila! My site was restored from backup and on the latest version.
