Upgrade to 3.3 is failing for me

I tried upgrading using the UI, and that failed as described at:

So I restored my droplet from a backup, and then tried upgrading manually.

cd /var/discourse
git pull
./launcher rebuild app

which gives this:

WARNING: Docker version 20.10.7 deprecated, recommend upgrade to 24.0.7 or newer.
x86_64 arch detected.

WARNING: We are about to start downloading the Discourse base image
This process may take anywhere between a few minutes to an hour, depending on your network speed

Please be patient

2.0.20240825-0027: Pulling from discourse/base
e4fff0779e6d: Pulling fs layer 
04dda0e597e7: Pulling fs layer 
0b0ac7902d91: Pulling fs layer 
1ea0327cd622: Waiting 
459f11cf96b2: Waiting 
cd49b55154ee: Waiting 
4f4fb700ef54: Pull complete 
890a63bee26b: Pull complete 
1d239a1092e9: Pull complete 
7439767d748f: Pull complete 
19e63282f9d1: Pull complete 
6da4866029f1: Pull complete 
3274548c87f4: Pull complete 
fb2589b81eef: Pull complete 
da453ab7ba03: Pull complete 
260969aca4e8: Pull complete 
0c7927423a10: Pull complete 
cfdfd8bdc03e: Pull complete 
f837c184a2c0: Pull complete 
d14903daf553: Pull complete 
01422fc4dc02: Pull complete 
e918b15c8f19: Pull complete 
3202b43401af: Pull complete 
3fa0a48e923e: Pull complete 
2f1f96b416a1: Pull complete 
b5376d8069b5: Pull complete 
259e102648be: Pull complete 
807236570b2a: Pull complete 
e98845c05b05: Pull complete 
578a5e3e249f: Pull complete 
6b0bf88c86e8: Pull complete 
9551a14ee15e: Pull complete 
8bbcc4c7a11d: Pull complete 
5aff35532071: Pull complete 
f73f45300530: Pull complete 
42888ce727c0: Pull complete 
e8467a663928: Pull complete 
d2fb91f4643c: Pull complete 
88fc9778a448: Pull complete 
2a19d28a5a17: Pull complete 
6a2d56837370: Pull complete 
933885f686e0: Pull complete 
aecf6df6a6bb: Pull complete 
33fcdcfe61e2: Pull complete 
12726a4d34c8: Pull complete 
Digest: sha256:6de68cb49198b5281f79ed9401b3fe818c854d220dcf0238549fe2f2adb19146
Status: Downloaded newer image for discourse/base:2.0.20240825-0027
docker.io/discourse/base:2.0.20240825-0027
WARNING: containers/app.yml file is world-readable. You can secure this file by running: chmod o-rwx containers/app.yml
Ensuring launcher is up to date
Fetching origin
Launcher is up-to-date
Stopping old container
+ /usr/bin/docker stop -t 600 app
app
2.0.20240825-0027: Pulling from discourse/base
Digest: sha256:6de68cb49198b5281f79ed9401b3fe818c854d220dcf0238549fe2f2adb19146
Status: Image is up to date for discourse/base:2.0.20240825-0027
docker.io/discourse/base:2.0.20240825-0027
/usr/local/lib/ruby/gems/3.3.0/gems/pups-1.2.1/lib/pups.rb
/usr/local/bin/pups --stdin
I, [2024-10-15T06:14:37.390458 #1]  INFO -- : Reading from stdin
I, [2024-10-15T06:14:37.395803 #1]  INFO -- : > echo cron is now included in base image, remove from templates
I, [2024-10-15T06:14:37.398391 #1]  INFO -- : cron is now included in base image, remove from templates

I, [2024-10-15T06:14:37.408024 #1]  INFO -- : File > /etc/service/postgres/run  chmod: +x  chown: 
I, [2024-10-15T06:14:37.412237 #1]  INFO -- : File > /etc/service/postgres/log/run  chmod: +x  chown: 
I, [2024-10-15T06:14:37.416506 #1]  INFO -- : File > /etc/runit/3.d/99-postgres  chmod: +x  chown: 
I, [2024-10-15T06:14:37.420758 #1]  INFO -- : File > /root/install_postgres  chmod: +x  chown: 
I, [2024-10-15T06:14:37.424824 #1]  INFO -- : File > /root/upgrade_postgres  chmod: +x  chown: 
I, [2024-10-15T06:14:37.425837 #1]  INFO -- : Replacing data_directory = '/var/lib/postgresql/13/main' with data_directory = '/shared/postgres_data' in /etc/postgresql/13/main/postgresql.conf
I, [2024-10-15T06:14:37.426590 #1]  INFO -- : Replacing (?-mix:#?listen_addresses *=.*) with listen_addresses = '*' in /etc/postgresql/13/main/postgresql.conf
I, [2024-10-15T06:14:37.427073 #1]  INFO -- : Replacing (?-mix:#?synchronous_commit *=.*) with synchronous_commit = $db_synchronous_commit in /etc/postgresql/13/main/postgresql.conf
I, [2024-10-15T06:14:37.427713 #1]  INFO -- : Replacing (?-mix:#?shared_buffers *=.*) with shared_buffers = $db_shared_buffers in /etc/postgresql/13/main/postgresql.conf
I, [2024-10-15T06:14:37.428194 #1]  INFO -- : Replacing (?-mix:#?work_mem *=.*) with work_mem = $db_work_mem in /etc/postgresql/13/main/postgresql.conf
I, [2024-10-15T06:14:37.428633 #1]  INFO -- : Replacing (?-mix:#?default_text_search_config *=.*) with default_text_search_config = '$db_default_text_search_config' in /etc/postgresql/13/main/postgresql.conf
I, [2024-10-15T06:14:37.429175 #1]  INFO -- : Replacing (?-mix:#?checkpoint_segments *=.*) with checkpoint_segments = $db_checkpoint_segments in /etc/postgresql/13/main/postgresql.conf
I, [2024-10-15T06:14:37.429569 #1]  INFO -- : Replacing (?-mix:#?logging_collector *=.*) with logging_collector = $db_logging_collector in /etc/postgresql/13/main/postgresql.conf
I, [2024-10-15T06:14:37.430001 #1]  INFO -- : Replacing (?-mix:#?log_min_duration_statement *=.*) with log_min_duration_statement = $db_log_min_duration_statement in /etc/postgresql/13/main/postgresql.conf
I, [2024-10-15T06:14:37.430562 #1]  INFO -- : Replacing (?-mix:^#local +replication +postgres +peer$) with local replication postgres  peer in /etc/postgresql/13/main/pg_hba.conf
I, [2024-10-15T06:14:37.430964 #1]  INFO -- : Replacing (?-mix:^host.*all.*all.*127.*$) with host all all 0.0.0.0/0 md5 in /etc/postgresql/13/main/pg_hba.conf
I, [2024-10-15T06:14:37.431353 #1]  INFO -- : Replacing (?-mix:^host.*all.*all.*::1\/128.*$) with host all all ::/0 md5 in /etc/postgresql/13/main/pg_hba.conf
I, [2024-10-15T06:14:37.431673 #1]  INFO -- : > if [ -f /root/install_postgres ]; then
  /root/install_postgres && rm -f /root/install_postgres
elif [ -e /shared/postgres_run/.s.PGSQL.5432 ]; then
  socat /dev/null UNIX-CONNECT:/shared/postgres_run/.s.PGSQL.5432 || exit 0 && echo postgres already running stop container ; exit 1
fi

I, [2024-10-15T06:14:37.974529 #1]  INFO -- : Generating locales (this might take a while)...
Generation complete.

I, [2024-10-15T06:14:37.975013 #1]  INFO -- : > HOME=/var/lib/postgresql USER=postgres exec chpst -u postgres:postgres:ssl-cert -U postgres:postgres:ssl-cert /usr/lib/postgresql/13/bin/postmaster -D /etc/postgresql/13/main
I, [2024-10-15T06:14:37.976577 #1]  INFO -- : Terminating async processes
2024-10-15 06:14:38.136 UTC [36] LOG:  starting PostgreSQL 13.16 (Debian 13.16-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
2024-10-15 06:14:38.138 UTC [36] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2024-10-15 06:14:38.139 UTC [36] LOG:  listening on IPv6 address "::", port 5432
2024-10-15 06:14:38.143 UTC [36] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2024-10-15 06:14:38.154 UTC [37] LOG:  database system was shut down at 2024-10-15 06:14:28 UTC
2024-10-15 06:14:38.176 UTC [36] LOG:  database system is ready to accept connections

at which point the upgrade stops and nothing further happens, and the server is offline (no web connection at all).

What should I try next?

I believe UI upgrades need more memory, because the site keeps running while it is being rebuilt.

Make sure you have enough swap to cope - at least as much as your RAM.

So on a 4GB server, make sure you have 4GB of swap.
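A quick way to check the "swap at least as large as RAM" rule is to compare the two totals in /proc/meminfo. This is only a sketch: the function name `swap_covers_ram` and its optional file argument are my own, and it assumes a Linux host.

```shell
#!/bin/sh
# Sketch: compare total swap with total RAM using /proc/meminfo (Linux only).
# The function name and the optional file argument are illustrative.
swap_covers_ram() {
    meminfo="${1:-/proc/meminfo}"
    # MemTotal and SwapTotal are reported in kB.
    awk '
        /^MemTotal:/  { mem = $2 }
        /^SwapTotal:/ { swap = $2 }
        END {
            printf "RAM: %d MB, swap: %d MB\n", mem / 1024, swap / 1024
            exit (swap >= mem) ? 0 : 1
        }
    ' "$meminfo"
}

# Usage on the server:
#   swap_covers_ram && echo "swap OK" || echo "swap is smaller than RAM"
```

The function exits non-zero when swap is smaller than RAM, so it can also be used as a pre-flight check before a rebuild.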

Btw the git pull here is redundant. The build script does this for you.


My server has 4GB of RAM and 4GB of swap. The rebuild just stops at “database system is ready to accept connections” and does not complete, and the forum remains down. I can restore it from the droplet backup (again), which will take me back to a working 3.2 forum, but it’d be better to resolve this.

# free
              total        used        free      shared  buff/cache   available
Mem:           3919         286        1443          20        2189        3360
Swap:          4095           2        4093
Total:         8015         288        5537

In these circumstances I usually reboot (at my own risk) but that’s not yet failed me. At least that gets the site back up.

Make sure your OS is a reasonably up-to-date LTS release. That Docker warning also seems worth taking note of.

Monitor your memory during rebuild with htop to be sure?
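If you would rather have a scrolling log than watch htop, something like the following could be run in a second terminal during the rebuild. It is a sketch only: `mem_snapshot` is a name I made up, and it assumes a Linux host where /proc/meminfo reports MemAvailable and SwapFree.

```shell
#!/bin/sh
# Sketch: print one timestamped line with available RAM and free swap, in MB.
# Reads /proc/meminfo (Linux only); the optional file argument is illustrative.
mem_snapshot() {
    meminfo="${1:-/proc/meminfo}"
    awk -v now="$(date '+%H:%M:%S')" '
        /^MemAvailable:/ { avail = $2 }
        /^SwapFree:/     { swapfree = $2 }
        END { printf "%s available: %d MB, swap free: %d MB\n", now, avail / 1024, swapfree / 1024 }
    ' "$meminfo"
}

# Usage in a second terminal while the rebuild runs (Ctrl-C to stop):
#   while true; do mem_snapshot; sleep 5; done
```

If available memory and free swap both stay high while the build sits still, memory pressure is probably not the cause of the stall.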

Looks like loads of swap though. :+1:


My OS is Ubuntu 20.04.6 LTS.

Huh. OK, I rebooted and the forum came back up.

Ahh, but it still says it is 3.2.4 installed, latest is 3.3.2, so it is not updated.

So I’m not sure where that leaves me. I’ll try the update again.

Yep, just stops at the same place. I presume a newer version of Docker would require updating the major version of Ubuntu, which I guess I could do, I just wasn’t planning to do it at this time.

I would update your OS and docker just to be sure.
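To see how far the installed Docker is from the 24.0.7 minimum named in the launcher warning, the two version strings can be compared with a version sort. This is a sketch: `version_ge` is a helper name I chose, and it assumes a `sort` that supports `-V` (GNU coreutils does).

```shell
#!/bin/sh
# version_ge A B: succeeds when version A >= version B.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage on the server (24.0.7 is the minimum named in the launcher warning):
#   current=$(docker version --format '{{.Server.Version}}')
#   version_ge "$current" 24.0.7 || echo "Docker $current is older than 24.0.7"
```

With the 20.10.7 reported in the log above, the check fails, which matches the deprecation warning.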

There are sometimes big delays in the build script at various points. How long is how long?


It was stuck at that point when I came back after 15-20 minutes. Since then I haven’t waited particularly long. It’s possible it is just doing something, but there isn’t any evidence of that. Nothing is using any appreciable CPU.


I think you have two choices at this point:

  • In place OS and docker upgrade - then retry
  • New Droplet.

Latter might end up being quicker.

Well, I’ve got a current snapshot, so I suppose doing an Ubuntu upgrade won’t hurt at this point. I can always undo it all.


Ahh, it refuses because

Sorry, this storage driver is not supported in kernels for newer 
releases 

There will not be any further Ubuntu releases that provide kernel 
support for the aufs storage driver. 

Please ensure that none of your containers are using the aufs storage 
driver, remove the directory /var/lib/docker/aufs and try again. 

Sigh. Nothing is ever easy is it?


New Droplet. :).

(And don’t forget to recreate swap on new server if not already created)


OK, well that whole upgrade process can go down as a bit of an unmitigated disaster.

I will revert the droplet to what I had before (Ubuntu 20.04.6 LTS and Discourse 3.2.4), put my head in the sand, forget all about 3.3, and try again another day.

Thanks for trying to help.


Bonus feature - when I restore the droplet, for some reason I get logged out - and you can’t log in, even as admin while the site is in read only mode!


It just occurs to me: I cannot switch Docker over to overlay2 (as described in “Change the Docker storage backend”) in order to be able to update Ubuntu, because switching to overlay2 will require ./launcher rebuild app, which fails (unless switching to overlay2 itself resolves the issue, but that seems unlikely). So migrating to a new droplet seems like the only plausible way forward, though that will presumably require DNS changes, which are generally quite slow. Ugh.
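For anyone else hitting the same Ubuntu upgrader message, the storage driver Docker is currently using can be read from `docker info`. The sketch below wraps that in a small check. The helper name `driver_blocks_upgrade` is made up, and it treats only aufs as blocking because that is the one driver the upgrader message names.

```shell
#!/bin/sh
# Sketch: decide whether a Docker storage driver blocks the Ubuntu release upgrade.
# Only aufs is treated as blocking, since that is the driver the upgrader names.
driver_blocks_upgrade() {
    case "$1" in
        aufs) return 0 ;;
        *)    return 1 ;;
    esac
}

# Usage on the server:
#   driver=$(docker info --format '{{.Driver}}')
#   if driver_blocks_upgrade "$driver"; then
#       echo "Storage driver is $driver; migrate to overlay2 before upgrading Ubuntu"
#   fi
```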


You might be able to restart the container with

  docker start app

You might also be able to solve your problem with

 apt install docker-ce docker-ce-cli

Not sure about the issue with the overlay. Have you done an OS upgrade from an older Ubuntu at some point?

If you’re on DigitalOcean, you can create a static IP pointing to the old server and update DNS to point to it. Then, when you move to the new server, there is no DNS lag, since you can redirect the IP to the new server.


Yes, that restarted the container, but without any upgrade.

I don’t know what happens after “database system is ready to accept connections” in the upgrade process, but that’s as far as it gets, and it never goes any further (unless the next step takes a very long time).

Yes, Ubuntu has been updated previously. So Docker is on aufs, always has been.

Is it possible to rebuild without upgrading? If that works, then possibly I can get the current Docker switched from aufs to overlay2, which would mean I can potentially upgrade Ubuntu to 22, which might mean that other things then work. But at the moment I have no idea why the upgrade stalls at that point, so it’s pretty much just hope as to what might resolve it.

No, a rebuild will upgrade to the latest commit on the set branch.

Remember migration to a new server might take as little as 30 mins.

Yes, it looks like I’ll have to go that route - colour me sentimental but I didn’t really want to have to set up a whole new server (which runs a few other things as well as Discourse which will also have to be migrated) just to upgrade the forum.

Ahh well, such is the life of anyone running their own server.


Yes, it seems to just hang there for some reason. I don’t know why, but doing the Docker upgrade seems to have helped a few sites.

Did you try to upgrade docker as I mentioned above?

You could conceivably fix the overlay if you were to try searching elsewhere. That’s why I don’t trust OS upgrades.

I haven’t, but I’ll try that next - I’ll have to schedule another maintenance period - my process is always to shut the droplet down and do a snapshot which takes a fair while and is the largest part of the down time, but ensures I can very easily revert the process which has come in handy a couple times.

So I’ll schedule another maintenance window for Tuesday and then try updating docker (and then if that works I’ll try switching to overlay as well). And if it fails, I’ll move on to trying out a new server the following week (or maybe I’ll do that in parallel because I can set up a new server on a test droplet anyway).

Thanks.