Site offline after rebuild (4th Feb 2025)

I saw this message after a recent rebuild. I then ran ./launcher rebuild app, but afterward, my instance became inaccessible. It’s a standard install—how can I determine what happened?

These are the errors I see when I run ./launcher logs app:

cd /var/discourse
./launcher logs app
x86_64 arch detected.
run-parts: executing /etc/runit/1.d/00-ensure-links
run-parts: executing /etc/runit/1.d/00-fix-var-logs
run-parts: executing /etc/runit/1.d/01-cleanup-web-pids
run-parts: executing /etc/runit/1.d/anacron
run-parts: executing /etc/runit/1.d/cleanup-pids
Cleaning stale PID files
run-parts: executing /etc/runit/1.d/copy-env
run-parts: executing /etc/runit/1.d/letsencrypt
[Tue Feb  4 05:38:16 PM UTC 2025] Domains not changed.
[Tue Feb  4 05:38:16 PM UTC 2025] Skip, Next renewal time is: 2025-03-02T20:15:28Z
[Tue Feb  4 05:38:16 PM UTC 2025] Add '--force' to force to renew.
[Tue Feb  4 05:38:17 PM UTC 2025] Installing key to: /shared/ssl/mydomain.com.key
[Tue Feb  4 05:38:17 PM UTC 2025] Installing full chain to: /shared/ssl/mydomain.com.cer
[Tue Feb  4 05:38:17 PM UTC 2025] Run reload cmd: sv reload nginx
warning: nginx: unable to open supervise/ok: file does not exist
[Tue Feb  4 05:38:17 PM UTC 2025] Reload error for :
[Tue Feb  4 05:38:17 PM UTC 2025] Domains not changed.
[Tue Feb  4 05:38:17 PM UTC 2025] Skip, Next renewal time is: 2025-03-02T20:15:33Z
[Tue Feb  4 05:38:17 PM UTC 2025] Add '--force' to force to renew.
[Tue Feb  4 05:38:18 PM UTC 2025] Installing key to: /shared/ssl/mydomain.com_ecc.key
[Tue Feb  4 05:38:18 PM UTC 2025] Installing full chain to: /shared/ssl/mydomain.com_ecc.cer
[Tue Feb  4 05:38:18 PM UTC 2025] Run reload cmd: sv reload nginx
warning: nginx: unable to open supervise/ok: file does not exist
[Tue Feb  4 05:38:18 PM UTC 2025] Reload error for :
Started runsvdir, PID is 567
ok: run: redis: (pid 577) 0s
ok: run: postgres: (pid 581) 0s
nginx: [warn] duplicate extension "wasm", content type: "application/wasm", previous content type: "application/wasm" in /etc/nginx/conf.d/discourse.conf:4
supervisor pid: 575 unicorn pid: 607
Shutting Down
run-parts: executing /etc/runit/3.d/01-nginx
ok: down: nginx: 1s, normally up
run-parts: executing /etc/runit/3.d/02-unicorn
(575) exiting
ok: down: unicorn: 0s, normally up
run-parts: executing /etc/runit/3.d/10-redis
ok: down: redis: 1s, normally up
run-parts: executing /etc/runit/3.d/99-postgres
ok: down: postgres: 0s, normally up
ok: down: nginx: 5s, normally up
ok: down: postgres: 1s, normally up
ok: down: redis: 3s, normally up
ok: down: cron: 0s, normally up
ok: down: unicorn: 4s, normally up
ok: down: rsyslog: 0s, normally up
run-parts: executing /etc/runit/1.d/00-ensure-links
run-parts: executing /etc/runit/1.d/00-fix-var-logs
run-parts: executing /etc/runit/1.d/01-cleanup-web-pids
run-parts: executing /etc/runit/1.d/anacron
run-parts: executing /etc/runit/1.d/cleanup-pids
Cleaning stale PID files
run-parts: executing /etc/runit/1.d/copy-env
run-parts: executing /etc/runit/1.d/letsencrypt
[Tue Feb  4 05:58:32 PM UTC 2025] Domains not changed.
[Tue Feb  4 05:58:32 PM UTC 2025] Skip, Next renewal time is: 2025-03-02T20:15:28Z
[Tue Feb  4 05:58:32 PM UTC 2025] Add '--force' to force to renew.
[Tue Feb  4 05:58:32 PM UTC 2025] Installing key to: /shared/ssl/mydomain.com.key
[Tue Feb  4 05:58:32 PM UTC 2025] Installing full chain to: /shared/ssl/mydomain.com.cer
[Tue Feb  4 05:58:32 PM UTC 2025] Run reload cmd: sv reload nginx
fail: nginx: runsv not running
[Tue Feb  4 05:58:32 PM UTC 2025] Reload error for :
[Tue Feb  4 05:58:32 PM UTC 2025] Domains not changed.
[Tue Feb  4 05:58:32 PM UTC 2025] Skip, Next renewal time is: 2025-03-02T20:15:33Z
[Tue Feb  4 05:58:32 PM UTC 2025] Add '--force' to force to renew.
[Tue Feb  4 05:58:32 PM UTC 2025] Installing key to: /shared/ssl/mydomain.com_ecc.key
[Tue Feb  4 05:58:32 PM UTC 2025] Installing full chain to: /shared/ssl/mydomain.com_ecc.cer
[Tue Feb  4 05:58:32 PM UTC 2025] Run reload cmd: sv reload nginx
fail: nginx: runsv not running
[Tue Feb  4 05:58:32 PM UTC 2025] Reload error for :
Started runsvdir, PID is 561
ok: run: redis: (pid 575) 0s
nginx: [warn] duplicate extension "wasm", content type: "application/wasm", previous content type: "application/wasm" in /etc/nginx/conf.d/discourse.conf:4
ok: run: postgres: (pid 580) 1s
supervisor pid: 570 unicorn pid: 601
Shutting Down
run-parts: executing /etc/runit/3.d/01-nginx
ok: down: nginx: 0s, normally up
run-parts: executing /etc/runit/3.d/02-unicorn
(570) exiting
ok: down: unicorn: 1s, normally up
run-parts: executing /etc/runit/3.d/10-redis
ok: down: redis: 0s, normally up
run-parts: executing /etc/runit/3.d/99-postgres
ok: down: postgres: 0s, normally up
ok: down: nginx: 3s, normally up
ok: down: postgres: 1s, normally up
ok: down: redis: 1s, normally up
ok: down: cron: 0s, normally up
ok: down: unicorn: 3s, normally up
ok: down: rsyslog: 0s, normally up
run-parts: executing /etc/runit/1.d/00-ensure-links
run-parts: executing /etc/runit/1.d/00-fix-var-logs
run-parts: executing /etc/runit/1.d/01-cleanup-web-pids
run-parts: executing /etc/runit/1.d/anacron
run-parts: executing /etc/runit/1.d/cleanup-pids
Cleaning stale PID files
run-parts: executing /etc/runit/1.d/copy-env
run-parts: executing /etc/runit/1.d/letsencrypt
[Tue Feb  4 06:01:07 PM UTC 2025] Domains not changed.
[Tue Feb  4 06:01:07 PM UTC 2025] Skip, Next renewal time is: 2025-03-02T20:15:28Z
[Tue Feb  4 06:01:07 PM UTC 2025] Add '--force' to force to renew.
[Tue Feb  4 06:01:07 PM UTC 2025] Installing key to: /shared/ssl/mydomain.com.key
[Tue Feb  4 06:01:07 PM UTC 2025] Installing full chain to: /shared/ssl/mydomain.com.cer
[Tue Feb  4 06:01:07 PM UTC 2025] Run reload cmd: sv reload nginx
fail: nginx: runsv not running
[Tue Feb  4 06:01:07 PM UTC 2025] Reload error for :
[Tue Feb  4 06:01:07 PM UTC 2025] Domains not changed.
[Tue Feb  4 06:01:07 PM UTC 2025] Skip, Next renewal time is: 2025-03-02T20:15:33Z
[Tue Feb  4 06:01:07 PM UTC 2025] Add '--force' to force to renew.
[Tue Feb  4 06:01:07 PM UTC 2025] Installing key to: /shared/ssl/mydomain.com_ecc.key
[Tue Feb  4 06:01:07 PM UTC 2025] Installing full chain to: /shared/ssl/mydomain.com_ecc.cer
[Tue Feb  4 06:01:07 PM UTC 2025] Run reload cmd: sv reload nginx
fail: nginx: runsv not running
[Tue Feb  4 06:01:07 PM UTC 2025] Reload error for :
Started runsvdir, PID is 561
ok: run: redis: (pid 575) 0s
ok: run: postgres: (pid 576) 0s
nginx: [warn] duplicate extension "wasm", content type: "application/wasm", previous content type: "application/wasm" in /etc/nginx/conf.d/discourse.conf:4
supervisor pid: 570 unicorn pid: 601
(570) exiting
nginx: [warn] duplicate extension "wasm", content type: "application/wasm", previous content type: "application/wasm" in /etc/nginx/conf.d/discourse.conf:4
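
A log like this is easier to read once you know how many times the container booted and how many times it logged a shutdown. Below is a minimal Python sketch for that, assuming the output of ./launcher logs app has been saved to a text file. The function name and the choice of marker strings are my own, copied from lines in this log; they are not part of any Discourse tool.

```python
from typing import Dict, Iterable

# Marker strings copied from the log output above. Treating them as
# boot/shutdown indicators is an assumption of this sketch.
BOOT_MARKER = "Started runsvdir"
SHUTDOWN_MARKER = "Shutting Down"


def summarize_cycles(lines: Iterable[str]) -> Dict[str, int]:
    """Count boot and shutdown markers in saved launcher log lines."""
    boots = 0
    shutdowns = 0
    for line in lines:
        if line.startswith(BOOT_MARKER):
            boots += 1
        elif line.strip() == SHUTDOWN_MARKER:
            shutdowns += 1
    return {"boots": boots, "shutdowns": shutdowns}


if __name__ == "__main__":
    # A few lines taken from the log above, in order.
    sample = [
        "Started runsvdir, PID is 567",
        "supervisor pid: 575 unicorn pid: 607",
        "Shutting Down",
        "Started runsvdir, PID is 561",
        "supervisor pid: 570 unicorn pid: 601",
        "Shutting Down",
        "Started runsvdir, PID is 561",
    ]
    print(summarize_cycles(sample))
```

Counted by hand, the full log above has three boots and two shutdowns. That only shows the container was started more than once, which could simply reflect repeated rebuilds or restarts. It does not say why the site is unreachable.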
3 Likes

Everything was going smoothly. I saw the following and rebuilt. The build completed without errors, but my site won’t open.

-------------------------------------------------------------------------------------
UPGRADE OF POSTGRES COMPLETE

Old 13 database is stored at /shared/postgres_data_old

To complete the upgrade, rebuild again using:

./launcher rebuild app
-------------------------------------------------------------------------------------

When I run this:
tail /var/discourse/shared/standalone/log/var-log/postgres/current
I get this output:

2025-02-04 18:11:50.943 UTC [573] LOG:  shutting down
2025-02-04 18:11:50.945 UTC [573] LOG:  checkpoint starting: shutdown immediate
2025-02-04 18:11:50.970 UTC [573] LOG:  checkpoint complete: wrote 139 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.017 s, sync=0.005 s, total=0.027 s; sync files=27, longest=0.002 s, average=0.001 s; distance=410 kB, estimate=410 kB
2025-02-04 18:11:51.034 UTC [547] LOG:  database system is shut down
2025-02-04 18:15:04.302 UTC [548] LOG:  starting PostgreSQL 15.10 (Debian 15.10-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
2025-02-04 18:15:04.303 UTC [548] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2025-02-04 18:15:04.303 UTC [548] LOG:  listening on IPv6 address "::", port 5432
2025-02-04 18:15:04.305 UTC [548] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2025-02-04 18:15:04.313 UTC [575] LOG:  database system was shut down at 2025-02-04 18:14:37 UTC
2025-02-04 18:15:04.318 UTC [548] LOG:  database system is ready to accept connections
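
To check what that Postgres log says without reading it by eye, here is a rough Python sketch that pulls out the version from the most recent "starting PostgreSQL" line and whether a "ready to accept connections" line follows it. The function name and the regular expression are my own, based only on the line formats shown above; other Postgres builds may log differently.

```python
import re
from typing import Iterable, Optional, Tuple

# Patterns based on the log lines shown above; the exact wording is an
# assumption and may differ between Postgres versions or locales.
STARTING_RE = re.compile(r"LOG:\s+starting PostgreSQL (\d+)\.(\d+)")
READY_TEXT = "database system is ready to accept connections"


def last_postgres_start(lines: Iterable[str]) -> Tuple[Optional[str], bool]:
    """Return (version, ready) for the most recent start found in the lines."""
    version: Optional[str] = None
    ready = False
    for line in lines:
        match = STARTING_RE.search(line)
        if match:
            version = f"{match.group(1)}.{match.group(2)}"
            ready = False  # a newer start resets the readiness flag
        elif version is not None and READY_TEXT in line:
            ready = True
    return version, ready


if __name__ == "__main__":
    sample = [
        "2025-02-04 18:11:51.034 UTC [547] LOG:  database system is shut down",
        "2025-02-04 18:15:04.302 UTC [548] LOG:  starting PostgreSQL 15.10 "
        "(Debian 15.10-1.pgdg120+1) on x86_64-pc-linux-gnu",
        "2025-02-04 18:15:04.318 UTC [548] LOG:  database system is ready "
        "to accept connections",
    ]
    print(last_postgres_start(sample))
```

For the log above, the last start is PostgreSQL 15.10 and it reports being ready to accept connections. As far as this log shows, the database came up on the new version.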

Also, ./launcher logs app gives the following output:

x86_64 arch detected.
run-parts: executing /etc/runit/1.d/00-ensure-links
run-parts: executing /etc/runit/1.d/00-fix-var-logs
run-parts: executing /etc/runit/1.d/01-cleanup-web-pids
run-parts: executing /etc/runit/1.d/anacron
run-parts: executing /etc/runit/1.d/cleanup-pids
Cleaning stale PID files
run-parts: executing /etc/runit/1.d/copy-env
run-parts: executing /etc/runit/1.d/letsencrypt
[Tue Feb  4 06:15:03 PM UTC 2025] Domains not changed.
[Tue Feb  4 06:15:03 PM UTC 2025] Skip, Next renewal time is: 2025-02-09T00:30:10Z
[Tue Feb  4 06:15:03 PM UTC 2025] Add '--force' to force to renew.
[Tue Feb  4 06:15:03 PM UTC 2025] Installing key to: /shared/ssl/forum.myforum.com.key
[Tue Feb  4 06:15:03 PM UTC 2025] Installing full chain to: /shared/ssl/forum.myforum.com.cer
[Tue Feb  4 06:15:03 PM UTC 2025] Run reload cmd: sv reload nginx
warning: nginx: unable to open supervise/ok: file does not exist
[Tue Feb  4 06:15:03 PM UTC 2025] Reload error for :
[Tue Feb  4 06:15:03 PM UTC 2025] Domains not changed.
[Tue Feb  4 06:15:03 PM UTC 2025] Skip, Next renewal time is: 2025-02-09T00:30:15Z
[Tue Feb  4 06:15:03 PM UTC 2025] Add '--force' to force to renew.
[Tue Feb  4 06:15:04 PM UTC 2025] Installing key to: /shared/ssl/forum.myforum.com_ecc.key
[Tue Feb  4 06:15:04 PM UTC 2025] Installing full chain to: /shared/ssl/forum.myforum.com_ecc.cer
[Tue Feb  4 06:15:04 PM UTC 2025] Run reload cmd: sv reload nginx
warning: nginx: unable to open supervise/ok: file does not exist
[Tue Feb  4 06:15:04 PM UTC 2025] Reload error for :
Started runsvdir, PID is 537
ok: run: redis: (pid 552) 0s
ok: run: postgres: (pid 548) 0s
nginx: [warn] duplicate extension "wasm", content type: "application/wasm", previous content type: "application/wasm" in /etc/nginx/conf.d/discourse.conf:4
supervisor pid: 546 unicorn pid: 579
2 Likes

This is happening on my two self-hosted sites as well, after updating from the command line today. These are very vanilla installs with no customization or unofficial plugins, kept up to date regularly, and they typically update without difficulty.

I am currently trying @mwaniki’s suggestions above and will see how it works out, and report back here.

2 Likes

I performed another rebuild of the app, and the update completed successfully, but even though no errors are visible, the site cannot be reached. Any ideas?

./launcher logs app


WARNING: Docker version 20.10.12 deprecated, recommend upgrade to 24.0.7 or newer.
x86_64 arch detected.
WARNING: containers/app.yml file is world-readable. You can secure this file by running: chmod o-rwx containers/app.yml
run-parts: executing /etc/runit/1.d/00-ensure-links
run-parts: executing /etc/runit/1.d/00-fix-var-logs
run-parts: executing /etc/runit/1.d/01-cleanup-web-pids
run-parts: executing /etc/runit/1.d/anacron
run-parts: executing /etc/runit/1.d/cleanup-pids
Cleaning stale PID files
run-parts: executing /etc/runit/1.d/copy-env
run-parts: executing /etc/runit/1.d/letsencrypt
[Tue Feb  4 07:12:15 PM UTC 2025] Domains not changed.
[Tue Feb  4 07:12:15 PM UTC 2025] Skip, Next renewal time is: 2025-03-06T00:39:07Z
[Tue Feb  4 07:12:15 PM UTC 2025] Add '--force' to force to renew.
[Tue Feb  4 07:12:16 PM UTC 2025] Installing key to: /shared/ssl/forum.******.com.key
[Tue Feb  4 07:12:16 PM UTC 2025] Installing full chain to: /shared/ssl/forum.*****.com.cer
[Tue Feb  4 07:12:16 PM UTC 2025] Run reload cmd: sv reload nginx
warning: nginx: unable to open supervise/ok: file does not exist
[Tue Feb  4 07:12:16 PM UTC 2025] Reload error for :
[Tue Feb  4 07:12:16 PM UTC 2025] Domains not changed.
[Tue Feb  4 07:12:16 PM UTC 2025] Skip, Next renewal time is: 2025-03-06T00:39:11Z
[Tue Feb  4 07:12:16 PM UTC 2025] Add '--force' to force to renew.
[Tue Feb  4 07:12:16 PM UTC 2025] Installing key to: /shared/ssl/forum.*****.com_ecc.key
[Tue Feb  4 07:12:16 PM UTC 2025] Installing full chain to: /shared/ssl/forum.*****.com_ecc.cer
[Tue Feb  4 07:12:16 PM UTC 2025] Run reload cmd: sv reload nginx
warning: nginx: unable to open supervise/ok: file does not exist
[Tue Feb  4 07:12:16 PM UTC 2025] Reload error for :
Started runsvdir, PID is 535
ok: run: redis: (pid 545) 0s
nginx: [warn] duplicate extension "wasm", content type: "application/wasm", previous content type: "application/wasm" in /etc/nginx/conf.d/discourse.conf:4
ok: run: postgres: (pid 548) 0s
supervisor pid: 542 unicorn pid: 575
2 Likes

I think this “site not responding at all” is unrelated to the postgres update. Looking into it right now :eyes:

5 Likes

I am eagerly waiting for this problem to be solved. I am receiving hundreds of emails from my users asking why they cannot access the forum :frowning:

4 Likes

Sorry for the disruption everyone! The fix is now live, so running another ./launcher rebuild app should bring things back online.

Please let us know if you’re still seeing any issues after that.

12 Likes

at work :slight_smile:

4 Likes

Yippee, it’s working! The forum is back to life. Thank you for the quick solution. That’s why we love Discourse :heart:

4 Likes

Thanks for fixing this! I also spent the last 2+ hours trying to figure out what went wrong with my rebuild. All is good now!

A meta-observation: when researching the issue in this forum, the default “Relevance” sorting of forum search results worked against me. The error logs on which I searched were exactly the same as here, but this recent topic only appeared many pages down in the results list (probably due to its recency). Hence, I didn’t find this topic until I haphazardly opened the Meta front page where it was trending. I guess this is a note to self/others to also check the front page or most recent results while trying to research a future rebuild issue.

6 Likes

That’s great feedback! Note that this conversation was initially happening in the PostgreSQL 15 update topic, until David realized it was not related to the PostgreSQL update and moved the relevant posts to a new topic. That only happened about an hour ago, so you would not have been able to find it until after that!

Troubleshooting failed updates is notoriously hard, all the more so because updating Discourse is usually so smooth that most of us self-hosters never have to learn how Discourse works under the hood or pick up troubleshooting steps!

Appreciate you @david for looking into this and finding the fix so quickly!

3 Likes

I updated docker from the mobile app, and then got the dreaded “go update via console” message, which usually signals challenges ahead.

I’ve followed all the manual update steps, and launcher rebuild app fails each time.

I’m able to recover with a launcher start app, so the site is working.

It’s unclear to me if this is related to the postgres errors, or where I might be having issues.

2 Likes

That suggests that it’s not the same issue that’s been discussed in this topic.
For this issue, the rebuild was succeeding without error, but the site was failing to load. ./launcher start did not help.

So I’d suggest opening a fresh Support topic with details about the error you’re seeing.

2 Likes

My site is back online after the rebuild. Thanks! :+1:

4 Likes

This was ultimately a reading comprehension problem on my part. I can confirm the issue was the Postgres database not shutting down properly. Once I followed the right directions, everything was back and working. Thanks all, it’s nice to have a place where, when things go sideways and I panic a bit, there are cooler heads to help.

Thank you!!!

4 Likes