Trying to recover an installation

I installed Discourse on a CentOS machine following the guide here.
In case it helps, we are also using NGINX Proxy Manager.

The installation worked fine for a few weeks, until we had to reboot the machine. After that it never came back up. Here is the output of the rebuild:

Ensuring launcher is up to date
Launcher is up-to-date
2.0.20230409-0052: Pulling from discourse/base
Digest: sha256:dd75ceb9322f79629f8b0bf78cfb0f79ad6bb366b7ead3e1cd32dcb8712ec46f
Status: Image is up to date for discourse/base:2.0.20230409-0052
docker.io/discourse/base:2.0.20230409-0052
/usr/local/lib/ruby/gems/3.2.0/gems/pups-1.1.1/lib/pups.rb
/usr/local/bin/pups --stdin
I, [2023-04-13T09:55:28.002076 #1]  INFO -- : Reading from stdin
I, [2023-04-13T09:55:28.005509 #1]  INFO -- : > locale-gen $LANG && update-locale
I, [2023-04-13T09:55:28.031953 #1]  INFO -- : Generating locales (this might take a while)...
Generation complete.

I, [2023-04-13T09:55:28.032080 #1]  INFO -- : > mkdir -p /shared/postgres_run
I, [2023-04-13T09:55:28.034322 #1]  INFO -- :
I, [2023-04-13T09:55:28.034470 #1]  INFO -- : > chown postgres:postgres /shared/postgres_run
I, [2023-04-13T09:55:28.036190 #1]  INFO -- :
I, [2023-04-13T09:55:28.036297 #1]  INFO -- : > chmod 775 /shared/postgres_run
I, [2023-04-13T09:55:28.037819 #1]  INFO -- :
I, [2023-04-13T09:55:28.037918 #1]  INFO -- : > rm -fr /var/run/postgresql
I, [2023-04-13T09:55:28.039663 #1]  INFO -- :
I, [2023-04-13T09:55:28.039770 #1]  INFO -- : > ln -s /shared/postgres_run /var/run/postgresql
I, [2023-04-13T09:55:28.041327 #1]  INFO -- :
I, [2023-04-13T09:55:28.041457 #1]  INFO -- : > socat /dev/null UNIX-CONNECT:/shared/postgres_run/.s.PGSQL.5432 || exit 0 && echo postgres already running stop container ; exit 1
2023/04/13 09:55:28 socat[19] E connect(6, AF=1 "/shared/postgres_run/.s.PGSQL.5432", 36): No such file or directory
I, [2023-04-13T09:55:28.045222 #1]  INFO -- :
I, [2023-04-13T09:55:28.045329 #1]  INFO -- : > rm -fr /shared/postgres_run/.s*
I, [2023-04-13T09:55:28.047498 #1]  INFO -- :
I, [2023-04-13T09:55:28.047590 #1]  INFO -- : > rm -fr /shared/postgres_run/*.pid
I, [2023-04-13T09:55:28.049691 #1]  INFO -- :
I, [2023-04-13T09:55:28.049815 #1]  INFO -- : > mkdir -p /shared/postgres_run/13-main.pg_stat_tmp
I, [2023-04-13T09:55:28.051658 #1]  INFO -- :
I, [2023-04-13T09:55:28.051800 #1]  INFO -- : > chown postgres:postgres /shared/postgres_run/13-main.pg_stat_tmp
I, [2023-04-13T09:55:28.053477 #1]  INFO -- :
I, [2023-04-13T09:55:28.057301 #1]  INFO -- : File > /etc/service/postgres/run  chmod: +x  chown:
I, [2023-04-13T09:55:28.060969 #1]  INFO -- : File > /etc/service/postgres/log/run  chmod: +x  chown:
I, [2023-04-13T09:55:28.064663 #1]  INFO -- : File > /etc/runit/3.d/99-postgres  chmod: +x  chown:
I, [2023-04-13T09:55:28.068319 #1]  INFO -- : File > /root/upgrade_postgres  chmod: +x  chown:
I, [2023-04-13T09:55:28.068510 #1]  INFO -- : > chown -R root /var/lib/postgresql/13/main
I, [2023-04-13T09:55:37.064277 #1]  INFO -- :
I, [2023-04-13T09:55:37.064575 #1]  INFO -- : > [ ! -e /shared/postgres_data ] && install -d -m 0755 -o
I, [2023-04-13T09:55:37.066210 #1]  INFO -- :
I, [2023-04-13T09:55:37.066255 #1]  INFO -- : > chown -R postgres:postgres /shared/postgres_data
I, [2023-04-13T09:55:37.076706 #1]  INFO -- :
I, [2023-04-13T09:55:37.076786 #1]  INFO -- : > chown -R postgres:postgres /var/run/postgresql
I, [2023-04-13T09:55:37.078711 #1]  INFO -- :
I, [2023-04-13T09:55:37.078822 #1]  INFO -- : > /root/upgrade_postgres
I, [2023-04-13T09:55:37.082200 #1]  INFO -- :
I, [2023-04-13T09:55:37.082297 #1]  INFO -- : > rm /root/upgrade_postgres
I, [2023-04-13T09:55:37.083919 #1]  INFO -- :
I, [2023-04-13T09:55:37.084109 #1]  INFO -- : Replacing data_directory = '/var/lib/postgresql/13/main' w
I, [2023-04-13T09:55:37.084484 #1]  INFO -- : Replacing (?-mix:#?listen_addresses *=.*) with listen_addr
I, [2023-04-13T09:55:37.085049 #1]  INFO -- : Replacing (?-mix:#?synchronous_commit *=.*) with synchrono
I, [2023-04-13T09:55:37.085423 #1]  INFO -- : Replacing (?-mix:#?shared_buffers *=.*) with shared_buffer
I, [2023-04-13T09:55:37.085857 #1]  INFO -- : Replacing (?-mix:#?work_mem *=.*) with work_mem = $db_work
I, [2023-04-13T09:55:37.086303 #1]  INFO -- : Replacing (?-mix:#?default_text_search_config *=.*) with d
I, [2023-04-13T09:55:37.086614 #1]  INFO -- : > install -d -m 0755 -o postgres -g postgres /shared/postg
I, [2023-04-13T09:55:37.089011 #1]  INFO -- :
I, [2023-04-13T09:55:37.089245 #1]  INFO -- : Replacing (?-mix:#?checkpoint_segments *=.*) with checkpoi
I, [2023-04-13T09:55:37.089516 #1]  INFO -- : Replacing (?-mix:#?logging_collector *=.*) with logging_co
I, [2023-04-13T09:55:37.089887 #1]  INFO -- : Replacing (?-mix:#?log_min_duration_statement *=.*) with l
I, [2023-04-13T09:55:37.090322 #1]  INFO -- : Replacing (?-mix:^#local +replication +postgres +peer$) wi
I, [2023-04-13T09:55:37.090509 #1]  INFO -- : Replacing (?-mix:^host.*all.*all.*127.*$) with host all al
I, [2023-04-13T09:55:37.090946 #1]  INFO -- : Replacing (?-mix:^host.*all.*all.*::1\/128.*$) with host a
I, [2023-04-13T09:55:37.091279 #1]  INFO -- : > HOME=/var/lib/postgresql USER=postgres exec chpst -u pos
I, [2023-04-13T09:55:37.092562 #1]  INFO -- : > sleep 5
2023-04-13 09:55:37.239 UTC [42] LOG:  starting PostgreSQL 13.10 (Debian 13.10-1.pgdg110+1) on x86_64-pc
2023-04-13 09:55:37.240 UTC [42] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2023-04-13 09:55:37.240 UTC [42] LOG:  listening on IPv6 address "::", port 5432
2023-04-13 09:55:37.287 UTC [42] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2023-04-13 09:55:37.338 UTC [45] LOG:  database system was interrupted; last known up at 2023-04-13 08:3
2023-04-13 09:55:37.795 UTC [45] LOG:  invalid resource manager ID in primary checkpoint record
2023-04-13 09:55:37.795 UTC [45] PANIC:  could not locate a valid checkpoint record
2023-04-13 09:55:38.256 UTC [42] LOG:  startup process (PID 45) was terminated by signal 6: Aborted
2023-04-13 09:55:38.256 UTC [42] LOG:  aborting startup due to startup process failure
2023-04-13 09:55:38.283 UTC [42] LOG:  database system is shut down
I, [2023-04-13T09:55:42.094474 #1]  INFO -- :
I, [2023-04-13T09:55:42.094602 #1]  INFO -- : > su postgres -c 'createdb discourse' || true
createdb: error: could not connect to database template1: connection to server on socket "/var/run/postg
        Is the server running locally and accepting connections on that socket?
I, [2023-04-13T09:55:42.130700 #1]  INFO -- :
I, [2023-04-13T09:55:42.130828 #1]  INFO -- : > su postgres -c 'psql discourse -c "create user discourse
psql: error: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: No such file or directory
        Is the server running locally and accepting connections on that socket?
I, [2023-04-13T09:55:42.166116 #1]  INFO -- :
I, [2023-04-13T09:55:42.166242 #1]  INFO -- : > su postgres -c 'psql discourse -c "grant all privileges on database discourse to discourse;"' || true
psql: error: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: No such file or directory
        Is the server running locally and accepting connections on that socket?
I, [2023-04-13T09:55:42.201461 #1]  INFO -- :
I, [2023-04-13T09:55:42.201617 #1]  INFO -- : > su postgres -c 'psql discourse -c "alter schema public owner to discourse;"'
psql: error: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: No such file or directory
        Is the server running locally and accepting connections on that socket?
I, [2023-04-13T09:55:42.236768 #1]  INFO -- :
I, [2023-04-13T09:55:42.236996 #1]  INFO -- : > Terminating async processes

FAILED
--------------------
Pups::ExecError: su postgres -c 'psql discourse -c "alter schema public owner to discourse;"' failed with return #<Process::Status: pid 55 exit 2>
Location of failure: /usr/local/lib/ruby/gems/3.2.0/gems/pups-1.1.1/lib/pups/exec_command.rb:117:in `spawn'
exec failed with the params "su postgres -c 'psql $db_name -c \\\"alter schema public owner to $db_user;\\\"'"
bootstrap failed with exit code 2
** FAILED TO BOOTSTRAP ** please scroll up and look for earlier error messages, there may be more than one.
./discourse-doctor may help diagnose the problem.
cd04fbb38f1ef61e418d680b969c2c056439e3b1dbfe64608524a8d3361cd91c

discourse-doctor fails to find the running docker application:

==================== SERIOUS PROBLEM!!!! ====================
app not running!
Attempting to rebuild

It then attempts a rebuild, so it basically ends with the same log as above.

Thanks in advance!


PANIC: could not locate a valid checkpoint record

I think your database is corrupted. In my opinion, your best bet is to restore from a backup. I believe Discourse creates backups by default? Look in discourse/shared/app/backups or something like that.
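If it helps, here is a quick way to check the usual locations. The path depends on the container name (often "standalone" or "app"), so treat the glob as a guess:

```shell
# Backups from a standard Discourse Docker install usually live under
# /var/discourse/shared/<container>/backups/default. List whatever is there.
found=""
for dir in /var/discourse/shared/*/backups/default; do
  [ -d "$dir" ] || continue
  found=yes
  echo "== $dir =="
  ls -lt "$dir"        # newest backup archives first
done
[ -n "$found" ] || echo "no backup directory found under /var/discourse/shared"
```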


If you don't have a Discourse backup, then this becomes more of a PostgreSQL support topic than a Discourse one. I could try to help, I guess, but I'm no expert.

You might need to use this:
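For context, the usual last-resort tool for this exact PANIC is PostgreSQL's pg_resetwal, which throws away write-ahead log state so the server can start; anything in those WAL records is lost for good. A very rough, destructive sketch - the paths are my assumptions for the stock Discourse image (PostgreSQL 13), and it should only ever be run on a copy of the data directory, from inside the container:

```shell
# DESTRUCTIVE: discards WAL so PostgreSQL can start; work on a COPY only.
# Paths are assumptions for the stock Discourse image (PostgreSQL 13).
PGDATA=/shared/postgres_data
RESETWAL=/usr/lib/postgresql/13/bin/pg_resetwal
if [ -x "$RESETWAL" ] && [ -d "$PGDATA" ]; then
  su postgres -c "$RESETWAL $PGDATA"
else
  echo "not inside the Discourse container; nothing done"
fi
```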


Looks that way to me too. I hope you can find a solution @Norike … meanwhile I find this rather worrying - I'd expect a software-initiated reboot to leave the database in a clean state. Was it a forced reboot, i.e. a power cycle?

I wonder whether it makes a difference which filesystem your database is stored on. What does df -T say?

Thanks!

I did in fact find some backups in /var/discourse/shared/standalone/backups/default, which looks promising.
However, according to the Restore a backup from the command line guide, the container needs to be running in order to import the database, but ./launcher enter app fails:

x86_64 arch detected.
WARNING: containers/app.yml file is world-readable. You can secure this file by running: chmod o-rwx containers/app.yml
Error response from daemon: No such container: app

I suppose I should start over with a clean install. Is there a guide for wiping everything and starting from scratch? Would the following be correct?

  • Copy the backups from /var/discourse/shared/standalone/backups/default to a different location outside /var/discourse
  • Remove the /var/discourse folder
  • Install Discourse from scratch following the Cloud Installation guide to get a fresh instance running
  • Restore the backup via the GUI or via the command-line guide above
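In shell terms, I imagine the first step looking something like this (untested; DEST is just an example location, and I'd verify the copy before deleting anything):

```shell
# Untested sketch: copy the backups somewhere safe before wiping.
SRC=/var/discourse/shared/standalone/backups/default
DEST=/root/discourse-backups      # example destination outside /var/discourse
if [ -d "$SRC" ]; then
  mkdir -p "$DEST"
  cp -a "$SRC"/. "$DEST"/
  ls -l "$DEST"                   # confirm the .tar.gz files made it
  # only after confirming the copy:
  # rm -rf /var/discourse
  # ...then reinstall via the Cloud Installation guide and restore
else
  echo "backup directory $SRC not found"
fi
```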

It was a software reboot after applying some updates.

Here it is:

Filesystem     Type     1K-blocks     Used Available Use% Mounted on
devtmpfs       devtmpfs      4096        0      4096   0% /dev
tmpfs          tmpfs      3869940        0   3869940   0% /dev/shm
tmpfs          tmpfs      1547980    17600   1530380   2% /run
/dev/mapper/cs-root xfs       73364480 16750976  56613504  23% /
/dev/sda2      xfs        1038336   395812    642524  39% /boot
/dev/mapper/cs-home xfs      893122476 10838052 882284424   2% /home
/dev/sda1      vfat        613160     7644    605516   2% /boot/efi
overlay        overlay   73364480 16750976  56613504  23% /var/lib/docker/overlay2/a9c2622d4167f08ff4697a4c49febf55a7e460e087c41c05e6c1bbd321b13f62/merged
overlay        overlay   73364480 16750976  56613504  23% /var/lib/docker/overlay2/765273939fa75096c994588ba9c9bac6b5a9b909a60a962169b83f7c8a213b7f/merged
tmpfs          tmpfs       773988        4    773984   1% /run/user/1000

Thanks, I'm not familiar with XFS but a quick search suggests it should be safe to use. Perhaps xfs_info / will be usefully informative.

ok, so copy this entire directory somewhere safe:

/var/discourse/shared

that way you can try restoring from the backups, or even try to repair the corrupted database if needed.

then delete your docker containers, do a docker image prune -a, delete /var/discourse and reinstall discourse. then copy over the most recent backup file and try to restore.
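Roughly, something like this - written as a dry run by default, so nothing happens until you set CONFIRM=yes, and the destination path is just an example:

```shell
# Dry-run sketch of the steps above; nothing runs until CONFIRM=yes.
if [ "${CONFIRM:-no}" = yes ] && [ -d /var/discourse ]; then
  cp -a /var/discourse/shared /root/discourse-shared-copy   # keep everything safe
  (cd /var/discourse && ./launcher destroy app)             # remove the container
  docker image prune -a
  rm -rf /var/discourse
  # reinstall per the standard install guide, copy the newest backup into the
  # new shared/standalone/backups/default, then inside ./launcher enter app:
  #   discourse enable_restore
  #   discourse restore <backup-file>.tar.gz
  ran=yes
else
  echo "dry run: set CONFIRM=yes (and re-read each step) to proceed"
  ran=no
fi
```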


That did it. Thanks!


This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.