Backup failing due to PG/SQL errors

Continuing the discussion from Backup to S3 command?:

After trying to run the backup via ./launcher enter, I found what appears to be the reason the backups stopped working.

pg_dump: Dumping the contents of table "topic_links" failed: PQgetResult() failed.
pg_dump: Error message from server: ERROR:  invalid memory alloc request size 18446744073709551613
pg_dump: The command was: COPY public.topic_links (id, topic_id, post_id, user_id, url, domain, internal, link_topic_id, created_at, updated_at, reflection, clicks, link_post_id, title, crawled_at, quote, extension) TO stdout;
EXCEPTION: pg_dump failed
/var/www/discourse/lib/backup_restore/backuper.rb:152:in `dump_public_schema'
/var/www/discourse/lib/backup_restore/backuper.rb:36:in `run'
script/discourse:80:in `backup'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/thor-1.0.1/lib/thor/command.rb:27:in `run'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/thor-1.0.1/lib/thor/invocation.rb:127:in `invoke_command'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/thor-1.0.1/lib/thor.rb:392:in `dispatch'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/thor-1.0.1/lib/thor/base.rb:485:in `start'
script/discourse:284:in `<top (required)>'
/usr/local/lib/ruby/gems/2.7.0/gems/bundler-2.1.4/lib/bundler/cli/exec.rb:63:in `load'
/usr/local/lib/ruby/gems/2.7.0/gems/bundler-2.1.4/lib/bundler/cli/exec.rb:63:in `kernel_load'
/usr/local/lib/ruby/gems/2.7.0/gems/bundler-2.1.4/lib/bundler/cli/exec.rb:28:in `run'
/usr/local/lib/ruby/gems/2.7.0/gems/bundler-2.1.4/lib/bundler/cli.rb:476:in `exec'
/usr/local/lib/ruby/gems/2.7.0/gems/bundler-2.1.4/lib/bundler/vendor/thor/lib/thor/command.rb:27:in `run'
/usr/local/lib/ruby/gems/2.7.0/gems/bundler-2.1.4/lib/bundler/vendor/thor/lib/thor/invocation.rb:127:in `invoke_command'
/usr/local/lib/ruby/gems/2.7.0/gems/bundler-2.1.4/lib/bundler/vendor/thor/lib/thor.rb:399:in `dispatch'
/usr/local/lib/ruby/gems/2.7.0/gems/bundler-2.1.4/lib/bundler/cli.rb:30:in `dispatch'
/usr/local/lib/ruby/gems/2.7.0/gems/bundler-2.1.4/lib/bundler/vendor/thor/lib/thor/base.rb:476:in `start'
/usr/local/lib/ruby/gems/2.7.0/gems/bundler-2.1.4/lib/bundler/cli.rb:24:in `start'
/usr/local/lib/ruby/gems/2.7.0/gems/bundler-2.1.4/exe/bundle:46:in `block in <top (required)>'
/usr/local/lib/ruby/gems/2.7.0/gems/bundler-2.1.4/lib/bundler/friendly_errors.rb:123:in `with_friendly_errors'
/usr/local/lib/ruby/gems/2.7.0/gems/bundler-2.1.4/exe/bundle:34:in `<top (required)>'
/usr/local/bin/bundle:23:in `load'
/usr/local/bin/bundle:23:in `<main>'
Deleting old backups...
Cleaning stuff up...
Removing '.tar' leftovers...
Marking backup as finished...
Refreshing disk stats...
Notifying 'system' of the end of the backup...
Finished!
[FAILED]
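
A note on the size in that error: 18446744073709551613 is 2^64 - 3, which is -3 when the same bits are read as a signed 64-bit integer. A negative length like this is often a hint that a stored length field in the table is damaged rather than that the server is short of memory, but that is an interpretation and not something the log states. A minimal sketch of the arithmetic (the helper name `as_signed_64` is made up for illustration):

```python
# Reinterpret the size reported by PostgreSQL as a signed 64-bit integer.
reported = 18446744073709551613  # value copied from the pg_dump error above


def as_signed_64(value: int) -> int:
    """Return the two's-complement signed reading of an unsigned 64-bit value."""
    if not 0 <= value < 2**64:
        raise ValueError("value does not fit in 64 bits")
    return value - 2**64 if value >= 2**63 else value


print(as_signed_64(reported))  # -3
```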

This is particularly frustrating because apparently I can't use the old backups either; when I try to restore them, I get this error.

[2020-12-12 01:53:25] COPY 750
[2020-12-12 01:53:30] ERROR: null value in column "user_id" of relation "topic_users" violates not-null constraint
[2020-12-12 01:53:30] DETAIL: Failing row contains (null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null).
[2020-12-12 01:53:30] CONTEXT: COPY topic_users, line 623983: "\N \N \N \N \N \N \N \N \N \N \N \N \N \N \N \N"
[2020-12-12 01:53:30] EXCEPTION: psql failed: CONTEXT: COPY topic_users, line 623983: "\N \N \N \N \N \N \N \N \N \N \N \N \N \N \N \N"
[2020-12-12 01:53:30] /var/www/discourse/lib/backup_restore/database_restorer.rb:87:in `restore_dump'
/var/www/discourse/lib/backup_restore/database_restorer.rb:26:in `restore'
/var/www/discourse/lib/backup_restore/restorer.rb:51:in `run'
/var/www/discourse/script/spawn_backup_restore.rb:23:in `restore'
/var/www/discourse/script/spawn_backup_restore.rb:36:in `block in <main>'
/var/www/discourse/script/spawn_backup_restore.rb:4:in `fork'
/var/www/discourse/script/spawn_backup_restore.rb:4:in `<main>'
[2020-12-12 01:53:30] Trying to rollback...
[2020-12-12 01:53:30] Rolling back...
[2020-12-12 01:53:30] Cleaning stuff up...
[2020-12-12 01:53:30] Dropping functions from the discourse_functions schema...
[2020-12-12 01:53:30] Removing tmp '/var/www/discourse/tmp/restores/default/2020-12-12-014753' directory...
[2020-12-12 01:53:30] Unpausing sidekiq...
[2020-12-12 01:53:30] Marking restore as finished...

I don't suppose there is any maintenance I can run to get it back into a state where it can be backed up or transferred?

Something does seem to be wrong.

Is this a standard install? Which version of PostgreSQL is it?

You mentioned there was some problem with this server and that's why you're trying to move off it?

The current install is: 2.7.0.beta1

The server has a complicated history (it's about five years old): it started as a standard self-hosted install, was then moved to Discourse hosting, and was finally moved back. We've tried to keep it as standard as possible.

I'm fairly sure it's the latest version.

The server has had intermittent outages, lockups, etc., which affected performance and may have had an impact on the database. The last time I ran a database cleanup, I removed a lot of data from the PostgreSQL file.

Currently the backup file I have access to is 1.5 GB, so it's too large to edit with any software available on my PC at the moment.
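
A file that size does not have to be opened in an editor; it can be streamed line by line. The sketch below assumes the SQL dump has already been pulled out of the backup archive as a gzip-compressed plain-text pg_dump file, and that COPY data rows are tab-separated with `\N` for NULL. The function names and file names are hypothetical. It drops only the rows in the `topic_users` COPY block where every field is NULL, like the one the restore log complains about; it does not recover whatever data those rows originally held.

```python
import gzip
from typing import Iterable, Iterator


def drop_all_null_rows(lines: Iterable[str], table: str = "topic_users") -> Iterator[str]:
    """Yield dump lines, skipping all-NULL data rows in the COPY block for `table`."""
    in_block = False
    for line in lines:
        stripped = line.rstrip("\n")
        if not in_block:
            # pg_dump starts a data block with: COPY schema.table (cols) FROM stdin;
            if stripped.startswith("COPY ") and stripped.endswith("FROM stdin;"):
                name = stripped.split()[1].split(".")[-1].strip('"')
                in_block = name == table
            yield line
            continue
        if stripped == "\\.":
            # A lone "\." terminates the COPY data block.
            in_block = False
            yield line
            continue
        if all(field == "\\N" for field in stripped.split("\t")):
            continue  # every column is NULL: skip this row
        yield line


def clean_dump(src: str, dst: str) -> None:
    """Stream a gzip-compressed dump from src to dst without loading it into memory."""
    with gzip.open(src, "rt", encoding="utf-8", newline="") as fin, \
            gzip.open(dst, "wt", encoding="utf-8", newline="") as fout:
        fout.writelines(drop_all_null_rows(fin))
```

Something like `clean_dump("dump.sql.gz", "dump.cleaned.sql.gz")` would then write a filtered copy next to the original. Whether discarding those rows is acceptable is a judgment call; they point at damage in the source database that this does nothing to fix.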

There are other issues I'm aware of, such as the image migration to S3 failing, but previously there had been no problems with backups.

I think I would copy the whole /var/discourse directory to a new server to get away from whatever is wrong with that server, and then try to put things right.

You might have a corrupt index, but I'm on my phone and can't make much sense of the errors.

Just tried that - took some fussing about.

New system does not like it, pretty much just outright rejects the database.

Launcher is up-to-date
Stopping old container
+ /usr/bin/docker stop -t 60 app
app
cd /pups && git pull && /pups/bin/pups --stdin
Already up to date.
I, [2020-12-13T09:23:39.291334 #1]  INFO -- : Loading --stdin
I, [2020-12-13T09:23:39.296303 #1]  INFO -- : > DEBIAN_FRONTEND=noninteractive apt-get purge -y postgresql-13 postgresql-client-13 postgresql-contrib-13
I, [2020-12-13T09:23:41.511661 #1]  INFO -- : Reading package lists...
Building dependency tree...
Reading state information...
The following packages were automatically installed and are no longer required:
  libllvm7 pgdg-keyring postgresql-client-common postgresql-common ssl-cert
Use 'apt autoremove' to remove them.
The following packages will be REMOVED:
  postgresql-13* postgresql-client-13*
0 upgraded, 0 newly installed, 2 to remove and 0 not upgraded.
After this operation, 54.3 MB disk space will be freed.
(Reading database ... 43863 files and directories currently installed.)
Removing postgresql-13 (13.1-1.pgdg100+1) ...
invoke-rc.d: could not determine current runlevel
invoke-rc.d: policy-rc.d denied execution of stop.
Removing postgresql-client-13 (13.1-1.pgdg100+1) ...
Processing triggers for postgresql-common (223.pgdg100+1) ...
Building PostgreSQL dictionaries from installed myspell/hunspell packages...
Removing obsolete dictionary files:
(Reading database ... 42050 files and directories currently installed.)
Purging configuration files for postgresql-13 (13.1-1.pgdg100+1) ...
Dropping cluster main...

I, [2020-12-13T09:23:41.511861 #1]  INFO -- : > apt-get update && apt-get install -y postgresql-10 postgresql-client-10 postgresql-contrib-10
debconf: delaying package configuration, since apt-utils is not installed
I, [2020-12-13T09:23:51.192217 #1]  INFO -- : Hit:1 http://deb.debian.org/debian buster InRelease
Get:2 http://deb.debian.org/debian buster-updates InRelease [51.9 kB]
Get:3 http://security.debian.org/debian-security buster/updates InRelease [65.4 kB]
Get:4 http://apt.postgresql.org/pub/repos/apt buster-pgdg InRelease [104 kB]
Hit:5 https://deb.nodesource.com/node_10.x buster InRelease
Get:6 http://security.debian.org/debian-security buster/updates/main amd64 Packages [254 kB]
Get:7 http://apt.postgresql.org/pub/repos/apt buster-pgdg/main amd64 Packages [216 kB]
Fetched 690 kB in 1s (525 kB/s)
Reading package lists...
Reading package lists...
Building dependency tree...
Reading state information...
The following package was automatically installed and is no longer required:
  libllvm7
Use 'apt autoremove' to remove it.
Suggested packages:
  postgresql-doc-10
The following NEW packages will be installed:
  postgresql-10 postgresql-client-10
0 upgraded, 2 newly installed, 0 to remove and 5 not upgraded.
Need to get 6,402 kB of archives.
After this operation, 30.6 MB of additional disk space will be used.
Get:1 http://apt.postgresql.org/pub/repos/apt buster-pgdg/main amd64 postgresql-client-10 amd64 10.15-1.pgdg100+1 [1,436 kB]
Get:2 http://apt.postgresql.org/pub/repos/apt buster-pgdg/main amd64 postgresql-10 amd64 10.15-1.pgdg100+1 [4,966 kB]
Fetched 6,402 kB in 2s (2,809 kB/s)
Selecting previously unselected package postgresql-client-10.
(Reading database ... 42050 files and directories currently installed.)
Preparing to unpack .../postgresql-client-10_10.15-1.pgdg100+1_amd64.deb ...
Unpacking postgresql-client-10 (10.15-1.pgdg100+1) ...
Selecting previously unselected package postgresql-10.
Preparing to unpack .../postgresql-10_10.15-1.pgdg100+1_amd64.deb ...
Unpacking postgresql-10 (10.15-1.pgdg100+1) ...
Setting up postgresql-client-10 (10.15-1.pgdg100+1) ...
update-alternatives: using /usr/share/postgresql/10/man/man1/psql.1.gz to provide /usr/share/man/man1/psql.1.gz (psql.1.gz) in auto mode
Setting up postgresql-10 (10.15-1.pgdg100+1) ...
Creating new PostgreSQL cluster 10/main ...
/usr/lib/postgresql/10/bin/initdb -D /var/lib/postgresql/10/main --auth-local peer --auth-host md5
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "C.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /var/lib/postgresql/10/main ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default timezone ... Etc/UTC
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

Success. You can now start the database server using:

    pg_ctlcluster 10 main start

Ver Cluster Port Status Owner    Data directory              Log file
10  main    5432 down   postgres /var/lib/postgresql/10/main /var/log/postgresql/postgresql-10-main.log
update-alternatives: using /usr/share/postgresql/10/man/man1/postmaster.1.gz to provide /usr/share/man/man1/postmaster.1.gz (postmaster.1.gz) in auto mode
invoke-rc.d: could not determine current runlevel
invoke-rc.d: policy-rc.d denied execution of start.
Processing triggers for postgresql-common (223.pgdg100+1) ...
Building PostgreSQL dictionaries from installed myspell/hunspell packages...
Removing obsolete dictionary files:

I, [2020-12-13T09:23:51.192964 #1]  INFO -- : > mkdir -p /shared/postgres_run
I, [2020-12-13T09:23:51.195917 #1]  INFO -- :
I, [2020-12-13T09:23:51.196235 #1]  INFO -- : > chown postgres:postgres /shared/postgres_run
I, [2020-12-13T09:23:51.198835 #1]  INFO -- :
I, [2020-12-13T09:23:51.199139 #1]  INFO -- : > chmod 775 /shared/postgres_run
I, [2020-12-13T09:23:51.201681 #1]  INFO -- :
I, [2020-12-13T09:23:51.202025 #1]  INFO -- : > rm -fr /var/run/postgresql
I, [2020-12-13T09:23:51.204199 #1]  INFO -- :
I, [2020-12-13T09:23:51.204549 #1]  INFO -- : > ln -s /shared/postgres_run /var/run/postgresql
I, [2020-12-13T09:23:51.207718 #1]  INFO -- :
I, [2020-12-13T09:23:51.208017 #1]  INFO -- : > socat /dev/null UNIX-CONNECT:/shared/postgres_run/.s.PGSQL.5432 || exit 0 && echo postgres already running stop container ; exit 1
2020/12/13 09:23:51 socat[1567] E connect(6, AF=1 "/shared/postgres_run/.s.PGSQL.5432", 36): No such file or directory
I, [2020-12-13T09:23:51.217014 #1]  INFO -- :
I, [2020-12-13T09:23:51.217294 #1]  INFO -- : > rm -fr /shared/postgres_run/.s*
I, [2020-12-13T09:23:51.220400 #1]  INFO -- :
I, [2020-12-13T09:23:51.220682 #1]  INFO -- : > rm -fr /shared/postgres_run/*.pid
I, [2020-12-13T09:23:51.223488 #1]  INFO -- :
I, [2020-12-13T09:23:51.223691 #1]  INFO -- : > mkdir -p /shared/postgres_run/10-main.pg_stat_tmp
I, [2020-12-13T09:23:51.225967 #1]  INFO -- :
I, [2020-12-13T09:23:51.226198 #1]  INFO -- : > chown postgres:postgres /shared/postgres_run/10-main.pg_stat_tmp
I, [2020-12-13T09:23:51.228306 #1]  INFO -- :
I, [2020-12-13T09:23:51.233016 #1]  INFO -- : File > /etc/service/postgres/run  chmod: +x  chown:
I, [2020-12-13T09:23:51.237345 #1]  INFO -- : File > /etc/runit/3.d/99-postgres  chmod: +x  chown:
I, [2020-12-13T09:23:51.237662 #1]  INFO -- : > chown -R root /var/lib/postgresql/10/main
I, [2020-12-13T09:23:51.244979 #1]  INFO -- :
I, [2020-12-13T09:23:51.245164 #1]  INFO -- : > [ ! -e /shared/postgres_data ] && install -d -m 0755 -o postgres -g postgres /shared/postgres_data && sudo -E -u postgres /usr/lib/postgresql/10/bin/initdb -D /shared/postgres_data || exit 0
I, [2020-12-13T09:23:51.246982 #1]  INFO -- :
I, [2020-12-13T09:23:51.247152 #1]  INFO -- : > chown -R postgres:postgres /shared/postgres_data
I, [2020-12-13T09:23:51.314470 #1]  INFO -- :
I, [2020-12-13T09:23:51.314888 #1]  INFO -- : > chown -R postgres:postgres /var/run/postgresql
I, [2020-12-13T09:23:51.318075 #1]  INFO -- :
I, [2020-12-13T09:23:51.318499 #1]  INFO -- : Replacing data_directory = '/var/lib/postgresql/10/main' with data_directory = '/shared/postgres_data' in /etc/postgresql/10/main/postgresql.conf
I, [2020-12-13T09:23:51.319171 #1]  INFO -- : Replacing (?-mix:#?listen_addresses *=.*) with listen_addresses = '*' in /etc/postgresql/10/main/postgresql.conf
I, [2020-12-13T09:23:51.319652 #1]  INFO -- : Replacing (?-mix:#?synchronous_commit *=.*) with synchronous_commit = $db_synchronous_commit in /etc/postgresql/10/main/postgresql.conf
I, [2020-12-13T09:23:51.320131 #1]  INFO -- : Replacing (?-mix:#?shared_buffers *=.*) with shared_buffers = $db_shared_buffers in /etc/postgresql/10/main/postgresql.conf
I, [2020-12-13T09:23:51.320672 #1]  INFO -- : Replacing (?-mix:#?work_mem *=.*) with work_mem = $db_work_mem in /etc/postgresql/10/main/postgresql.conf
I, [2020-12-13T09:23:51.321143 #1]  INFO -- : Replacing (?-mix:#?default_text_search_config *=.*) with default_text_search_config = '$db_default_text_search_config' in /etc/postgresql/10/main/postgresql.conf
I, [2020-12-13T09:23:51.321608 #1]  INFO -- : > install -d -m 0755 -o postgres -g postgres /shared/postgres_backup
I, [2020-12-13T09:23:51.324709 #1]  INFO -- :
I, [2020-12-13T09:23:51.325108 #1]  INFO -- : Replacing (?-mix:#?checkpoint_segments *=.*) with checkpoint_segments = $db_checkpoint_segments in /etc/postgresql/10/main/postgresql.conf
I, [2020-12-13T09:23:51.325597 #1]  INFO -- : Replacing (?-mix:#?logging_collector *=.*) with logging_collector = $db_logging_collector in /etc/postgresql/10/main/postgresql.conf
I, [2020-12-13T09:23:51.326097 #1]  INFO -- : Replacing (?-mix:#?log_min_duration_statement *=.*) with log_min_duration_statement = $db_log_min_duration_statement in /etc/postgresql/10/main/postgresql.conf
I, [2020-12-13T09:23:51.326619 #1]  INFO -- : Replacing (?-mix:^#local +replication +postgres +peer$) with local replication postgres  peer in /etc/postgresql/10/main/pg_hba.conf
I, [2020-12-13T09:23:51.327039 #1]  INFO -- : Replacing (?-mix:^host.*all.*all.*127.*$) with host all all 0.0.0.0/0 md5 in /etc/postgresql/10/main/pg_hba.conf
I, [2020-12-13T09:23:51.327456 #1]  INFO -- : > HOME=/var/lib/postgresql USER=postgres exec chpst -u postgres:postgres:ssl-cert -U postgres:postgres:ssl-cert /usr/lib/postgresql/10/bin/postmaster -D /etc/postgresql/10/main
I, [2020-12-13T09:23:51.329156 #1]  INFO -- : > sleep 5
2020-12-13 09:23:51.347 UTC [1583] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2020-12-13 09:23:51.347 UTC [1583] LOG:  listening on IPv6 address "::", port 5432
2020-12-13 09:23:51.349 UTC [1583] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2020-12-13 09:23:51.363 UTC [1583] FATAL:  database files are incompatible with server
2020-12-13 09:23:51.363 UTC [1583] DETAIL:  The database cluster was initialized with PG_CONTROL_VERSION 1300, but the server was compiled with PG_CONTROL_VERSION 1002.
2020-12-13 09:23:51.363 UTC [1583] HINT:  It looks like you need to initdb.
2020-12-13 09:23:51.365 UTC [1583] LOG:  database system is shut down
I, [2020-12-13T09:23:56.331811 #1]  INFO -- :
I, [2020-12-13T09:23:56.332043 #1]  INFO -- : > su postgres -c 'createdb discourse' || true
createdb: could not connect to database template1: could not connect to server: No such file or directory
        Is the server running locally and accepting
        connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
I, [2020-12-13T09:23:56.394383 #1]  INFO -- :
I, [2020-12-13T09:23:56.394680 #1]  INFO -- : > su postgres -c 'psql discourse -c "create user discourse;"' || true
psql: could not connect to server: No such file or directory
        Is the server running locally and accepting
        connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
I, [2020-12-13T09:23:56.454155 #1]  INFO -- :
I, [2020-12-13T09:23:56.454333 #1]  INFO -- : > su postgres -c 'psql discourse -c "grant all privileges on database discourse to discourse;"' || true
psql: could not connect to server: No such file or directory
        Is the server running locally and accepting
        connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
I, [2020-12-13T09:23:56.508933 #1]  INFO -- :
I, [2020-12-13T09:23:56.509118 #1]  INFO -- : > su postgres -c 'psql discourse -c "alter schema public owner to discourse;"'
psql: could not connect to server: No such file or directory
        Is the server running locally and accepting
        connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
I, [2020-12-13T09:23:56.560843 #1]  INFO -- :
I, [2020-12-13T09:23:56.561176 #1]  INFO -- : Terminating async processes


FAILED
--------------------
Pups::ExecError: su postgres -c 'psql discourse -c "alter schema public owner to discourse;"' failed with return #<Process::Status: pid 1609 exit 2>
Location of failure: /pups/lib/pups/exec_command.rb:112:in `spawn'
exec failed with the params "su postgres -c 'psql $db_name -c \"alter schema public owner to $db_user;\"'"
da620ae9048b2cda99c7a0d24e38c9dfafba5d61fac8c64c2da2362a19338a76
** FAILED TO BOOTSTRAP ** please scroll up and look for earlier error messages, there may be more than one.
./discourse-doctor may help diagnose the problem.

This bit seems a concern to me:

Using Discourse Doctor on both doesn’t seem to have any solutions.

It's trying to upgrade PostgreSQL. Much of that is expected.

Perhaps try it again and switch to the PG 10 template as described in PostgreSQL 13 update
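
The FATAL line in the bootstrap log above says the cluster was initialized with PG_CONTROL_VERSION 1300 while the server was compiled with 1002, which is consistent with a data directory created by PostgreSQL 13 being opened by the PostgreSQL 10 binaries the template had just installed. Before rebuilding, the major version a data directory was created with can be read from its `PG_VERSION` file. A minimal sketch; the function names are made up, and the host path mentioned afterwards is an assumption about a standard install, not something taken from this thread:

```python
from pathlib import Path


def data_dir_major_version(data_dir: str) -> str:
    """Return the major version recorded in the PG_VERSION file of a data directory."""
    return (Path(data_dir) / "PG_VERSION").read_text().strip()


def matches_template(data_dir: str, template_major: str) -> bool:
    """True if the data directory was created by the major version the template installs."""
    return data_dir_major_version(data_dir) == template_major
```

For example, `matches_template("/var/discourse/shared/standalone/postgres_data", "10")` (assumed host path) returning False would mean the PG 10 template cannot start on that data as it stands.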

The line to prevent the upgrade is still in the YML file.

I can get into the app and SQL on the old server, but when I try on the new one I get:

psql: error: could not connect to server: No such file or directory
        Is the server running locally and accepting
        connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?

The line above seems to be the main problem, @wincenworks.

I can't tell you what to do, but if I were you, @wincenworks, I would install Discourse from scratch, and before building the container I would set my template(s) to use PG10.

Then, once you have the new instance up and running, you can try restoring your Discourse instance from your current PG10 backup, from the command line (not the UI) inside your container.

Hope this helps.

I'm trying to get a fresh instance with PG10 running right now, but it keeps timing out at the last stage of the registration process.

I'd love to do that, but:

  1. It won't let me create a new backup
  2. The previous backups won't restore

That's why I'm trying to transfer them via scp.

Yes, it won't restore on your current setup, as I understand it.

Have you tried using that backup to restore after a completely fresh install, as I mentioned?

First, install Discourse from scratch and make sure it is 100% working with a PG10 install template.

Then take your latest backup and restore it (from the command line, not the UI).