Queda total do fórum (Pressão de teste)

Meu fórum travou quando eu estava atualizando o software e agora desapareceu completamente. Recebo as seguintes mensagens de erro…

" Oops
O software que alimenta este fórum de discussão encontrou um problema inesperado. Pedimos desculpas pelo inconveniente.

Informações detalhadas sobre o erro foram registradas e uma notificação automática foi gerada. Vamos dar uma olhada.

Nenhuma ação adicional é necessária. No entanto, se a condição de erro persistir, você pode fornecer detalhes adicionais, incluindo passos para reproduzir o erro, postando um tópico de discussão na categoria de feedback do site."

O fórum que administro está em https://forum.testpressing.org/

Qualquer ajuda seria apreciada.

Eu também acabei de fazer o upgrade e meu container não está compilando e depois está travando. É muito lamentável.

Não reinicie a reconstrução quando ela travar, pois isso travou tudo quando eu fiz…

mesmo comportamento aqui. Eu estava vendo erros com os workers, reiniciei… travou… tentei reconstruir. Travamentos perpétuos agora

/var/discourse# ./launcher rebuild app
x86_64 arch detected.
WARNING: containers/app.yml file is world-readable. You can secure this file by running: chmod o-rwx containers/app.yml
Ensuring launcher is up to date
Fetching origin
Launcher is up-to-date
2.0.20240825-0027: Pulling from discourse/base
Digest: sha256:6de68cb49198b5281f79ed9401b3fe818c854d220dcf0238549fe2f2adb19146
Status: Image is up to date for discourse/base:2.0.20240825-0027
/usr/local/lib/ruby/gems/3.3.0/gems/pups-1.2.1/lib/pups.rb
/usr/local/bin/pups --stdin
I, [2024-08-27T21:43:42.091270 #1]  INFO -- : Reading from stdin
I, [2024-08-27T21:43:42.110405 #1]  INFO -- : File > /etc/service/postgres/run  chmod: +x  chown:
I, [2024-08-27T21:43:42.117678 #1]  INFO -- : File > /etc/service/postgres/log/run  chmod: +x  chown:
I, [2024-08-27T21:43:42.125472 #1]  INFO -- : File > /etc/runit/3.d/99-postgres  chmod: +x  chown:
I, [2024-08-27T21:43:42.132700 #1]  INFO -- : File > /root/install_postgres  chmod: +x  chown:
I, [2024-08-27T21:43:42.139622 #1]  INFO -- : File > /root/upgrade_postgres  chmod: +x  chown:
I, [2024-08-27T21:43:42.140454 #1]  INFO -- : Replacing data_directory = '/var/lib/postgresql/13/main' with data_directory = '/shared/postgres_data' in /etc/postgresql/13/main/postgresql.conf
I, [2024-08-27T21:43:42.141762 #1]  INFO -- : Replacing (?-mix:#?listen_addresses *=.*) with listen_addresses = '*' in /etc/postgresql/13/main/postgresql.conf
I, [2024-08-27T21:43:42.142675 #1]  INFO -- : Replacing (?-mix:#?synchronous_commit *=.*) with synchronous_commit = $db_synchronous_commit in /etc/postgresql/13/main/postgresql.conf
I, [2024-08-27T21:43:42.143534 #1]  INFO -- : Replacing (?-mix:#?shared_buffers *=.*) with shared_buffers = $db_shared_buffers in /etc/postgresql/13/main/postgresql.conf
I, [2024-08-27T21:43:42.144382 #1]  INFO -- : Replacing (?-mix:#?work_mem *=.*) with work_mem = $db_work_mem in /etc/postgresql/13/main/postgresql.conf
I, [2024-08-27T21:43:42.144912 #1]  INFO -- : Replacing (?-mix:#?default_text_search_config *=.*) with default_text_search_config = '$db_default_text_search_config' in /etc/postgresql/13/main/postgresql.conf
I, [2024-08-27T21:43:42.145541 #1]  INFO -- : Replacing (?-mix:#?checkpoint_segments *=.*) with checkpoint_segments = $db_checkpoint_segments in /etc/postgresql/13/main/postgresql.conf
I, [2024-08-27T21:43:42.146355 #1]  INFO -- : Replacing (?-mix:#?logging_collector *=.*) with logging_collector = $db_logging_collector in /etc/postgresql/13/main/postgresql.conf
I, [2024-08-27T21:43:42.146979 #1]  INFO -- : Replacing (?-mix:#?log_min_duration_statement *=.*) with log_min_duration_statement = $db_log_min_duration_statement in /etc/postgresql/13/main/postgresql.conf
I, [2024-08-27T21:43:42.147851 #1]  INFO -- : Replacing (?-mix:^#local +replication +postgres +peer$) with local replication postgres  peer in /etc/postgresql/13/main/pg_hba.conf
I, [2024-08-27T21:43:42.148557 #1]  INFO -- : Replacing (?-mix:^host.*all.*all.*127.*$) with host all all 0.0.0.0/0 md5 in /etc/postgresql/13/main/pg_hba.conf
I, [2024-08-27T21:43:42.149423 #1]  INFO -- : Replacing (?-mix:^host.*all.*all.*::1\/128.*$) with host all all ::/0 md5 in /etc/postgresql/13/main/pg_hba.conf
I, [2024-08-27T21:43:42.149931 #1]  INFO -- : > if [ -f /root/install_postgres ]; then
  /root/install_postgres && rm -f /root/install_postgres
elif [ -e /shared/postgres_run/.s.PGSQL.5432 ]; then
  socat /dev/null UNIX-CONNECT:/shared/postgres_run/.s.PGSQL.5432 || exit 0 && echo postgres already running stop container ; exit 1
fi

2024/08/27 21:43:42 socat[28] E connect(, AF=1 "/shared/postgres_run/.s.PGSQL.5432", 36): Connection refused
I, [2024-08-27T21:43:42.217004 #1]  INFO -- : Generating locales (this might take a while)...
Generation complete.

I, [2024-08-27T21:43:42.217327 #1]  INFO -- : > HOME=/var/lib/postgresql USER=postgres exec chpst -u postgres:postgres:ssl-cert -U postgres:postgres:ssl-cert /usr/lib/postgresql/13/bin/postmaster -D /etc/postgresql/13/main
I, [2024-08-27T21:43:42.220344 #1]  INFO -- : Terminating async processes
2024-08-27 21:43:42.300 UTC [30] LOG:  starting PostgreSQL 13.16 (Debian 13.16-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
2024-08-27 21:43:42.300 UTC [30] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2024-08-27 21:43:42.300 UTC [30] LOG:  listening on IPv6 address "::", port 5432
2024-08-27 21:43:42.303 UTC [30] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2024-08-27 21:43:42.310 UTC [31] LOG:  database system was interrupted; last known up at 2024-08-27 21:41:14 UTC
2024-08-27 21:43:42.503 UTC [31] LOG:  database system was not properly shut down; automatic recovery in progress
2024-08-27 21:43:42.507 UTC [31] LOG:  redo starts at 38C/55C02EA0
2024-08-27 21:43:42.507 UTC [31] LOG:  invalid record length at 38C/55C02ED8: wanted 24, got 0
2024-08-27 21:43:42.507 UTC [31] LOG:  redo done at 38C/55C02EA0
2024-08-27 21:43:42.540 UTC [30] LOG:  database system is ready to accept connections

Simplesmente fica parado lá indefinidamente… nunca atribui portas ao contêiner ou inicia o aplicativo rails ou qualquer coisa, pelo que posso dizer.

Se você executar

./launcher start app

ele voltará a funcionar?

Não… há um contêiner zumbi que ./launcher rebuild app cria, o que gera a saída acima. É assim que o contêiner se parece. Ele começa a construir a partir da imagem base do discourse, mas depois trava, como mencionado acima. Ele não se registra como o aplicativo discourse.

/var/discourse# docker ps -a
CONTAINER ID        IMAGE                              COMMAND                  CREATED             STATUS              PORTS               NAMES
02ae320b72a0        discourse/base:2.0.20240825-0027   “/bin/bash -c ‘/usr/…”   7 minutes ago       Up 7 minutes                            sleepy_driscoll

Quando executo ./launcher start app, ele dá erro porque tenta iniciar um novo aplicativo e o PSQL está em execução em 5432 no contêiner zumbi. Se eu excluir o contêiner zumbi (e/ou as imagens), ele cria um novo contêiner e trava com os logs da mesma forma da minha postagem anterior.

Muito estressante e lamentável. Não sei como chegamos a este ponto. Desabilitei todos os plugins no meu app.yaml e tentei reconstruir.

Acho que esses logs são os mais relevantes para a situação do meu fórum

2024-08-27 21:43:42.300 UTC [30] LOG:  starting PostgreSQL 13.16 (Debian 13.16-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
2024-08-27 21:43:42.300 UTC [30] LOG:  listening on IPv4 address \"0.0.0.0\", port 5432
2024-08-27 21:43:42.300 UTC [30] LOG:  listening on IPv6 address \"::\", port 5432
2024-08-27 21:43:42.303 UTC [30] LOG:  listening on Unix socket \"/var/run/postgresql/.s.PGSQL.5432\"
2024-08-27 21:43:42.310 UTC [31] LOG:  database system was interrupted; last known up at 2024-08-27 21:41:14 UTC
2024-08-27 21:43:42.503 UTC [31] LOG:  database system was not properly shut down; automatic recovery in progress
2024-08-27 21:43:42.507 UTC [31] LOG:  redo starts at 38C/55C02EA0
2024-08-27 21:43:42.507 UTC [31] LOG:  invalid record length at 38C/55C02ED8: wanted 24, got 0
2024-08-27 21:43:42.507 UTC [31] LOG:  redo done at 38C/55C02EA0
2024-08-27 21:43:42.540 UTC [30] LOG:  database system is ready to accept connections

fica travado aqui para sempre… nunca compila assets, nunca inicia o app rails, nunca inicia o redis, etc.

1 curtida

oh ok… então isso é uma coisa para pelo menos um punhado de pessoas :frowning:

Mais Informações:

/var/discourse# ./launcher start app
x86_64 arch detectado.
AVISO: o arquivo containers/app.yml é legível por qualquer pessoa. Você pode proteger este arquivo executando: chmod o-rwx containers/app.yml

+ /usr/bin/docker run --shm-size=512m -d --restart=always -e LANG=en_US.UTF-8 -e RAILS_ENV=production -e UNICORN_WORKERS=8 -e UNICORN_SIDEKIQS=1 -e RUBY_GC_HEAP_GROWTH_MAX_SLOTS=40000 -e RUBY_GC_HEAP_INIT_SLOTS=400000 -e RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=1.5 -e DISCOURSE_DB_SOCKET=/var/run/postgresql -e DISCOURSE_DB_HOST= -e DISCOURSE_DB_PORT= -e LETSENCRYPT_DIR=/shared/letsencrypt -e DISCOURSE_FORCE_HTTPS=true -e DISCOURSE_HOSTNAME=redacted.com -e DISCOURSE_DEVELOPER_EMAILS=redacted -e DISCOURSE_SMTP_ADDRESS=smtp.redacted.com -e DISCOURSE_SMTP_PORT=587 -e DISCOURSE_SMTP_USER_NAME=postmaster@redacted -e DISCOURSE_SMTP_PASSWORD=redacted -e DISCOURSE_SMTP_ENABLE_START_TLS=true -e LETSENCRYPT_ACCOUNT_EMAIL=redacted -h discourse-beta-ubuntu-app -e DOCKER_HOST_IP=172.17.0.1 --name app -t -p 80:80 -p 443:443 -v /var/discourse/shared/standalone:/shared -v /var/discourse/shared/standalone/log/var-log:/var/log --mac-address 02:52:ee:ee:62:b2 local_discourse/app /sbin/boot
Unable to find image 'local_discourse/app:latest' locally
/usr/bin/docker: Error response from daemon: pull access denied for local_discourse/app, repository does not exist or may require 'docker login'.
See '/usr/bin/docker run --help'.

Quando tento iniciar o aplicativo :point_up_2: … ele reclama que a imagem local_discourse/app não está lá. O que está correto:

/var/discourse# docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
discourse/base      2.0.20240825-0027   9dc96b6115cb        2 days ago          3.38GB

mas tentar baixar e construir a imagem não está funcionando, devido ao banco de dados travado

2 curtidas

Veja

para a solução.