Error updating to the latest Discourse version -- resolved

I just hit an error on my Discourse install while trying to update to version 3.5.0.beta3-dev:

Discourse fails to start. Any suggestions are much appreciated.

Status: Image is up to date for discourse/base:2.0.20250226-0128

...

I, [2025-04-18T15:16:34.677745 #1]  INFO -- : cd /var/www/discourse && su discourse -c 'LOAD_PLUGINS=0 bundle exec rake plugin:pull_compatible_all'
I, [2025-04-18T15:16:40.775977 #1]  INFO -- : discourse-bbcode-color is already at latest compatible version
discourse-cakeday is already at latest compatible version
discourse-chat-integration is already at latest compatible version
discourse-data-explorer is already at latest compatible version
discourse-math is already at latest compatible version
discourse-oauth2-basic is already at latest compatible version
discourse-rss-polling is already at latest compatible version
discourse-solved is already at latest compatible version
discourse-whos-online is already at latest compatible version
discourse-yearly-review is already at latest compatible version
docker_manager is already at latest compatible version

Error:

I, [2025-04-18T15:16:50.782650 #1]  INFO -- : cd /var/www/discourse && su discourse -c 'bundle exec rake db:migrate'
2025-04-18 15:16:59.705 UTC [911] discourse@discourse ERROR:  null value in column "answer_post_id" of relation "discourse_solved_solved_topics" violates not-null constraint
2025-04-18 15:16:59.705 UTC [911] discourse@discourse DETAIL:  Failing row contains (768, 16673, null, -1, null, 2021-05-10 18:32:53.008822, 2021-05-10 18:32:53.008822).
2025-04-18 15:16:59.705 UTC [911] discourse@discourse STATEMENT:  INSERT INTO discourse_solved_solved_topics (
          topic_id,
          answer_post_id,
          topic_timer_id,
          accepter_user_id,
          created_at,
          updated_at
        )
        SELECT
          tc.topic_id,
          tc.answer_post_id,
          tc.topic_timer_id,
          tc.accepter_user_id,
          tc.created_at,
          tc.updated_at
        FROM (
          SELECT
            tc.topic_id,
            CAST(tc.value AS INTEGER) AS answer_post_id,
            CAST(tc2.value AS INTEGER) AS topic_timer_id,
            COALESCE(ua.acting_user_id, -1) AS accepter_user_id,
            tc.created_at,
            tc.updated_at,
            ROW_NUMBER() OVER (PARTITION BY tc.topic_id ORDER BY tc.created_at ASC) AS rn_topic,
            ROW_NUMBER() OVER (PARTITION BY CAST(tc.value AS INTEGER) ORDER BY tc.created_at ASC) AS rn_answer
          FROM topic_custom_fields tc
          LEFT JOIN topic_custom_fields tc2 ON tc2.topic_id = tc.topic_id AND tc2.name = 'solved_auto_close_topic_timer_id'
          LEFT JOIN user_actions ua ON ua.target_topic_id = tc.topic_id AND ua.action_type = 15
          WHERE tc.name = 'accepted_answer_post_id'
            AND tc.id > 0
            AND tc.id <= 0 + 10000
        ) tc
        WHERE tc.rn_topic = 1 AND tc.rn_answer = 1
        ON CONFLICT DO NOTHING

rake aborted!
StandardError: An error has occurred, all later migrations canceled: (StandardError)

I just ran discourse-doctor, which fails with the same error.

Additional detail; the failing row:

DETAIL:  Failing row contains (1152, 16673, null, -1, null, 2021-05-10 18:32:53.008822, 2021-05-10 18:32:53.008822)
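
Reading the log, the mechanism appears to be this: the migration copies CAST(tc.value AS INTEGER) into answer_post_id, so a topic_custom_fields row named 'accepted_answer_post_id' whose value is NULL produces a NULL in a NOT NULL column. Below is a minimal sketch of that mechanism in Python, using an in-memory SQLite database as a stand-in for Postgres. The two tables are trimmed-down assumptions rather than the real Discourse schema, topic 100 and post 555 are made-up sample data, and the function name is hypothetical.

```python
import sqlite3


def reproduce_not_null_failure():
    """Return the error text raised when a NULL custom-field value is migrated."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Simplified stand-ins for the real tables (assumed, not the actual schema).
        CREATE TABLE topic_custom_fields (
            id INTEGER PRIMARY KEY,
            topic_id INTEGER,
            name TEXT,
            value TEXT
        );
        CREATE TABLE discourse_solved_solved_topics (
            id INTEGER PRIMARY KEY,
            topic_id INTEGER NOT NULL,
            answer_post_id INTEGER NOT NULL
        );
        -- One healthy row (made up) and one row shaped like the failing one.
        INSERT INTO topic_custom_fields (topic_id, name, value) VALUES
            (100, 'accepted_answer_post_id', '555'),
            (16673, 'accepted_answer_post_id', NULL);
    """)
    try:
        # Same idea as the migration: cast the stored text value to an integer.
        # CAST(NULL AS INTEGER) stays NULL, which the NOT NULL column rejects.
        conn.execute("""
            INSERT INTO discourse_solved_solved_topics (topic_id, answer_post_id)
            SELECT topic_id, CAST(value AS INTEGER)
            FROM topic_custom_fields
            WHERE name = 'accepted_answer_post_id'
        """)
    except sqlite3.IntegrityError as exc:
        return str(exc)
    finally:
        conn.close()
    return None


if __name__ == "__main__":
    print(reproduce_not_null_failure())
```

SQLite words the error differently from Postgres, but the constraint being violated is the same kind.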

If I could access the database without the container running, I could delete the row…

Fixed.

It took a bit of database manipulation. I used the postgres Docker image to mount the database directory:

docker run --name postgres_container_2 -e POSTGRES_PASSWORD=mysecretpassword -d -p 5432:5432 -v /var/discourse/shared/standalone/postgres_data:/var/lib/postgresql/data postgres:15

Then I located the problem table row:

SELECT
          tc.topic_id,
          tc.answer_post_id,
          tc.topic_timer_id,
          tc.accepter_user_id,
          tc.created_at,
          tc.updated_at
        FROM (
          SELECT
            tc.topic_id,
            CAST(tc.value AS INTEGER) AS answer_post_id,
            CAST(tc2.value AS INTEGER) AS topic_timer_id,
            COALESCE(ua.acting_user_id, -1) AS accepter_user_id,
            tc.created_at,
            tc.updated_at,
            ROW_NUMBER() OVER (PARTITION BY tc.topic_id ORDER BY tc.created_at ASC) AS rn_topic,
            ROW_NUMBER() OVER (PARTITION BY CAST(tc.value AS INTEGER) ORDER BY tc.created_at ASC) AS rn_answer
          FROM topic_custom_fields tc
          LEFT JOIN topic_custom_fields tc2 ON tc2.topic_id = tc.topic_id AND tc2.name = 'solved_auto_close_topic_timer_id'
          LEFT JOIN user_actions ua ON ua.target_topic_id = tc.topic_id AND ua.action_type = 15
          WHERE tc.name = 'accepted_answer_post_id'
            AND tc.id > 0
            AND tc.id <= 0 + 10000
            AND tc.value IS NULL
        ) tc;

After making sure this selected only the row from the error, I changed the SELECT to a DELETE, crossed my fingers, and ran ./launcher rebuild app.
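
The check-then-delete step can be sketched as follows. This is an illustrative pattern, not the exact statements used above: it runs against an in-memory SQLite stand-in with an assumed, trimmed-down topic_custom_fields table, and the names build_demo_db and delete_null_accepted_answers are hypothetical. The idea is to count the matching rows first and only commit the DELETE if it touched exactly the expected number of rows.

```python
import sqlite3

# Filter for the broken row: an accepted-answer field with no value.
BAD_ROW_FILTER = "name = 'accepted_answer_post_id' AND value IS NULL"


def build_demo_db():
    """Create an in-memory table with made-up sample rows (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE topic_custom_fields (
            id INTEGER PRIMARY KEY,
            topic_id INTEGER,
            name TEXT,
            value TEXT
        );
        INSERT INTO topic_custom_fields (topic_id, name, value) VALUES
            (100, 'accepted_answer_post_id', '555'),
            (16673, 'accepted_answer_post_id', NULL),
            (16673, 'some_other_field', NULL);
    """)
    return conn


def delete_null_accepted_answers(conn, expected_rows=1):
    """Delete the bad rows only if exactly expected_rows of them match."""
    cur = conn.cursor()
    # Step 1: preview how many rows the filter hits before changing anything.
    matched = cur.execute(
        f"SELECT COUNT(*) FROM topic_custom_fields WHERE {BAD_ROW_FILTER}"
    ).fetchone()[0]
    if matched != expected_rows:
        return 0  # the filter does not match what we expected, so do nothing
    # Step 2: delete with the identical filter; sqlite3 opens a transaction here.
    cur.execute(f"DELETE FROM topic_custom_fields WHERE {BAD_ROW_FILTER}")
    deleted = cur.rowcount
    if deleted != expected_rows:
        conn.rollback()  # undo if the delete touched an unexpected row count
        return 0
    conn.commit()
    return deleted


if __name__ == "__main__":
    db = build_demo_db()
    print(delete_null_accepted_answers(db))
```

The WHERE clause uses only standard SQL, so the same filter should carry over to Postgres unchanged.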

After a long wait for the image to build, I now have a happy community and a rebuilt site, security fixes included.

I don't know what made the row above invalid, but once it was removed, everything was fine.


These are dangerous instructions for accessing the database when Discourse is not running.

Use at your own risk.
