PostgreSQL 15 更新

tail /var/discourse/shared/standalone/log/var-log/postgres/current

的输出是什么?

root@ubuntu-s-1vcpu-1gb-nyc3-01:/var/discourse# tail /var/discourse/shared/standalone/log/var-log/postgres/current

	                INNER JOIN (SELECT topic_id, GREATEST(COUNT(*), 1) AS count
	             FROM posts
	             WHERE created_at >= '2025-02-01 04:44:07.521324'
	               AND deleted_at IS NULL
	               AND NOT hidden
	               AND post_type = 1
	               AND user_id <> -1
	             GROUP BY topic_id) c ON tt.topic_id = c.topic_id
	                WHERE tt.topic_id = top_topics.topic_id
	                  AND tt.daily_posts_count <> c.count
root@ubuntu-s-1vcpu-1gb-nyc3-01:/var/discourse# vim shared/standalone/log/var-log/postgres/current
root@ubuntu-s-1vcpu-1gb-nyc3-01:/var/discourse#
root@ubuntu-s-1vcpu-1gb-nyc3-01:/var/discourse#
root@ubuntu-s-1vcpu-1gb-nyc3-01:/var/discourse# cat shared/standalone/log/var-log/postgres/current
2025-02-02 04:42:42.765 UTC [542] LOG:  starting PostgreSQL 13.18 (Debian 13.18-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
2025-02-02 04:42:42.768 UTC [542] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2025-02-02 04:42:42.768 UTC [542] LOG:  listening on IPv6 address "::", port 5432
2025-02-02 04:42:42.779 UTC [542] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2025-02-02 04:42:42.804 UTC [561] LOG:  database system was interrupted; last known up at 2025-02-02 04:37:10 UTC
2025-02-02 04:42:43.209 UTC [561] LOG:  database system was not properly shut down; automatic recovery in progress
2025-02-02 04:42:43.221 UTC [561] LOG:  redo starts at 2/DB0B59D0
2025-02-02 04:42:43.264 UTC [561] LOG:  invalid record length at 2/DB22D540: wanted 24, got 0
2025-02-02 04:42:43.264 UTC [561] LOG:  redo done at 2/DB22D518
2025-02-02 04:42:43.347 UTC [542] LOG:  database system is ready to accept connections
2025-02-02 04:43:49.036 UTC [1273] discourse@discourse LOG:  duration: 584.511 ms  statement: UPDATE topic_hot_scores thsOrig
	SET
	    recent_likes = COALESCE(new_values.likes_count, 0),
	    recent_posters = COALESCE(new_values.unique_participants, 0),
	    recent_first_bumped_at = COALESCE(new_values.first_bumped_at, ths.recent_first_bumped_at)
	FROM
	  topic_hot_scores ths
	  LEFT OUTER JOIN
	  (
	    SELECT
	        t.id AS topic_id,
	        COUNT(DISTINCT p.user_id) AS unique_participants,
	        (
	          SELECT COUNT(distinct pa.user_id)
	          FROM post_actions pa
	          JOIN posts p2 ON p2.id = pa.post_id
	          WHERE p2.topic_id = t.id
	            AND p2.post_type = 1
	            AND p2.deleted_at IS NULL
	            AND p2.user_deleted = false
	            AND pa.post_action_type_id = 2 -- action_type for 'like'
	            AND pa.created_at >= '2025-01-26 04:43:48.403603'
	            AND pa.deleted_at IS NULL
	        ) AS likes_count,
	        MIN(p.created_at) AS first_bumped_at
	    FROM
	        topics t
	    JOIN
	        posts p ON t.id = p.topic_id
	    WHERE
	        p.created_at >= '2025-01-26 04:43:48.403603'
	        AND t.archetype <> 'private_message'
	        AND t.deleted_at IS NULL
	        AND p.deleted_at IS NULL
	        AND p.user_deleted = false
	        AND t.created_at <= '2025-02-02 04:43:48.403603'
	        AND t.bumped_at >= '2025-01-26 04:43:48.403603'
	        AND p.created_at < '2025-02-02 04:43:48.403603'
	        AND p.created_at >= '2025-01-26 04:43:48.403603'
	        AND p.post_type = 1
	    GROUP BY
	        t.id
	  ) AS new_values
	ON ths.topic_id = new_values.topic_id

	WHERE thsOrig.topic_id = ths.topic_id

2025-02-02 04:44:04.629 UTC [1273] discourse@discourse LOG:  duration: 153.751 ms  statement: WITH x AS (SELECT
	                    u.id user_id,
	                    SUM(CASE WHEN p.id IS NOT NULL AND t.id IS NOT NULL AND ua.action_type = 2 THEN 1 ELSE 0 END) likes_received,
	                    SUM(CASE WHEN p.id IS NOT NULL AND t.id IS NOT NULL AND ua.action_type = 1 THEN 1 ELSE 0 END) likes_given,
	                    COALESCE((SELECT COUNT(topic_id) FROM topic_views AS v WHERE v.user_id = u.id AND v.viewed_at > '2025-02-01 04:44:04.456893'), 0) topics_entered,
	                    COALESCE((SELECT COUNT(id) FROM user_visits AS uv WHERE uv.user_id = u.id AND uv.visited_at > '2025-02-01 04:44:04.456893'), 0) days_visited,
	                    COALESCE((SELECT SUM(posts_read) FROM user_visits AS uv2 WHERE uv2.user_id = u.id AND uv2.visited_at > '2025-02-01 04:44:04.456893'), 0) posts_read,
	                    SUM(CASE WHEN t2.id IS NOT NULL AND ua.action_type = 4 THEN 1 ELSE 0 END) topic_count,
	                    SUM(CASE WHEN p.id IS NOT NULL AND t.id IS NOT NULL AND ua.action_type = 5 THEN 1 ELSE 0 END) post_count
	                  FROM users AS u
	                  LEFT OUTER JOIN user_actions AS ua ON ua.user_id = u.id AND COALESCE(ua.created_at, '2025-02-01 04:44:04.456893') > '2025-02-01 04:44:04.456893'
	                  LEFT OUTER JOIN posts AS p ON ua.target_post_id = p.id AND p.deleted_at IS NULL AND p.post_type = 1 AND NOT p.hidden
	                  LEFT OUTER JOIN topics AS t ON p.topic_id = t.id AND t.archetype = 'regular' AND t.deleted_at IS NULL AND t.visible
	                  LEFT OUTER JOIN topics AS t2 ON t2.id = ua.target_topic_id AND t2.archetype = 'regular' AND t2.deleted_at IS NULL AND t2.visible
	                  LEFT OUTER JOIN categories AS c ON t.category_id = c.id
	                  WHERE u.active
	                    AND u.silenced_till IS NULL
	                    AND u.id > 0
	                  GROUP BY u.id)
	      UPDATE directory_items di SET
	               likes_received = x.likes_received,
	               likes_given = x.likes_given,
	               topics_entered = x.topics_entered,
	               days_visited = x.days_visited,
	               posts_read = x.posts_read,
	               topic_count = x.topic_count,
	               post_count = x.post_count
	      FROM x
	      WHERE
	        x.user_id = di.user_id AND
	        di.period_type = 5 AND (
	        di.likes_received <> x.likes_received OR
	        di.likes_given <> x.likes_given OR
	        di.topics_entered <> x.topics_entered OR
	        di.days_visited <> x.days_visited OR
	        di.posts_read <> x.posts_read OR
	        di.topic_count <> x.topic_count OR
	        di.post_count <> x.post_count )


2025-02-02 04:44:07.657 UTC [1400] discourse@discourse LOG:  duration: 135.675 ms  statement: UPDATE top_topics
	                SET daily_posts_count = c.count
	                FROM top_topics tt
	                INNER JOIN (SELECT topic_id, GREATEST(COUNT(*), 1) AS count
	             FROM posts
	             WHERE created_at >= '2025-02-01 04:44:07.521324'
	               AND deleted_at IS NULL
	               AND NOT hidden
	               AND post_type = 1
	               AND user_id <> -1
	             GROUP BY topic_id) c ON tt.topic_id = c.topic_id
	                WHERE tt.topic_id = top_topics.topic_id
	                  AND tt.daily_posts_count <> c.count

如果在停止 app 容器后看到该输出,则可能意味着仍有某些东西在连接到数据库。请检查 pg_stat_activity 以查看是否可以隔离它。您可以尝试停止 app 容器中的这些服务以帮助缩小范围。

./launcher start app
./launcher enter app
sv stop nginx
sv stop unicorn
1 个赞

我刚检查了一下,只看到

2025-02-02 12:06:05.808 UTC [48] LOG:  received smart shutdown request

所以我想数据库没有正确关闭,而是在超时后被强制关闭了?

看起来是这样。您能否尝试按照之前的帖子中的步骤操作,看看是否有其他客户端连接到数据库?理想情况下,您应该在停止app容器之前停止所有连接到数据库的其他应用程序。

或者,您可以尝试终止所有已建立的数据库会话并快速停止postgres服务(最好在客户端应用程序重新连接之前),然后在确认数据库已干净关闭后,再次尝试重建。但是,我强烈建议您首先在终止连接之前,确定pg_stat_activity中列出的客户端应用程序受到的影响。

这是在首先停止nginxunicorn之后,从app容器中运行以终止客户端连接并停止postgres的示例命令。

sudo -u postgres psql -c "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE pid <> pg_backend_pid();" && sv stop postgres
3 个赞

只是一个快速更新:我们已经应用了一个补丁,如果数据库没有干净地关闭,它应该会返回一个更合适的错误消息

5 个赞

首先,非常感谢!在容器内手动停止服务似乎奏效了,我得以完成了升级所需的双容器重建(如上所述调整了 locale 变量——题外话:我查看了安装文档,似乎没有提到 locale;这或许可以添加进去)。

不幸的是,对问题进程的分析尚无定论。我只看到与 Postgres 相关的东西,比如 WalWriter、AutoVacuum 等。我唯一线索是,当我重启系统时,以及在更新期间进行索引更改后,我通常会看到 Postgres 的 CPU 负载在半小时左右很高。
今天重启后,当我检查 pg_stat_activity 时,我看到了两个长时间运行的查询(至少如果我对列的理解是正确的话):

SELECT "posts"."id" FROM "posts" INNER JOIN "topics" ON "topics"."deleted_at" IS NULL AND "topics"."id" = "posts"."topic_id" LEFT JOIN post_search_data ON post_id = posts.id WHERE "posts"."deleted_at" IS NULL AND (posts.raw != '') AND (topics.deleted_at IS NULL) AND (post_search_data.locale IS NULL OR post_search_data.locale != 'de' OR post_search_data.version != 5) ORDER BY posts.id DESC LIMIT 20000

SELECT "optimized_images".* FROM "optimized_images" WHERE "optimized_images"."upload_id" = 13 AND "optimized_images"."height" = 32 AND "optimized_images"."width" = 32 LIMIT 1

在上述 30 分钟后,第一个查询显然已完成,Postgres 的 CPU 负载现在正常。
我不确定为什么这两个查询会花费这么长时间,更不确定为什么第二个查询在超过一个小时后仍然显示在 pg_stat_activity 中,但可能正是这种长时间运行的查询阻止了 Postgres 服务在之前尝试重建时正确关闭。

如果您有任何线索,将不胜感激。但这可能也需要另开一个帖子讨论(如果是这样,请告诉我,我会编辑此帖)。

相关截图:

2 个赞

很高兴听到这奏效了!我现在已将这些步骤包含在 OP 中。

关于启动时 postgres CPU 负载过高的问题,这听起来像是你在 PG 版本更新之前就已经遇到了同样的问题。如果是这样,那与更新无关。(无论如何,建议在 PG 版本更新后运行一次 vacuum。请参阅 更新后任务。)

数据库/查询性能不佳可能由多种问题引起。可能是数据库中的索引损坏、主机资源配置不足、Discourse 中引入了有问题的查询的 bug 等。最好为此类问题开设一个 Support 主题。

4 个赞

6 个帖子已拆分为新主题:重建后网站离线(2025 年 2 月 4 日)

13 条帖子被拆分到一个新话题中:PostgreSQL 更新在中国失败

对我来说,似乎什么都不起作用。
我开了一个帖子 Errno::ENOENT: No such file or directory @ rb_sysopen - /etc/postgresql/15/main/postgresql.conf
有人能建议如何修复它吗?

2 个赞

今天在一些旧服务器上的几个站点上,我需要执行以下命令:

chown -R 101:104 /var/discourse/shared/standalone/postgres_*

我非常确定这涉及到多年的使用和旧的 Ubuntu 版本(这些版本有不同的用户 ID)。

2 个赞

很有意思。我更新了一个大约一年没更新过的网站,过程非常顺利。当时的版本是 v3.2.0.beta4+253。

2 个赞

我希望更新我的Discourse实例,但我想将数据库保持在Postgres 13。这可能吗?

1 个赞

是的,请参阅官方说明的这部分:

3 个赞

这里有关于双容器实例的说明吗?粗略一看我没看到,搜索也没找到。

1 个赞

这已在上面的"数据容器设置"中介绍。

1 个赞

此命令不会更改任何内容

1 个赞

如何查看我正在使用的 PostgreSQL 版本?

编辑:算了,我搞定了。

我有一些服务器使用 en_GB.UTF-8 区域设置,并且按照帖子顶部的说明操作,但进展不大:

./launcher stop app
x86_64 arch detected.
+ /usr/bin/docker stop -t 600 app
app
discourse@sands:/var/discourse$ docker run --rm --entrypoint=/bin/bash -e LANG='en_GB.UTF-8' -v /var/discourse/shared/standalone/postgres_data:/var/lib/postgresql/13/data -v /var/discourse/shared/standalone/postgres_data_new:/var/lib/postgresql/15/data tianon/postgres-upgrade:13-to-15 -c 'sed -i "s/^# $LANG/$LANG/" /etc/locale.gen && locale-gen && apt-get update && apt-get install -y postgresql-13-pgvector postgresql-15-pgvector && docker-upgrade'
Generating locales (this might take a while)...
  en_GB.UTF-8... done
  en_US.UTF-8... done
Generation complete.
Get:1 http://deb.debian.org/debian bookworm InRelease [151 kB]
Get:2 http://deb.debian.org/debian bookworm-updates InRelease [55.4 kB]
Get:3 http://deb.debian.org/debian-security bookworm-security InRelease [48.0 kB]
Get:4 http://apt.postgresql.org/pub/repos/apt bookworm-pgdg InRelease [129 kB]
Get:5 http://deb.debian.org/debian bookworm/main amd64 Packages [8,792 kB]
Get:6 http://deb.debian.org/debian bookworm-updates/main amd64 Packages [13.5 kB]
Get:7 http://deb.debian.org/debian-security bookworm-security/main amd64 Packages [243 kB]
Get:8 http://apt.postgresql.org/pub/repos/apt bookworm-pgdg/main amd64 Packages [360 kB]
Get:9 http://apt.postgresql.org/pub/repos/apt bookworm-pgdg/13 amd64 Packages [2,571 B]
Get:10 http://apt.postgresql.org/pub/repos/apt bookworm-pgdg/15 amd64 Packages [2,584 B]
Fetched 9,798 kB in 2s (4,547 kB/s)
Reading package lists...
Reading package lists...
Building dependency tree...
Reading state information...
The following NEW packages will be installed:
  postgresql-13-pgvector postgresql-15-pgvector
0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded.
Need to get 597 kB of archives.
After this operation, 1,540 kB of additional disk space will be used.
Get:1 http://apt.postgresql.org/pub/repos/apt bookworm-pgdg/main amd64 postgresql-13-pgvector amd64 0.8.0-1.pgdg120+1 [297 kB]
Get:2 http://apt.postgresql.org/pub/repos/apt bookworm-pgdg/main amd64 postgresql-15-pgvector amd64 0.8.0-1.pgdg120+1 [300 kB]
debconf: delaying package configuration, since apt-utils is not installed
Fetched 597 kB in 0s (1,274 kB/s)
Selecting previously unselected package postgresql-13-pgvector.
(Reading database ... 13891 files and directories currently installed.)
Preparing to unpack .../postgresql-13-pgvector_0.8.0-1.pgdg120+1_amd64.deb ...
Unpacking postgresql-13-pgvector (0.8.0-1.pgdg120+1) ...
Selecting previously unselected package postgresql-15-pgvector.
Preparing to unpack .../postgresql-15-pgvector_0.8.0-1.pgdg120+1_amd64.deb ...
Unpacking postgresql-15-pgvector (0.8.0-1.pgdg120+1) ...
Setting up postgresql-13-pgvector (0.8.0-1.pgdg120+1) ...
Setting up postgresql-15-pgvector (0.8.0-1.pgdg120+1) ...
Processing triggers for postgresql-common (267.pgdg120+1) ...
debconf: unable to initialize frontend: Dialog
debconf: (TERM is not set, so the dialog frontend is not usable.)
debconf: falling back to frontend: Readline
debconf: unable to initialize frontend: Readline
debconf: (This frontend requires a controlling tty.)
debconf: falling back to frontend: Teletype
Building PostgreSQL dictionaries from installed myspell/hunspell packages...
Removing obsolete dictionary files:
Performing Consistency Checks
-----------------------------
Checking cluster versions                                   ok

*failure*
Consult the last few lines of "/var/lib/postgresql/15/data/pg_upgrade_output.d/20250207T165542.724/log/pg_upgrade_server.log" for
the probable cause of the failure.

connection to server on socket "/var/lib/postgresql/.s.PGSQL.50432" failed: No such file or directory

	Is the server running locally and accepting connections on that socket?

could not connect to source postmaster started with the command:
"/usr/lib/postgresql/13/bin/pg_ctl" -w -l "/var/lib/postgresql/15/data/pg_upgrade_output.d/20250207T165542.724/log/pg_upgrade_server.log" -D "/var/lib/postgresql/13/data" -o "-p 50432 -b  -c listen_addresses='' -c unix_socket_permissions=0700 -c unix_socket_directories='/var/lib/postgresql'" start
Failure, exiting

其他人遇到过这个问题吗?有人有什么建议吗?

3 个赞