长时间运行的 Sidekiq 任务

Ghan · 2020 年12 月 24 日 04:19

无论实际要更新多少个排名，它仍然必须运行子查询来计算整个 posts 表中每个主题的排名。

这个子查询大约需要 40 秒才能运行，而且它位于一个嵌套循环内部，该循环必须在进入 WHERE 子句之前为连接操作执行。这似乎就是导致查询成本激增的原因。

SELECT id, percent_rank()                                                     \n    OVER (PARTITION BY topic_id ORDER BY SCORE DESC) as percent_rank \nFROM posts

Ghan · 2020 年12 月 24 日 17:16

我查看了代码，现在明白了。这个查询会反复执行，直到清除所有需要更新排名的帖子。因此，当我观察 SQL 进程时，在数小时内看到该查询多次运行，每完成一次运行后紧接着开始下一次。

不过，LIMIT 似乎并没有显著降低查询的复杂度。事实上，我发现如果将 LIMIT 增加到 200000，查询计划反而会有显著改善。
当 LIMIT 为 20000 时，我看到的嵌套循环成本接近 2 亿：

-> Nested Loop (cost=6033694.44..190270554.32 rows=26680560 width=16)

如果我将 LIMIT 直接增加到 200000，生成的查询计划看起来就干净多了：

                                                            QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Update on posts  (cost=7334394.89..8873863.82 rows=200000 width=833)
   ->  Nested Loop  (cost=7334394.89..8873863.82 rows=200000 width=833)
         ->  Subquery Scan on x  (cost=7334394.45..7370147.82 rows=200000 width=48)
               ->  Limit  (cost=7334394.45..7368147.82 rows=200000 width=12)
                     ->  Hash Join  (cost=7334394.45..11837188.36 rows=26680560 width=12)
                           Hash Cond: (posts_1.id = y.id)
                           Join Filter: ((posts_1.percent_rank IS NULL) OR (y.percent_rank <> posts_1.percent_rank))
                           ->  Hash Join  (cost=35403.20..2845221.73 rows=26703700 width=12)
                                 Hash Cond: (posts_1.topic_id = topics.id)
                                 ->  Seq Scan on posts posts_1  (cost=0.00..2739721.00 rows=26703700 width=16)
                                 ->  Hash  (cost=23918.11..23918.11 rows=918807 width=4)
                                       ->  Index Only Scan using topics_pkey on topics  (cost=0.42..23918.11 rows=918807 width=4)
                           ->  Hash  (cost=6834805.00..6834805.00 rows=26703700 width=12)
                                 ->  Subquery Scan on y  (cost=6033694.00..6834805.00 rows=26703700 width=12)
                                       ->  WindowAgg  (cost=6033694.00..6567768.00 rows=26703700 width=24)
                                             ->  Sort  (cost=6033694.00..6100453.25 rows=26703700 width=16)
                                                   Sort Key: posts_2.topic_id, posts_2.score DESC
                                                   ->  Seq Scan on posts posts_2  (cost=0.00..2739721.00 rows=26703700 width=16)
         ->  Index Scan using posts_pkey on posts  (cost=0.44..7.52 rows=1 width=789)
               Index Cond: (id = x.id)
 JIT:
   Functions: 30
   Options: Inlining true, Optimization true, Expressions true, Deforming true
(23 rows)

我不确定是什么导致了这种切换。在我们刚完成的大规模导入之后，至少暂时提高这个限制可能是值得的，以便尽快清理积压任务。

话题		回复	浏览量
Some queries that don't finish Support	18	959	2022 年6 月 18 日
PostgreSQL runaway IO Self-hosting	37	3098	2021 年3 月 17 日
Very slow Sidekiq issue with large queue due to massive numbers of unread user notifications Self-hosting	69	7552	2020 年5 月 11 日
Slow Sidekiq + Postmaster using 95%+ CPU (32 cores) after Postgresql Version Upgrade Self-hosting server-resources	23	3081	2020 年6 月 28 日
Slow SQL query causes homepage to load in 2-4 sec Support	16	1674	2018 年7 月 1 日

长时间运行的 Sidekiq 任务

相关话题