長時間実行されるSidekiqジョブ

Ghan · 2020 年 12 月 24 日午前 4:19

それでも、実際に更新されるランクの数に関係なく、すべてのトピックのすべての投稿テーブルに対してランクを計算するサブクエリを実行する必要があります。

このサブクエリの実行には約40秒かかり、さらに WHERE 句に到達する前に結合のために実行されるネストされたループの中にあります。これがクエリのコストを急上昇させている要因のようです。

SELECT id, percent_rank()
    OVER (PARTITION BY topic_id ORDER BY SCORE DESC) as percent_rank
FROM posts

Ghan · 2020 年 12 月 24 日午後 5:16

コードを確認して、今では理解できました。このクエリは、順位を更新する必要がある投稿がすべて処理されるまで繰り返し実行されます。そのため、SQL プロセスを監視していた間、このクエリが数時間にわたって、1 回の実行が終了し、次の実行が始まるという形で何度も実行されている様子が見られました。

ただし、LIMIT がクエリの複雑さをそれほど軽減しているようには思えません。実際、LIMIT を 200000 に増やすと、クエリプランが劇的に改善することがわかりました。

20000 の場合、コストが約 2 億に近い巨大なネストループが見られます：

-> Nested Loop (cost=6033694.44..190270554.32 rows=26680560 width=16)

LIMIT を単に 200000 に増やすと、以下のように非常にクリーンなクエリプランが生成されます：

                                                            QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Update on posts  (cost=7334394.89..8873863.82 rows=200000 width=833)
   ->  Nested Loop  (cost=7334394.89..8873863.82 rows=200000 width=833)
         ->  Subquery Scan on x  (cost=7334394.45..7370147.82 rows=200000 width=48)
               ->  Limit  (cost=7334394.45..7368147.82 rows=200000 width=12)
                     ->  Hash Join  (cost=7334394.45..11837188.36 rows=26680560 width=12)
                           Hash Cond: (posts_1.id = y.id)
                           Join Filter: ((posts_1.percent_rank IS NULL) OR (y.percent_rank <> posts_1.percent_rank))
                           ->  Hash Join  (cost=35403.20..2845221.73 rows=26703700 width=12)
                                 Hash Cond: (posts_1.topic_id = topics.id)
                                 ->  Seq Scan on posts posts_1  (cost=0.00..2739721.00 rows=26703700 width=16)
                                 ->  Hash  (cost=23918.11..23918.11 rows=918807 width=4)
                                       ->  Index Only Scan using topics_pkey on topics  (cost=0.42..23918.11 rows=918807 width=4)
                           ->  Hash  (cost=6834805.00..6834805.00 rows=26703700 width=12)
                                 ->  Subquery Scan on y  (cost=6033694.00..6834805.00 rows=26703700 width=12)
                                       ->  WindowAgg  (cost=6033694.00..6567768.00 rows=26703700 width=24)
                                             ->  Sort  (cost=6033694.00..6100453.25 rows=26703700 width=16)
                                                   Sort Key: posts_2.topic_id, posts_2.score DESC
                                                   ->  Seq Scan on posts posts_2  (cost=0.00..2739721.00 rows=26703700 width=16)
         ->  Index Scan using posts_pkey on posts  (cost=0.44..7.52 rows=1 width=789)
               Index Cond: (id = x.id)
 JIT:
   Functions: 30
   Options: Inlining true, Optimization true, Expressions true, Deforming true
(23 rows)

なぜ切り替わるのか気になります。今回のような大規模なインポート後に、少なくともバックログが解消される間は、この値を増やす価値があるかもしれません。

トピック		返信	表示
Some queries that don't finish Support	16	1004	2019 年 4 月 5 日
PostgreSQL runaway IO Self-hosting	35	3351	2021 年 3 月 3 日
Very slow Sidekiq issue with large queue due to massive numbers of unread user notifications Self-hosting	68	7770	2020 年 4 月 11 日
Slow Sidekiq + Postmaster using 95%+ CPU (32 cores) after Postgresql Version Upgrade Self-hosting server-resources	22	3289	2020 年 5 月 29 日
Slow SQL query causes homepage to load in 2-4 sec Support	15	1716	2018 年 2 月 7 日

長時間実行されるSidekiqジョブ

関連トピック