Long-Running Sidekiq Jobs

It still has to run the subquery to calculate the ranks on every topic across the entire posts table no matter how many ranks are actually being updated.

This subquery takes about 40 seconds to run on its own, and it sits inside a nested loop that has to execute for the join before we ever reach the WHERE clause. That seems to be what is blowing up the query cost.

SELECT id, percent_rank()
    OVER (PARTITION BY topic_id ORDER BY score DESC) AS percent_rank
FROM posts
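
For reference, the 40-second figure can be checked independently by timing the subquery on its own with EXPLAIN ANALYZE (this measures just the standalone window scan, not the full update):

EXPLAIN ANALYZE
SELECT id, percent_rank()
    OVER (PARTITION BY topic_id ORDER BY score DESC) AS percent_rank
FROM posts;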

I looked at the code and now I understand. This query is run repeatedly until it has cleared out all the posts whose ranks need to be updated, so when I was watching the SQL processes, I was seeing many separate runs of this query over the course of several hours as it finished one batch and started another.
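
For context, the batched update appears to have roughly the following shape. This is only a sketch pieced together from the query plan shown further down (the x and y aliases come from the planner); the actual query in the code may differ in detail:

UPDATE posts
SET percent_rank = x.percent_rank
FROM (
    SELECT p.id, y.percent_rank
    FROM posts p
    JOIN topics ON topics.id = p.topic_id
    JOIN (
        SELECT id, percent_rank()
            OVER (PARTITION BY topic_id ORDER BY score DESC) AS percent_rank
        FROM posts
    ) y ON y.id = p.id
    WHERE p.percent_rank IS NULL OR y.percent_rank <> p.percent_rank
    LIMIT 20000
) x
WHERE posts.id = x.id;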

However, it doesn’t seem like the LIMIT really helps the complexity of the query all that much. In fact, I’ve discovered that if I increase the limit to 200000, the query plan actually improves significantly.
With the limit at 20000, I see a huge nested loop with a cost of nearly 200 million:

-> Nested Loop (cost=6033694.44..190270554.32 rows=26680560 width=16)

If I just increase the limit to 200000, then it produces this query plan, which looks much cleaner:

                                                            QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Update on posts  (cost=7334394.89..8873863.82 rows=200000 width=833)
   ->  Nested Loop  (cost=7334394.89..8873863.82 rows=200000 width=833)
         ->  Subquery Scan on x  (cost=7334394.45..7370147.82 rows=200000 width=48)
               ->  Limit  (cost=7334394.45..7368147.82 rows=200000 width=12)
                     ->  Hash Join  (cost=7334394.45..11837188.36 rows=26680560 width=12)
                           Hash Cond: (posts_1.id = y.id)
                           Join Filter: ((posts_1.percent_rank IS NULL) OR (y.percent_rank <> posts_1.percent_rank))
                           ->  Hash Join  (cost=35403.20..2845221.73 rows=26703700 width=12)
                                 Hash Cond: (posts_1.topic_id = topics.id)
                                 ->  Seq Scan on posts posts_1  (cost=0.00..2739721.00 rows=26703700 width=16)
                                 ->  Hash  (cost=23918.11..23918.11 rows=918807 width=4)
                                       ->  Index Only Scan using topics_pkey on topics  (cost=0.42..23918.11 rows=918807 width=4)
                           ->  Hash  (cost=6834805.00..6834805.00 rows=26703700 width=12)
                                 ->  Subquery Scan on y  (cost=6033694.00..6834805.00 rows=26703700 width=12)
                                       ->  WindowAgg  (cost=6033694.00..6567768.00 rows=26703700 width=24)
                                             ->  Sort  (cost=6033694.00..6100453.25 rows=26703700 width=16)
                                                   Sort Key: posts_2.topic_id, posts_2.score DESC
                                                   ->  Seq Scan on posts posts_2  (cost=0.00..2739721.00 rows=26703700 width=16)
         ->  Index Scan using posts_pkey on posts  (cost=0.44..7.52 rows=1 width=789)
               Index Cond: (id = x.id)
 JIT:
   Functions: 30
   Options: Inlining true, Optimization true, Expressions true, Deforming true
(23 rows)

I wonder what causes the planner to switch over. It may be worth increasing this limit, at least after a big import like the one we've just done, while it clears out the backlog.
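
If we do bump the limit temporarily, one rough way to see how much of the backlog remains is to count the rows that would still fail the rank check. This is just a sketch based on the same condition the update appears to use, and it re-runs the expensive window aggregate itself, so it's only worth running occasionally:

SELECT count(*)
FROM posts p
JOIN (
    SELECT id, percent_rank()
        OVER (PARTITION BY topic_id ORDER BY score DESC) AS percent_rank
    FROM posts
) y ON y.id = p.id
WHERE p.percent_rank IS NULL OR y.percent_rank <> p.percent_rank;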
