Some queries that don't finish

bartv · April 5, 2019, 5:23pm

The problem is slowly getting worse - the original queries are still running and almost exactly 2 hours later, 3 more started. They’re now occupying 2/12 of my cores. If this continues overnight it will turn into an actual problem.

Update: this query was introduced by this recent commit. Perhaps it’s missing an index on some tables? I have 4.1M records in the posts and post_search_data tables and 1.5M in the topics table, so the double LEFT JOIN could be a problem if something is not indexed correctly.

Not sure if this helps, but here’s an EXPLAIN of the query on my DB:

sudo -u postgres psql discourse -c "EXPLAIN SELECT posts.id FROM posts LEFT JOIN post_search_data pd ON pd.locale='en' AND pd.version=2 AND pd.post_id=posts.id LEFT JOIN topics ON topics.id=posts.topic_id WHERE pd.post_id IS NULL AND topics.id IS NOT NULL AND topics.deleted_at IS NULL AND posts.raw!='' ORDER BY posts.id DESC LIMIT 20000;";
                                                          QUERY PLAN                                                           
-------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=2000.88..124179.27 rows=20000 width=4)
   ->  Nested Loop Anti Join  (cost=2000.88..29146177.32 rows=4770758 width=4)
         Join Filter: (pd.post_id = posts.id)
         ->  Gather Merge  (cost=1000.88..21351547.22 rows=4770860 width=4)
               Workers Planned: 2
               ->  Nested Loop  (cost=0.86..20799871.57 rows=1987858 width=4)
                     ->  Parallel Index Scan Backward using posts_pkey on posts  (cost=0.43..12078793.28 rows=1988522 width=8)
                           Filter: (raw <> ''::text)
                     ->  Index Only Scan using index_topics_on_id_and_deleted_at on topics  (cost=0.43..4.39 rows=1 width=4)
                           Index Cond: ((id = posts.topic_id) AND (id IS NOT NULL) AND (deleted_at IS NULL))
         ->  Materialize  (cost=1000.00..495214.55 rows=102 width=4)
               ->  Gather  (cost=1000.00..495214.04 rows=102 width=4)
                     Workers Planned: 2
                     ->  Parallel Seq Scan on post_search_data pd  (cost=0.00..494203.84 rows=42 width=4)
                           Filter: (((locale)::text = 'en'::text) AND (version = 2))

Topic		Replies	Views
Long-Running Sidekiq Jobs Feature	21	1563	December 24, 2020
PostgreSQL runaway IO Installation	37	2947	March 17, 2021
Slow Sidekiq + Postmaster using 95%+ CPU (32 cores) after Postgresql Version Upgrade Installation server-resources	23	2874	June 28, 2020
Slow Profile Loads with 100GB+ database Feature	43	2860	January 1, 2021
Slow SQL query causes homepage to load in 2-4 sec Support	16	1627	July 1, 2018

Some queries that don't finish

Related topics