Slow sql queries

slow-sql

(Piotr Szal) #1

Continuing the discussion from Migrating a 10m PI forum - hardware requirements & performance:

In a test environment of 1mil topics, 5 mil posts, 250 categories and 500k users.

Next slow queries are:

Homepage when logged in.

Executing action: latest
T+4429.7 ms
Reader
17619.4 ms
lib/freedom_patches/active_record_base.rb:7:in `exec_sql'
lib/sql_builder.rb:67:in `exec'
lib/sql_builder.rb:81:in `map_exec'
app/models/topic_tracking_state.rb:151:in `report'
app/controllers/application_controller.rb:241:in `preload_current_user_data'
app/controllers/application_controller.rb:155:in `preload_json'
lib/middleware/anonymous_cache.rb:119:in `call'

SELECT u.id AS user_id,
       topics.id AS topic_id,
       topics.created_at,
       highest_post_number,
       last_read_post_number,
       c.name AS category_name,
       tu.notification_level
FROM users u
INNER JOIN user_stats AS us ON us.user_id = u.id
FULL OUTER JOIN topics ON 1=1
LEFT JOIN topic_users tu ON tu.topic_id = topics.id AND tu.user_id = u.id
LEFT JOIN categories c ON c.id = topics.category_id
WHERE u.id IN (-2) AND
      topics.archetype IN ('regular') AND
      (("topics"."deleted_at" IS NULL AND tu.last_read_post_number < topics.highest_post_number AND COALESCE(tu.notification_level, 1) >= 2) OR ("topics"."deleted_at" IS NULL AND topics.created_at >= GREATEST(CASE
              WHEN COALESCE(u.new_topic_duration_minutes, 2880) = -1 THEN u.created_at
              WHEN COALESCE(u.new_topic_duration_minutes, 2880) = -2 THEN COALESCE(u.previous_visit_at,u.created_at)
              ELSE ('2014-06-16 13:52:48.210894'::timestamp - INTERVAL '1 MINUTE' * COALESCE(u.new_topic_duration_minutes, 2880))
           END, us.new_since) AND tu.last_read_post_number IS NULL AND COALESCE(tu.notification_level, 2) >= 2)) AND
      (topics.visible OR u.admin OR u.moderator) AND
      topics.deleted_at IS NULL AND
      ( category_id IS NULL OR NOT c.read_restricted OR category_id IN (
          SELECT c2.id FROM categories c2
          JOIN category_groups cg ON cg.category_id = c2.id
          JOIN group_users gu ON gu.user_id = u.id AND cg.group_id = gu.group_id
          WHERE c2.read_restricted )
      )
      AND NOT EXISTS( SELECT 1 FROM category_users cu
                      WHERE cu.user_id = u.id AND
                           cu.category_id = topics.category_id AND
                           cu.notification_level = 0)
ORDER BY topics.bumped_at DESC LIMIT 500

Homepage when logged in.
4x of this. (monthly, yearly, weekly, daily)

Executing action: latest
T+2794.2 ms
Reader
666.8 ms
app/models/site_setting.rb:91:in `block in has_enough_topics_to_redirect_to_top'
app/models/site_setting.rb:90:in `each'
app/models/site_setting.rb:90:in `has_enough_topics_to_redirect_to_top'
app/models/user.rb:597:in `redirected_to_top_reason'
app/models/user.rb:588:in `should_be_redirected_to_top'
app/controllers/application_controller.rb:240:in `preload_current_user_data'
app/controllers/application_controller.rb:155:in `preload_json'
lib/middleware/anonymous_cache.rb:119:in `call'

SELECT COUNT(*) FROM "top_topics"  WHERE (monthly_score > 0)

Reporting new Slow Queries / Performance
(Sam Saffron) #2

hmm is this running multiple times? regardless it should be changed to do a LIMIT query and not zoom through the whole table.

SELECT * FROM "top_topics" WHERE (monthly_score > 0) LIMIT 1

Is a way better pattern.

cc @zogstrip


(Régis Hanol) #3

I think you missed the COUNT(*) part :wink:


(Catrin Schröder Jaross) #4

I am working with @Piioo and we found another very slow query on the topics-page.
It is used to find the suggested topics:

Executing action: show
T+9907.7 ms
Reader
3747.3 ms
lib/suggested_topics_builder.rb:29:in `add_results'
lib/topic_query.rb:57:in `list_suggested_for'
lib/topic_view.rb:229:in `suggested_topics'
app/serializers/topic_view_serializer.rb:80:in `details'
app/controllers/topics_controller.rb:387:in `block (2 levels) in perform_show_response'
app/controllers/topics_controller.rb:384:in `perform_show_response'
app/controllers/topics_controller.rb:61:in `show'
lib/middleware/anonymous_cache.rb:119:in `call'
SELECT  "topics".* FROM "topics" LEFT OUTER JOIN topic_users AS tu ON (topics.id = tu.topic_id AND tu.user_id = -2) WHERE ("topics"."deleted_at" IS NULL) AND (topics.archetype IN ('regular')) AND "topics"."closed" = 'f' AND "topics"."archived" = 'f' AND (topics.id NOT IN (7173120,7175970,7175972,7175971)) AND (topics.id NOT IN (7173120)) AND "topics"."visible" = 't'  ORDER BY CASE WHEN topics.category_id = 43 THEN 0 ELSE 1 END, RANDOM() LIMIT 5

Any optimization ideas?


(Sam Saffron) #5

You don’t need to count all the rows in a table to figure out there are more than 30 rows…


(Piotr Szal) #6

With 1 mil topics and pinnend globally topics the TopicList (Startpage) needs 12 sec to load on eight core and 8GB memory system.

Without the ordering of pinned topics (default_ordering) it needs only 2 sec.

On my dev with 23k Topics:

EXPLAIN SELECT  "topics".*, "categories".*, "topic_only_relative_urls_categories".* FROM "topics" LEFT OUTER JOIN "categories" ON "categories"."id" = "topics"."category_id" LEFT OUTER JOIN "topics" "topic_only_relative_urls_categories" ON "topic_only_relative_urls_categories"."id" = "categories"."topic_id" AND ("topics"."deleted_at" IS NULL) WHERE ("topics"."deleted_at" IS NULL) AND (topics.archetype IN ('regular','question')) AND (COALESCE(categories.topic_id, 0) <> topics.id) AND "topics"."visible" = 't' AND (topics.category_id IS NULL or topics.category_id IN (238,189,88,254))  ORDER BY topics.pinned_globally, topics.bumped_at DESC LIMIT 30;
                                                                                                  QUERY PLAN

 Limit  (cost=19470.24..19470.32 rows=30 width=2703)
   ->  Sort  (cost=19470.24..19528.89 rows=23458 width=2703)
     Sort Key: topics.pinned_globally, topics.bumped_at
     ->  Hash Left Join  (cost=3824.30..18777.42 rows=23458 width=2703)
           Hash Cond: (categories.topic_id = topic_only_relative_urls_categories.id)
           Join Filter: (topics.deleted_at IS NULL)
           ->  Hash Left Join  (cost=23.13..1543.10 rows=23458 width=1845)
                 Hash Cond: (topics.category_id = categories.id)
                 Filter: (COALESCE(categories.topic_id, 0) <> topics.id)
                 ->  Seq Scan on topics  (cost=0.00..1138.80 rows=23459 width=858)
                       Filter: ((deleted_at IS NULL) AND visible AND ((archetype)::text = ANY ('{regular,question}'::text[])) AND ((category_id IS NULL) OR (category_id = ANY ('{238,189,88,254}'::integer[]))))
                 ->  Hash  (cost=23.06..23.06 rows=6 width=987)
                       ->  Seq Scan on categories  (cost=0.00..23.06 rows=6 width=987)
           ->  Hash  (cost=962.74..962.74 rows=23474 width=858)
                 ->  Seq Scan on topics topic_only_relative_urls_categories  (cost=0.00..962.74 rows=23474 width=858)

(15 rows)


(Sam Saffron) #7

Yeah that pinned stuff is a huge PITA I think at high scale we need to rethink our strategy here.


(Régis Hanol) #8

This performance issue is now fixed :goat:

https://github.com/discourse/discourse/commit/43496958feea02cae6f04cadf065854d34f66176


(Catrin Schröder Jaross) #9

ok, I optimized it myself :smiley:

I limited the suggested topics to the category via SiteSettings and now it is much better (… but not perfect though ;))

Executing action: show
T+11767.0 ms
Reader
73.1 ms
lib/suggested_topics_builder.rb:29:in `add_results'
lib/topic_query.rb:57:in `list_suggested_for'
lib/topic_view.rb:229:in `suggested_topics'
app/serializers/topic_view_serializer.rb:80:in `details'
app/controllers/topics_controller.rb:387:in `block (2 levels) in perform_show_response'
app/controllers/topics_controller.rb:384:in `perform_show_response'
app/controllers/topics_controller.rb:61:in `show'
lib/middleware/anonymous_cache.rb:119:in `call'
SELECT  "topics".* FROM "topics" LEFT OUTER JOIN topic_users AS tu ON (topics.id = tu.topic_id AND tu.user_id = -2) WHERE ("topics"."deleted_at" IS NULL) AND (topics.archetype IN ('regular')) AND "topics"."closed" = 'f' AND "topics"."archived" = 'f' AND (topics.id NOT IN (7173069,7175970,7175972,7175971)) AND (topics.id NOT IN (7173069)) AND "topics"."visible" = 't' AND "topics"."category_id" = 11  ORDER BY CASE WHEN topics.category_id = 11 THEN 0 ELSE 1 END, RANDOM() LIMIT 5   

(Sam Saffron) #10

order by random over the entire result set is a bad strategy. we should change this to maybe just select N random rows or something


Improve performance of Topic load - ORDER BY RANDOM
#11

We can further reduce the time needed for the topic detail view by adding index on the post_id in table topic_links. The @link_counts calculation in topic_view.rb resolves to the following query builder.where(‘l.post_id IN (:post_ids)’, post_ids: posts.map(&:id)) in topic_link.rb which without index adds always 200ms on a larger dataset im topic_links.


(Piotr Szal) #12

search needs 10min to execute.

Here one query.

Executing action: query
T+751.2 ms
Reader
1601079.4 ms
lib/search.rb:216:in `topic_search'
lib/search.rb:85:in `find_grouped_results'
lib/search.rb:71:in `execute'
app/controllers/search_controller.rb:38:in `query'
lib/middleware/anonymous_cache.rb:119:in `call'
SELECT  "posts"."id" AS t0_r0, "posts"."user_id" AS t0_r1, "posts"."topic_id" AS t0_r2, "posts"."post_number" AS t0_r3, "posts"."raw" AS t0_r4, "posts"."cooked" AS t0_r5, "posts"."created_at" AS t0_r6, "posts"."updated_at" AS t0_r7, "posts"."reply_to_post_number" AS t0_r8, "posts"."reply_count" AS t0_r9, "posts"."quote_count" AS t0_r10, "posts"."deleted_at" AS t0_r11, "posts"."off_topic_count" AS t0_r12, "posts"."like_count" AS t0_r13, "posts"."incoming_link_count" AS t0_r14, "posts"."bookmark_count" AS t0_r15, "posts"."avg_time" AS t0_r16, "posts"."score" AS t0_r17, "posts"."reads" AS t0_r18, "posts"."post_type" AS t0_r19, "posts"."vote_count" AS t0_r20, "posts"."sort_order" AS t0_r21, "posts"."last_editor_id" AS t0_r22, "posts"."hidden" AS t0_r23, "posts"."hidden_reason_id" AS t0_r24, "posts"."notify_moderators_count" AS t0_r25, "posts"."spam_count" AS t0_r26, "posts"."illegal_count" AS t0_r27, "posts"."inappropriate_count" AS t0_r28, "posts"."last_version_at" AS t0_r29, "posts"."user_deleted" AS t0_r30, "posts"."reply_to_user_id" AS t0_r31, "posts"."percent_rank" AS t0_r32, "posts"."notify_user_count" AS t0_r33, "posts"."like_score" AS t0_r34, "posts"."deleted_by_id" AS t0_r35, "posts"."edit_reason" AS t0_r36, "posts"."word_count" AS t0_r37, "posts"."version" AS t0_r38, "posts"."cook_method" AS t0_r39, "posts"."wiki" AS t0_r40, "posts"."baked_at" AS t0_r41, "posts"."baked_version" AS t0_r42, "post_search_data"."post_id" AS t1_r0, "post_search_data"."search_data" AS t1_r1, "topics"."id" AS t2_r0, "topics"."title" AS t2_r1, "topics"."last_posted_at" AS t2_r2, "topics"."created_at" AS t2_r3, "topics"."updated_at" AS t2_r4, "topics"."views" AS t2_r5, "topics"."posts_count" AS t2_r6, "topics"."user_id" AS t2_r7, "topics"."last_post_user_id" AS t2_r8, "topics"."reply_count" AS t2_r9, "topics"."featured_user1_id" AS t2_r10, "topics"."featured_user2_id" AS t2_r11, "topics"."featured_user3_id" AS t2_r12, "topics"."avg_time" AS t2_r13, "topics"."deleted_at" AS t2_r14, "topics"."highest_post_number" AS t2_r15, "topics"."image_url" AS t2_r16, "topics"."off_topic_count" AS t2_r17, "topics"."like_count" AS t2_r18, "topics"."incoming_link_count" AS t2_r19, "topics"."bookmark_count" AS t2_r20, "topics"."star_count" AS t2_r21, "topics"."category_id" AS t2_r22, "topics"."visible" AS t2_r23, "topics"."moderator_posts_count" AS t2_r24, "topics"."closed" AS t2_r25, "topics"."archived" AS t2_r26, "topics"."bumped_at" AS t2_r27, "topics"."has_summary" AS t2_r28, "topics"."vote_count" AS t2_r29, "topics"."archetype" AS t2_r30, "topics"."featured_user4_id" AS t2_r31, "topics"."notify_moderators_count" AS t2_r32, "topics"."spam_count" AS t2_r33, "topics"."illegal_count" AS t2_r34, "topics"."inappropriate_count" AS t2_r35, "topics"."pinned_at" AS t2_r36, "topics"."score" AS t2_r37, "topics"."percent_rank" AS t2_r38, "topics"."notify_user_count" AS t2_r39, "topics"."subtype" AS t2_r40, "topics"."slug" AS t2_r41, "topics"."auto_close_at" AS t2_r42, "topics"."auto_close_user_id" AS t2_r43, "topics"."auto_close_started_at" AS t2_r44, "topics"."deleted_by_id" AS t2_r45, "topics"."participant_count" AS t2_r46, "topics"."word_count" AS t2_r47, "topics"."excerpt" AS t2_r48, "topics"."pinned_globally" AS t2_r49, "categories"."id" AS t3_r0, "categories"."name" AS t3_r1, "categories"."color" AS t3_r2, "categories"."topic_id" AS t3_r3, "categories"."topic_count" AS t3_r4, "categories"."created_at" AS t3_r5, "categories"."updated_at" AS t3_r6, "categories"."user_id" AS t3_r7, "categories"."topics_year" AS t3_r8, "categories"."topics_month" AS t3_r9, "categories"."topics_week" AS t3_r10, "categories"."slug" AS t3_r11, "categories"."description" AS t3_r12, "categories"."text_color" AS t3_r13, "categories"."read_restricted" AS t3_r14, "categories"."auto_close_hours" AS t3_r15, "categories"."post_count" AS t3_r16, "categories"."latest_post_id" AS t3_r17, "categories"."latest_topic_id" AS t3_r18, "categories"."position" AS t3_r19, "categories"."parent_category_id" AS t3_r20, "categories"."posts_year" AS t3_r21, "categories"."posts_month" AS t3_r22, "categories"."posts_week" AS t3_r23, "categories"."email_in" AS t3_r24, "categories"."email_in_allow_strangers" AS t3_r25, "categories"."topics_day" AS t3_r26, "categories"."posts_day" AS t3_r27 FROM "posts" LEFT OUTER JOIN "post_search_data" ON "post_search_data"."post_id" = "posts"."id" LEFT OUTER JOIN "topics" ON "topics"."id" = "posts"."topic_id" AND ("topics"."deleted_at" IS NULL) LEFT OUTER JOIN "categories" ON "categories"."id" = "topics"."category_id" WHERE ("posts"."deleted_at" IS NULL) AND (post_search_data.search_data @@ TO_TSQUERY('german', 'liebe:*')) AND "topics"."deleted_at" IS NULL AND (topics.visible) AND (topics.archetype <> 'private_message') AND ((categories.id IS NULL) OR (NOT categories.read_restricted) OR (categories.id IN (270))) AND (posts.post_number = 1 OR posts.topic_id = 7173154)  ORDER BY CASE WHEN topics.id = 7173154 THEN 0 ELSE 1 END,
                           CASE WHEN topics.id = 7173154 THEN posts.post_number ELSE 999999 END, TS_RANK_CD(TO_TSVECTOR('german', topics.title), TO_TSQUERY('german', 'liebe:*')) DESC, TS_RANK_CD(post_search_data.search_data, TO_TSQUERY('german', 'liebe:*')) DESC, topics.bumped_at DESC LIMIT 6

(Neil Lalonde) #13

I added this today, thanks for catching that.

We’re doing way too many joins and filters outside of Postgres’ full text search, which negates the performance advantages of using a full text search engine. @sam suggests that we’ll need to stuff more info into the search index instead of using joins.


(Piotr Szal) #14

this is also a problem because this report is almost on every page and in the pull over message bus.
It needs 6 - 15 sec.


#15

Hi @neil
it is good to also add an index on user_id in category_users table. It is great that the N+1 problem was fixed by sam


but this one query will also run slow if you have 250 categories and 500k users that use the subscribe feature.


#16

Hi @zogstrip it will be good to also add seperate indexes on daily_score, monthly_score, weekly_score and yearly_score because even with your great fix it still takes per query 250ms on a larger dataset with 1mil entries in table top_topics.


(Dean Taylor) #17

I have noted a super slow query being executed, I usually end up with 4 of these running at the same time.

The query as reported by select * from pg_stat_activity WHERE state='active':

SELECT  "posts".* FROM "posts" INNER JOIN "topics" ON "topics"."id" = "posts"."topic_id" AND ("topics"."deleted_at" IS NULL) WHERE ("posts"."deleted_at" IS NULL) AND (posts.id NOT IN (
               SELECT post_id from post_search_data
                WHERE locale = 'en'
              )) LIMIT 10000;

The explain:

postgres=# \c discourse
You are now connected to database "discourse" as user "postgres".
<eleted_at" IS NULL) WHERE ("posts"."deleted_at" IS NULL) AND (posts.id NOT IN (SELECT post_id from post_search_data WHERE locale = 'en')) LIMIT 10000;
                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.29..369906702.04 rows=10000 width=790)
   ->  Nested Loop  (cost=0.29..5964560612.70 rows=161245 width=790)
         ->  Seq Scan on posts  (cost=0.00..5964024013.90 rows=161249 width=790)
               Filter: ((deleted_at IS NULL) AND (NOT (SubPlan 1)))
               SubPlan 1
                 ->  Materialize  (cost=0.00..36186.44 rows=319903 width=4)
                       ->  Seq Scan on post_search_data  (cost=0.00..33336.93 rows=319903 width=4)
                             Filter: ((locale)::text = 'en'::text)
         ->  Index Only Scan using index_topics_on_id_and_deleted_at on topics  (cost=0.29..3.32 rows=1 width=4)
               Index Cond: ((id = posts.topic_id) AND (deleted_at IS NULL))
(10 rows)

discourse=#

(Sam Saffron) #18

That’s new will fix it next week first thing

At least push it to run a lot less


(Dean Taylor) #19

Hey @sam - FYI the slow query is killing my installation.

The queries never have a chance to complete and runs at 100% CPU all the time.

Any chance you can just remove the query for the moment?

Cheers.


(Sam Saffron) #20

Ill get it fixed, would be handy to have a backup of your db for local testing.