Slow sql queries

slow-sql

(Dean Taylor) #32

@sam - again thanks for your help on getting that one solved.

Purely as a point of reference my slowest query on a normal front page load whilst logged in as myself is now the following:

Is this what you expect?

Executing action: latest
T+398.3 ms
Reader
472.7 ms

app/models/topic_list.rb:24:in `topics'
app/serializers/topic_list_serializer.rb:16:in `include_more_topics_url?'
app/controllers/list_controller.rb:209:in `block (2 levels) in respond'
app/controllers/list_controller.rb:206:in `respond'
app/controllers/list_controller.rb:67:in `block (2 levels) in <class:ListController>'
lib/middleware/anonymous_cache.rb:119:in `call'
lib/middleware/unicorn_oobgc.rb:95:in `process_client'
SELECT  "topics"."id" AS t0_r0, "topics"."title" AS t0_r1, "topics"."last_posted_at" AS t0_r2, ...

More detail:


(Dean Taylor) #33

@sam I have a few more slow queries to highlight.

I have a few more - how would you like me to post these - just in this thread?

Here is the first one.

Executing action: show
T+388.8 ms
Reader
2303.2 ms

Query:

lib/freedom_patches/active_record_base.rb:7:in `exec_sql'
lib/sql_builder.rb:67:in `exec'
lib/sql_builder.rb:85:in `map_exec'
app/models/topic_tracking_state.rb:169:in `report'
app/controllers/application_controller.rb:253:in `preload_current_user_data'
app/controllers/application_controller.rb:154:in `preload_json'
lib/middleware/anonymous_cache.rb:119:in `call'
lib/middleware/unicorn_oobgc.rb:95:in `process_client'
    SELECT u.id AS user_id,
           topics.id AS topic_id,
           topics.created_at,
           highest_post_number,
           last_read_post_number,
           c.id AS category_id,
           tu.notification_level
    FROM users u
    INNER JOIN user_stats AS us ON us.user_id = u.id
    FULL OUTER JOIN topics ON 1=1
    LEFT JOIN topic_users tu ON tu.topic_id = topics.id AND tu.user_id = u.id
    LEFT JOIN categories c ON c.id = topics.category_id
    WHERE u.id IN (1) AND
          topics.archetype <> 'private_message' AND
          (("topics"."deleted_at" IS NULL AND tu.last_read_post_number < topics.highest_post_number AND COALESCE(tu.notification_level, 1) >= 2) OR ("topics"."deleted_at" IS NULL AND topics.created_at >= GREATEST(CASE
                  WHEN COALESCE(u.new_topic_duration_minutes, 2880) = -1 THEN u.created_at
                  WHEN COALESCE(u.new_topic_duration_minutes, 2880) = -2 THEN COALESCE(u.previous_visit_at,u.created_at)
                  ELSE ('2014-09-13 19:19:30.767377'::timestamp - INTERVAL '1 MINUTE' * COALESCE(u.new_topic_duration_minutes, 2880))
               END, us.new_since) AND tu.last_read_post_number IS NULL AND COALESCE(tu.notification_level, 2) >= 2)) AND
          (topics.visible OR u.admin OR u.moderator) AND
          topics.deleted_at IS NULL AND
          ( category_id IS NULL OR NOT c.read_restricted OR category_id IN (
              SELECT c2.id FROM categories c2
              JOIN category_groups cg ON cg.category_id = c2.id
              JOIN group_users gu ON gu.user_id = u.id AND cg.group_id = gu.group_id
              WHERE c2.read_restricted )
          )
          AND NOT EXISTS( SELECT 1 FROM category_users cu
                          WHERE cu.user_id = u.id AND
                               cu.category_id = topics.category_id AND
                               cu.notification_level = 0)

 ORDER BY topics.bumped_at DESC LIMIT 500

(Dean Taylor) #34

Here is the Explain:

 Limit  (cost=1.13..157529.96 rows=500 width=40)
   ->  Nested Loop Left Join  (cost=1.13..762755.74 rows=2421 width=40)
         Filter: ((topics.category_id IS NULL) OR (NOT c.read_restricted) OR (SubPlan 1))
         ->  Nested Loop  (cost=0.98..82167.39 rows=3195 width=40)
               Join Filter: (((tu.last_read_post_number < topics.highest_post_number) AND (COALESCE(tu.notification_level, 1) >= 2)) OR ((topics.created_at >= GREATEST(CASE WHEN (COALESCE(u.new_topic_duration_minutes, 2880) = (-1)) THEN u.created_at WHEN (COALESCE(u.new_topic_duration_minutes, 2880) = (-2)) THEN COALESCE(u.previous_visit_at, u.created_at) ELSE ('2014-09-13 19:19:30.767377'::timestamp without time zone - ('00:01:00'::interval * (COALESCE(u.new_topic_duration_minutes, 2880))::double precision)) END, us.new_since)) AND (tu.last_read_post_number IS NULL) AND (COALESCE(tu.notification_level, 2) >= 2)))
               ->  Nested Loop Left Join  (cost=0.70..81193.33 rows=27593 width=60)
                     Join Filter: ((tu.user_id = u.id) AND (tu.topic_id = topics.id))
                     ->  Nested Loop Anti Join  (cost=0.70..47983.51 rows=27593 width=52)
                           Join Filter: ((cu.user_id = u.id) AND (cu.category_id = topics.category_id))
                           ->  Nested Loop  (cost=0.70..47253.80 rows=39609 width=52)
                                 Join Filter: (topics.visible OR u.admin OR u.moderator)
                                 ->  Index Scan using index_topics_on_bumped_at on topics  (cost=0.41..46750.38 rows=39609 width=29)
                                       Filter: ((deleted_at IS NULL) AND (deleted_at IS NULL) AND ((archetype)::text <> 'private_message'::text))
                                 ->  Materialize  (cost=0.29..8.31 rows=1 width=26)
                                       ->  Index Scan using users_pkey on users u  (cost=0.29..8.30 rows=1 width=26)
                                             Index Cond: (id = 1)
                           ->  Materialize  (cost=0.00..36.55 rows=1 width=8)
                                 ->  Seq Scan on category_users cu  (cost=0.00..36.55 rows=1 width=8)
                                       Filter: ((user_id = 1) AND (notification_level = 0))
                     ->  Materialize  (cost=0.00..7617.44 rows=53 width=16)
                           ->  Seq Scan on topic_users tu  (cost=0.00..7617.18 rows=53 width=16)
                                 Filter: (user_id = 1)
               ->  Materialize  (cost=0.29..8.31 rows=1 width=12)
                     ->  Index Scan using user_stats_pkey on user_stats us  (cost=0.29..8.30 rows=1 width=12)
                           Index Cond: (user_id = 1)
         ->  Index Scan using categories_pkey on categories c  (cost=0.14..0.16 rows=1 width=5)
               Index Cond: (id = topics.category_id)
         SubPlan 1
           ->  Nested Loop  (cost=0.14..425.54 rows=1 width=4)
                 ->  Nested Loop  (cost=0.00..425.10 rows=1 width=4)
                       Join Filter: (cg.group_id = gu.group_id)
                       ->  Seq Scan on category_groups cg  (cost=0.00..9.97 rows=297 width=8)
                       ->  Materialize  (cost=0.00..406.22 rows=2 width=4)
                             ->  Seq Scan on group_users gu  (cost=0.00..406.21 rows=2 width=4)
                                   Filter: (user_id = u.id)
                 ->  Index Scan using categories_pkey on categories c2  (cost=0.14..0.43 rows=1 width=4)
                       Index Cond: (id = cg.category_id)
                       Filter: read_restricted
(38 rows)

(Sam Saffron) #35

I changed this query last week, be sure to try this on latest.


(Dean Taylor) #36

Just updated to latest - will find the new slow queries - there are still a few.


(Dean Taylor) #37

Here are 2 of 5 of the heavy hitting queries (if you would like the others just ask):

1

Executing action: latest
T+1006.0 ms
Reader
498.0 ms

app/models/topic_list.rb:25:in `topics'
app/serializers/topic_list_serializer.rb:21:in `include_more_topics_url?'
app/controllers/list_controller.rb:189:in `block (2 levels) in respond'
app/controllers/list_controller.rb:186:in `respond'
app/controllers/list_controller.rb:67:in `block (2 levels) in <class:ListController>'
lib/middleware/anonymous_cache.rb:119:in `call'
lib/middleware/unicorn_oobgc.rb:95:in `process_client'
SELECT  "topics"."id" AS t0_r0, "topics"."title" AS t0_r1, "topics"."last_posted_at" AS t0_r2, "topics"."created_at" AS t0_r3, "topics"."updated_at" AS t0_r4, "topics"."views" AS t0_r5, "topics"."posts_count" AS t0_r6, "topics"."user_id" AS t0_r7, "topics"."last_post_user_id" AS t0_r8, "topics"."reply_count" AS t0_r9, "topics"."featured_user1_id" AS t0_r10, "topics"."featured_user2_id" AS t0_r11, "topics"."featured_user3_id" AS t0_r12, "topics"."avg_time" AS t0_r13, "topics"."deleted_at" AS t0_r14, "topics"."highest_post_number" AS t0_r15, "topics"."image_url" AS t0_r16, "topics"."off_topic_count" AS t0_r17, "topics"."like_count" AS t0_r18, "topics"."incoming_link_count" AS t0_r19, "topics"."bookmark_count" AS t0_r20, "topics"."star_count" AS t0_r21, "topics"."category_id" AS t0_r22, "topics"."visible" AS t0_r23, "topics"."moderator_posts_count" AS t0_r24, "topics"."closed" AS t0_r25, "topics"."archived" AS t0_r26, "topics"."bumped_at" AS t0_r27, "topics"."has_summary" AS t0_r28, "topics"."vote_count" AS t0_r29, "topics"."archetype" AS t0_r30, "topics"."featured_user4_id" AS t0_r31, "topics"."notify_moderators_count" AS t0_r32, "topics"."spam_count" AS t0_r33, "topics"."illegal_count" AS t0_r34, "topics"."inappropriate_count" AS t0_r35, "topics"."pinned_at" AS t0_r36, "topics"."score" AS t0_r37, "topics"."percent_rank" AS t0_r38, "topics"."notify_user_count" AS t0_r39, "topics"."subtype" AS t0_r40, "topics"."slug" AS t0_r41, "topics"."auto_close_at" AS t0_r42, "topics"."auto_close_user_id" AS t0_r43, "topics"."auto_close_started_at" AS t0_r44, "topics"."deleted_by_id" AS t0_r45, "topics"."participant_count" AS t0_r46, "topics"."word_count" AS t0_r47, "topics"."excerpt" AS t0_r48, "topics"."pinned_globally" AS t0_r49, "categories"."id" AS t1_r0, "categories"."name" AS t1_r1, "categories"."color" AS t1_r2, "categories"."topic_id" AS t1_r3, "categories"."topic_count" AS t1_r4, "categories"."created_at" AS t1_r5, "categories"."updated_at" AS t1_r6, "categories"."user_id" AS t1_r7, "categories"."topics_year" AS t1_r8, "categories"."topics_month" AS t1_r9, "categories"."topics_week" AS t1_r10, "categories"."slug" AS t1_r11, "categories"."description" AS t1_r12, "categories"."text_color" AS t1_r13, "categories"."read_restricted" AS t1_r14, "categories"."auto_close_hours" AS t1_r15, "categories"."post_count" AS t1_r16, "categories"."latest_post_id" AS t1_r17, "categories"."latest_topic_id" AS t1_r18, "categories"."position" AS t1_r19, "categories"."parent_category_id" AS t1_r20, "categories"."posts_year" AS t1_r21, "categories"."posts_month" AS t1_r22, "categories"."posts_week" AS t1_r23, "categories"."email_in" AS t1_r24, "categories"."email_in_allow_strangers" AS t1_r25, "categories"."topics_day" AS t1_r26, "categories"."posts_day" AS t1_r27, "categories"."logo_url" AS t1_r28, "categories"."background_url" AS t1_r29, "categories"."allow_badges" AS t1_r30, "categories"."name_lower" AS t1_r31 FROM "topics" LEFT OUTER JOIN "categories" ON "categories"."id" = "topics"."category_id" LEFT OUTER JOIN topic_users AS tu ON (topics.id = tu.topic_id AND tu.user_id = 1) WHERE ("topics"."deleted_at" IS NULL) AND (topics.archetype <> 'private_message') AND (COALESCE(categories.topic_id, 0) <> topics.id) AND (NOT EXISTS(
                         SELECT 1 FROM category_users cu
                         WHERE cu.user_id = 1 AND
                               cu.category_id = topics.category_id AND
                               cu.notification_level = 0 AND
                               cu.category_id <> -1
                         ))  ORDER BY CASE
        WHEN topics.pinned_globally and (COALESCE(topics.pinned_at, '2010-01-01') > COALESCE(tu.cleared_pinned_at, '2010-01-01'))
          THEN 'infinity'
        ELSE topics.bumped_at
       END DESC LIMIT 30   

Explain:

 Limit  (cost=14408.03..14408.10 rows=30 width=1293)
   ->  Sort  (cost=14408.03..14477.01 rows=27592 width=1293)
         Sort Key: (CASE WHEN (topics.pinned_globally AND (COALESCE(topics.pinned_at, '2010-01-01 00:00:00'::timestamp without time zone) > COALESCE(tu.cleared_pinned_at, '2010-01-01 00:00:00'::timestamp without time zone))) THEN 'infinity'::timestamp without time zone ELSE topics.bumped_at END)
         ->  Hash Left Join  (cost=261.77..13593.11 rows=27592 width=1293)
               Hash Cond: (topics.id = tu.topic_id)
               ->  Hash Left Join  (cost=60.88..11426.07 rows=27592 width=1285)
                     Hash Cond: (topics.category_id = categories.id)
                     Filter: (COALESCE(categories.topic_id, 0) <> topics.id)
                     ->  Hash Anti Join  (cost=40.99..11093.43 rows=27593 width=487)
                           Hash Cond: (topics.category_id = cu.category_id)
                           ->  Seq Scan on topics  (cost=0.00..10659.02 rows=39609 width=487)
                                 Filter: ((deleted_at IS NULL) AND ((archetype)::text <> 'private_message'::text))
                           ->  Hash  (cost=40.98..40.98 rows=1 width=4)
                                 ->  Seq Scan on category_users cu  (cost=0.00..40.98 rows=1 width=4)
                                       Filter: ((category_id <> (-1)) AND (user_id = 1) AND (notification_level = 0))
                     ->  Hash  (cost=18.84..18.84 rows=84 width=798)
                           ->  Seq Scan on categories  (cost=0.00..18.84 rows=84 width=798)
               ->  Hash  (cost=200.23..200.23 rows=53 width=12)
                     ->  Bitmap Heap Scan on topic_users tu  (cost=4.83..200.23 rows=53 width=12)
                           Recheck Cond: (user_id = 1)
                           ->  Bitmap Index Scan on index_topic_users_on_user_id_and_topic_id  (cost=0.00..4.82 rows=53 width=0)
                                 Index Cond: (user_id = 1)
(22 rows)

2

Executing action: show
T+924.0 ms
Reader
497.1 ms

lib/suggested_topics_builder.rb:29:in `add_results'
lib/topic_query.rb:60:in `list_suggested_for'
lib/topic_view.rb:244:in `suggested_topics'
app/serializers/topic_view_serializer.rb:83:in `details'
app/controllers/application_controller.rb:202:in `render_json_dump'
app/controllers/topics_controller.rb:433:in `block (2 levels) in perform_show_response'
app/controllers/topics_controller.rb:426:in `perform_show_response'
app/controllers/topics_controller.rb:63:in `show'
lib/middleware/anonymous_cache.rb:119:in `call'
lib/middleware/unicorn_oobgc.rb:95:in `process_client'
SELECT  "topics"."id" AS t0_r0, "topics"."title" AS t0_r1, "topics"."last_posted_at" AS t0_r2, "topics"."created_at" AS t0_r3, "topics"."updated_at" AS t0_r4, "topics"."views" AS t0_r5, "topics"."posts_count" AS t0_r6, "topics"."user_id" AS t0_r7, "topics"."last_post_user_id" AS t0_r8, "topics"."reply_count" AS t0_r9, "topics"."featured_user1_id" AS t0_r10, "topics"."featured_user2_id" AS t0_r11, "topics"."featured_user3_id" AS t0_r12, "topics"."avg_time" AS t0_r13, "topics"."deleted_at" AS t0_r14, "topics"."highest_post_number" AS t0_r15, "topics"."image_url" AS t0_r16, "topics"."off_topic_count" AS t0_r17, "topics"."like_count" AS t0_r18, "topics"."incoming_link_count" AS t0_r19, "topics"."bookmark_count" AS t0_r20, "topics"."star_count" AS t0_r21, "topics"."category_id" AS t0_r22, "topics"."visible" AS t0_r23, "topics"."moderator_posts_count" AS t0_r24, "topics"."closed" AS t0_r25, "topics"."archived" AS t0_r26, "topics"."bumped_at" AS t0_r27, "topics"."has_summary" AS t0_r28, "topics"."vote_count" AS t0_r29, "topics"."archetype" AS t0_r30, "topics"."featured_user4_id" AS t0_r31, "topics"."notify_moderators_count" AS t0_r32, "topics"."spam_count" AS t0_r33, "topics"."illegal_count" AS t0_r34, "topics"."inappropriate_count" AS t0_r35, "topics"."pinned_at" AS t0_r36, "topics"."score" AS t0_r37, "topics"."percent_rank" AS t0_r38, "topics"."notify_user_count" AS t0_r39, "topics"."subtype" AS t0_r40, "topics"."slug" AS t0_r41, "topics"."auto_close_at" AS t0_r42, "topics"."auto_close_user_id" AS t0_r43, "topics"."auto_close_started_at" AS t0_r44, "topics"."deleted_by_id" AS t0_r45, "topics"."participant_count" AS t0_r46, "topics"."word_count" AS t0_r47, "topics"."excerpt" AS t0_r48, "topics"."pinned_globally" AS t0_r49, "categories"."id" AS t1_r0, "categories"."name" AS t1_r1, "categories"."color" AS t1_r2, "categories"."topic_id" AS t1_r3, "categories"."topic_count" AS t1_r4, "categories"."created_at" AS t1_r5, "categories"."updated_at" AS t1_r6, "categories"."user_id" AS t1_r7, "categories"."topics_year" AS t1_r8, "categories"."topics_month" AS t1_r9, "categories"."topics_week" AS t1_r10, "categories"."slug" AS t1_r11, "categories"."description" AS t1_r12, "categories"."text_color" AS t1_r13, "categories"."read_restricted" AS t1_r14, "categories"."auto_close_hours" AS t1_r15, "categories"."post_count" AS t1_r16, "categories"."latest_post_id" AS t1_r17, "categories"."latest_topic_id" AS t1_r18, "categories"."position" AS t1_r19, "categories"."parent_category_id" AS t1_r20, "categories"."posts_year" AS t1_r21, "categories"."posts_month" AS t1_r22, "categories"."posts_week" AS t1_r23, "categories"."email_in" AS t1_r24, "categories"."email_in_allow_strangers" AS t1_r25, "categories"."topics_day" AS t1_r26, "categories"."posts_day" AS t1_r27, "categories"."logo_url" AS t1_r28, "categories"."background_url" AS t1_r29, "categories"."allow_badges" AS t1_r30, "categories"."name_lower" AS t1_r31 FROM "topics" LEFT OUTER JOIN "categories" ON "categories"."id" = "topics"."category_id" LEFT OUTER JOIN topic_users AS tu ON (topics.id = tu.topic_id AND tu.user_id = 1) WHERE ("topics"."deleted_at" IS NULL) AND (topics.archetype <> 'private_message') AND "topics"."closed" = 'f' AND "topics"."archived" = 'f' AND (topics.id NOT IN (65335,39,94,2,44,36,101,80,3,74,95,110,35,45,1,78,51,40,41,38,85,37,86,43,76,98,100,69,79,70,54,49,88,56,48,102,103,114,108,111,113,112,83,107,109,105,104,106,93,82,77,90,96,97,60,66,53,91,47,42,57,59,46,92,75,68,99,63,58,50,65,64,81,52,73,84,87,89,61,67,72,62,71,55)) AND (NOT EXISTS(
                         SELECT 1 FROM category_users cu
                         WHERE cu.user_id = 1 AND
                               cu.category_id = topics.category_id AND
                               cu.notification_level = 0 AND
                               cu.category_id <> -1

                         )) AND (topics.id NOT IN (65335)) AND "topics"."visible" = 't'  ORDER BY CASE WHEN topics.category_id = 42 THEN 0 ELSE 1 END, RANDOM() LIMIT 5 

Explain:

 Limit  (cost=18996.66..18996.67 rows=5 width=1285)
   ->  Sort  (cost=18996.66..19065.55 rows=27557 width=1285)
         Sort Key: (CASE WHEN (topics.category_id = 42) THEN 0 ELSE 1 END), (random())
         ->  Hash Left Join  (cost=60.88..18538.95 rows=27557 width=1285)
               Hash Cond: (topics.category_id = categories.id)
               ->  Hash Anti Join  (cost=40.99..18110.73 rows=27557 width=487)
                     Hash Cond: (topics.category_id = cu.category_id)
                     ->  Seq Scan on topics  (cost=0.00..17676.84 rows=39557 width=487)
                           Filter: ((deleted_at IS NULL) AND (NOT closed) AND (NOT archived) AND visible AND ((archetype)::text <> 'private_message'::text) AND (id <> 65335) AND (id <> ALL ('{65335,39,94,2,44,36,101,80,3,74,95,110,35,45,1,78,51,40,41,38,85,37,86,43,76,98,100,69,79,70,54,49,88,56,48,102,103,114,108,111,113,112,83,107,109,105,104,106,93,82,77,90,96,97,60,66,53,91,47,42,57,59,46,92,75,68,99,63,58,50,65,64,81,52,73,84,87,89,61,67,72,62,71,55}'::integer[])))
                     ->  Hash  (cost=40.98..40.98 rows=1 width=4)
                           ->  Seq Scan on category_users cu  (cost=0.00..40.98 rows=1 width=4)
                                 Filter: ((category_id <> (-1)) AND (user_id = 1) AND (notification_level = 0))
               ->  Hash  (cost=18.84..18.84 rows=84 width=798)
                     ->  Seq Scan on categories  (cost=0.00..18.84 rows=84 width=798)
(14 rows)

Cheers!


Reporting new Slow Queries / Performance
(Kane York) #38

Question: did that query get slower, or are you reporting it just because it changed?

I recently removed part of that query (a second select on the topics table), so I would hope that it was improved.


(Dean Taylor) #39

I am reporting it because it is slow and is impacting the overall user experience in terms of performance.


(Kane York) #40

I knew it was bad, that was why I was trying to improve it…

I was asking to see if I introduced a regression.


(Dean Taylor) #41

Well - just went live with a site…

didn’t notice the performance issues during small group testing, I can’t tell if it is a regression as both amount of data (imported posts, messages, users) and actual users active on site has changed.

Hope this makes it clear.


(Dean Taylor) #42

Here is another I’ll call it Number 3:

Executing action: show
T+141.3 ms
Reader
2566.8 ms

app/serializers/user_serializer.rb:180:in `number_of_deleted_posts'
app/controllers/application_controller.rb:202:in `render_json_dump'
app/controllers/users_controller.rb:35:in `block (2 levels) in show'
app/controllers/users_controller.rb:29:in `show'
lib/middleware/anonymous_cache.rb:119:in `call'
lib/middleware/unicorn_oobgc.rb:95:in `process_client'
SELECT COUNT(*) FROM "posts"  WHERE "posts"."user_id" = 1 AND "posts"."user_deleted" = 'f' AND ("posts"."deleted_by_id" != 1)   

And some timings and explain:

discourse=> \timing
Timing is on.
discourse=> SELECT COUNT(*) FROM posts WHERE user_id = 1 AND user_deleted = 'f' AND (deleted_by_id != 1);
 count
-------
     0
(1 row)

Time: 1866.460 ms
discourse=> explain SELECT COUNT(*) FROM posts WHERE user_id = 1 AND user_deleted = 'f' AND (deleted_by_id != 1);
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Aggregate  (cost=49246.53..49246.54 rows=1 width=0)
   ->  Seq Scan on posts  (cost=0.00..49246.30 rows=93 width=0)
         Filter: ((NOT user_deleted) AND (deleted_by_id <> 1) AND (user_id = 1))
(3 rows)

Time: 4.727 ms
discourse=>

(Benjamin Kampmann) #43

Hey @DeanMarkTaylor,

I’ve been working on exactly these three problem and I found the following solutions:

For the first one, it appears that we are missing some indexes. Doing the following migration dropped the time frame here to 0.5ms (also for other types of sorting):

class AddPerformanceIndexesToTopics < ActiveRecord::Migration
  def change
    add_index :topics, :created_at
    add_index :topics, :participant_count
    add_index :topics, :posts_count
    add_index :topics, :like_count
    add_index :topics, :views
  end
end


For suggested Topics (No 2), I found that the performance is fine if we just drop the entire idea of doing the randomizing in the database (which IMHO doesn’t make much sense in the first place). The following patch just pulls 5 times as many items and then randomizes on the ruby site. It drops the reader time down to 50ms:

(Note: the patch might be outdated)

diff --git a/lib/suggested_topics_builder.rb b/lib/suggested_topics_builder.rb
index 3051fdf..f945376 100644
--- a/lib/suggested_topics_builder.rb
+++ b/lib/suggested_topics_builder.rb
@@ -55,7 +55,7 @@ class SuggestedTopicsBuilder
   end

   def results
-    @results.first(SiteSetting.suggested_topics)
+    @results.first(SiteSetting.suggested_topics * 5).sample(SiteSetting.suggested_topics)
   end

   def results_left
diff --git a/lib/topic_query.rb b/lib/topic_query.rb
index 0da5b13..9c04d09 100644
--- a/lib/topic_query.rb
+++ b/lib/topic_query.rb
@@ -344,16 +344,17 @@ class TopicQuery
     end

     def random_suggested(topic, count, excluded_topic_ids=[])
-      result = default_results(unordered: true, per_page: count).where(closed: false, archived: false)
+      result = default_results(unordered: true, per_page: count * 5).where(closed: false, archived: false)
       excluded_topic_ids += Category.pluck(:topic_id).compact
       result = result.where("topics.id NOT IN (?)", excluded_topic_ids) unless excluded_topic_ids.empty?


For the last one, if I am not mistaken, this is only happening for administrator accounts. Meaning only admins have to wait so long, not the usual or anonymous user. But still, just adding the right index and it runs smoothly on 10ms:

CREATE INDEX posts_user_id_deleted_by on posts(user_id, deleted_by_id) WHERE NOT posts.user_deleted;

Improve performance of Topic load - ORDER BY RANDOM
(Dean Taylor) #44

Hi @lightyear nice work…

Some same thoughts - glad you found the time to put into it.

I suppose the question is are you going to submit pull requests for these?
I couldn’t spot anything in the commit log or pull queue.

Here are a few more slow queries to report - they may be related to the same index / randomising issues:

Number 4

Executing action: show
T+695.6 ms
Reader
212.4 ms

lib/suggested_topics_builder.rb:29:in `add_results'
lib/topic_query.rb:58:in `list_suggested_for'
lib/topic_view.rb:244:in `suggested_topics'
app/serializers/topic_view_serializer.rb:83:in `details'
app/controllers/application_controller.rb:202:in `render_json_dump'
app/controllers/topics_controller.rb:433:in `block (2 levels) in perform_show_response'
app/controllers/topics_controller.rb:426:in `perform_show_response'
app/controllers/topics_controller.rb:63:in `show'
lib/middleware/anonymous_cache.rb:119:in `call'
lib/middleware/unicorn_oobgc.rb:95:in `process_client'
SELECT  "topics"."id" AS t0_r0, "topics"."title" AS t0_r1, "topics"."last_posted_at" AS t0_r2, "topics"."created_at" AS t0_r3, "topics"."updated_at" AS t0_r4, "topics"."views" AS t0_r5, "topics"."posts_count" AS t0_r6, "topics"."user_id" AS t0_r7, "topics"."last_post_user_id" AS t0_r8, "topics"."reply_count" AS t0_r9, "topics"."featured_user1_id" AS t0_r10, "topics"."featured_user2_id" AS t0_r11, "topics"."featured_user3_id" AS t0_r12, "topics"."avg_time" AS t0_r13, "topics"."deleted_at" AS t0_r14, "topics"."highest_post_number" AS t0_r15, "topics"."image_url" AS t0_r16, "topics"."off_topic_count" AS t0_r17, "topics"."like_count" AS t0_r18, "topics"."incoming_link_count" AS t0_r19, "topics"."bookmark_count" AS t0_r20, "topics"."star_count" AS t0_r21, "topics"."category_id" AS t0_r22, "topics"."visible" AS t0_r23, "topics"."moderator_posts_count" AS t0_r24, "topics"."closed" AS t0_r25, "topics"."archived" AS t0_r26, "topics"."bumped_at" AS t0_r27, "topics"."has_summary" AS t0_r28, "topics"."vote_count" AS t0_r29, "topics"."archetype" AS t0_r30, "topics"."featured_user4_id" AS t0_r31, "topics"."notify_moderators_count" AS t0_r32, "topics"."spam_count" AS t0_r33, "topics"."illegal_count" AS t0_r34, "topics"."inappropriate_count" AS t0_r35, "topics"."pinned_at" AS t0_r36, "topics"."score" AS t0_r37, "topics"."percent_rank" AS t0_r38, "topics"."notify_user_count" AS t0_r39, "topics"."subtype" AS t0_r40, "topics"."slug" AS t0_r41, "topics"."auto_close_at" AS t0_r42, "topics"."auto_close_user_id" AS t0_r43, "topics"."auto_close_started_at" AS t0_r44, "topics"."deleted_by_id" AS t0_r45, "topics"."participant_count" AS t0_r46, "topics"."word_count" AS t0_r47, "topics"."excerpt" AS t0_r48, "topics"."pinned_globally" AS t0_r49, "categories"."id" AS t1_r0, "categories"."name" AS t1_r1, "categories"."color" AS t1_r2, "categories"."topic_id" AS t1_r3, "categories"."topic_count" AS t1_r4, "categories"."created_at" AS t1_r5, "categories"."updated_at" AS t1_r6, "categories"."user_id" AS t1_r7, "categories"."topics_year" AS t1_r8, "categories"."topics_month" AS t1_r9, "categories"."topics_week" AS t1_r10, "categories"."slug" AS t1_r11, "categories"."description" AS t1_r12, "categories"."text_color" AS t1_r13, "categories"."read_restricted" AS t1_r14, "categories"."auto_close_hours" AS t1_r15, "categories"."post_count" AS t1_r16, "categories"."latest_post_id" AS t1_r17, "categories"."latest_topic_id" AS t1_r18, "categories"."position" AS t1_r19, "categories"."parent_category_id" AS t1_r20, "categories"."posts_year" AS t1_r21, "categories"."posts_month" AS t1_r22, "categories"."posts_week" AS t1_r23, "categories"."email_in" AS t1_r24, "categories"."email_in_allow_strangers" AS t1_r25, "categories"."topics_day" AS t1_r26, "categories"."posts_day" AS t1_r27, "categories"."logo_url" AS t1_r28, "categories"."background_url" AS t1_r29, "categories"."allow_badges" AS t1_r30, "categories"."name_lower" AS t1_r31 FROM "topics" LEFT OUTER JOIN "categories" ON "categories"."id" = "topics"."category_id" LEFT OUTER JOIN topic_users AS tu ON (topics.id = tu.topic_id AND tu.user_id = 1) WHERE ("topics"."deleted_at" IS NULL) AND (topics.archetype <> 'private_message') AND (topics.created_at >= '2014-09-11 21:40:43.377495') AND (tu.last_read_post_number IS NULL) AND (COALESCE(tu.notification_level, 2) >= 2) AND (NOT EXISTS(
                         SELECT 1 FROM category_users cu
                         WHERE cu.user_id = 1 AND
                               cu.category_id = topics.category_id AND
                               cu.notification_level = 0 AND
                               cu.category_id <> -1
                         )) AND (topics.id NOT IN (65335)) AND "topics"."visible" = 't'  ORDER BY CASE WHEN topics.category_id = 42 THEN 0 ELSE 1 END, CASE
        WHEN topics.pinned_globally and (COALESCE(topics.pinned_at, '2010-01-01') > COALESCE(tu.cleared_pinned_at, '2010-01-01'))
          THEN 'infinity'
        ELSE topics.bumped_at
       END DESC LIMIT 5   

Explain:

Limit  (cost=11080.22..11080.22 rows=1 width=1293)   ->  Sort  (cost=11080.22..11080.22 rows=1 width=1293)
         Sort Key: (CASE WHEN (topics.category_id = 42) THEN 0 ELSE 1 END), (CASE WHEN (topics.pinned_globally AND (COALESCE(topics.pinned_at, '2010-01-01 00:00:00'::timestamp without time zone) > COALESCE(tu.cleared_pinned_at, '2010-01-01 00:00:00'::timestamp without time zone))) THEN 'infinity'::timestamp without time zone ELSE topics.bumped_at END)
         ->  Nested Loop Left Join  (cost=0.42..11080.21 rows=1 width=1293)
               Join Filter: (categories.id = topics.category_id)
               ->  Nested Loop Left Join  (cost=0.42..11060.31 rows=1 width=495)
                     Filter: ((tu.last_read_post_number IS NULL) AND (COALESCE(tu.notification_level, 2) >= 2))
                     ->  Nested Loop Anti Join  (cost=0.00..11026.50 rows=4 width=487)
                           Join Filter: (cu.category_id = topics.category_id)
                           ->  Seq Scan on topics  (cost=0.00..10985.43 rows=6 width=487)
                                 Filter: ((deleted_at IS NULL) AND visible AND ((archetype)::text <> 'private_message'::text) AND (created_at >= '2014-09-11 21:40:43.377495'::timestamp without time zone) AND (id <> 65335))
                           ->  Materialize  (cost=0.00..40.98 rows=1 width=4)
                                 ->  Seq Scan on category_users cu  (cost=0.00..40.98 rows=1 width=4)
                                       Filter: ((category_id <> (-1)) AND (user_id = 1) AND (notification_level = 0))
                     ->  Index Scan using index_topic_users_on_user_id_and_topic_id on topic_users tu  (cost=0.42..8.44 rows=1 width=20)
                           Index Cond: ((user_id = 1) AND (topics.id = topic_id))
               ->  Seq Scan on categories  (cost=0.00..18.84 rows=84 width=798)
(17 rows)

Number 5

Executing action: latest
T+422.3 ms
Reader
200.8 ms

app/controllers/application_controller.rb:275:in `banner_json'
app/controllers/application_controller.rb:248:in `preload_anonymous_data'
app/controllers/application_controller.rb:151:in `preload_json'
lib/middleware/anonymous_cache.rb:119:in `call'
lib/middleware/unicorn_oobgc.rb:95:in `process_client'
SELECT  "topics".* FROM "topics"  WHERE ("topics"."deleted_at" IS NULL) AND "topics"."archetype" = 'banner'  ORDER BY "topics"."id" ASC LIMIT 1   

Explain:

Limit  (cost=10659.03..10659.04 rows=1 width=487)   ->  Sort  (cost=10659.03..10659.04 rows=1 width=487)
         Sort Key: id
         ->  Seq Scan on topics  (cost=0.00..10659.02 rows=1 width=487)
               Filter: ((deleted_at IS NULL) AND ((archetype)::text = 'banner'::text))
(5 rows)

Number 6
Executing action: latest
T+779.2 ms
Reader
168.8 ms

lib/freedom_patches/active_record_base.rb:7:in `exec_sql'
lib/sql_builder.rb:67:in `exec'
lib/sql_builder.rb:85:in `map_exec'
app/models/topic_tracking_state.rb:171:in `report'
app/controllers/application_controller.rb:253:in `preload_current_user_data'
app/controllers/application_controller.rb:154:in `preload_json'
lib/middleware/anonymous_cache.rb:119:in `call'
lib/middleware/unicorn_oobgc.rb:95:in `process_client'
WITH x AS (    SELECT u.id AS user_id,
           topics.id AS topic_id,
           topics.created_at,
           highest_post_number,
           last_read_post_number,
           c.id AS category_id,
           tu.notification_level
    FROM users u
    INNER JOIN user_stats AS us ON us.user_id = u.id
    FULL OUTER JOIN topics ON 1=1
    LEFT JOIN topic_users tu ON tu.topic_id = topics.id AND tu.user_id = u.id
    LEFT JOIN categories c ON c.id = topics.category_id
    WHERE u.id IN (1) AND
          topics.archetype <> 'private_message' AND
          (("topics"."deleted_at" IS NULL AND tu.last_read_post_number < topics.highest_post_number AND COALESCE(tu.notification_level, 1) >= 2) OR ("topics"."deleted_at" IS NULL AND topics.created_at >= GREATEST(CASE
                  WHEN COALESCE(u.new_topic_duration_minutes, 2880) = -1 THEN u.created_at
                  WHEN COALESCE(u.new_topic_duration_minutes, 2880) = -2 THEN COALESCE(u.previous_visit_at,u.created_at)
                  ELSE ('2014-09-13 21:34:57.252154'::timestamp - INTERVAL '1 MINUTE' * COALESCE(u.new_topic_duration_minutes, 2880))
               END, us.new_since) AND tu.last_read_post_number IS NULL AND COALESCE(tu.notification_level, 2) >= 2)) AND
          (topics.visible OR u.admin OR u.moderator) AND
          topics.deleted_at IS NULL AND
          ( category_id IS NULL OR NOT c.read_restricted OR category_id IN (
              SELECT c2.id FROM categories c2
              JOIN category_groups cg ON cg.category_id = c2.id
              JOIN group_users gu ON gu.user_id = u.id AND cg.group_id = gu.group_id
              WHERE c2.read_restricted )
          )
          AND NOT EXISTS( SELECT 1 FROM category_users cu
                          WHERE cu.user_id = u.id AND
                               cu.category_id = topics.category_id AND
                               cu.notification_level = 0)

 ORDER BY topics.bumped_at DESC ) SELECT * FROM x LIMIT 500   

Explain:

Limit  (cost=424863.67..424873.67 rows=500 width=32)   CTE x
     ->  Sort  (cost=424857.62..424863.67 rows=2421 width=40)
           Sort Key: topics.bumped_at
           ->  Hash Left Join  (cost=11727.15..424721.54 rows=2421 width=40)
                 Hash Cond: (topics.category_id = c.id)
                 Filter: ((topics.category_id IS NULL) OR (NOT c.read_restricted) OR (SubPlan 1))
                 ->  Hash Anti Join  (cost=11707.26..11964.44 rows=3195 width=40)
                       Hash Cond: ((u.id = cu.user_id) AND (topics.category_id = cu.category_id))
                       ->  Hash Right Join  (cost=11670.70..11868.72 rows=4586 width=40)
                             Hash Cond: ((tu.user_id = u.id) AND (tu.topic_id = topics.id))
                             Filter: (((tu.last_read_post_number < topics.highest_post_number) AND (COALESCE(tu.notification_level, 1) >= 2)) OR ((topics.created_at >= GREATEST(CASE WHEN (COALESCE(u.new_topic_duration_minutes, 2880) = (-1)) THEN u.created_at WHEN (COALESCE(u.new_topic_duration_minutes, 2880) = (-2)) THEN COALESCE(u.previous_visit_at, u.created_at) ELSE ('2014-09-13 21:34:57.252154'::timestamp without time zone - ('00:01:00'::interval * (COALESCE(u.new_topic_duration_minutes, 2880))::double precision)) END, us.new_since)) AND (tu.last_read_post_number IS NULL) AND (COALESCE(tu.notification_level, 2) >= 2)))
                             ->  Bitmap Heap Scan on topic_users tu  (cost=4.83..200.23 rows=53 width=16)
                                   Recheck Cond: (user_id = 1)
                                   ->  Bitmap Index Scan on index_topic_users_on_user_id_and_topic_id  (cost=0.00..4.82 rows=53 width=0)
                                         Index Cond: (user_id = 1)
                             ->  Hash  (cost=11071.73..11071.73 rows=39609 width=60)
                                   ->  Nested Loop  (cost=0.57..11071.73 rows=39609 width=60)
                                         Join Filter: (topics.visible OR u.admin OR u.moderator)
                                         ->  Nested Loop  (cost=0.57..16.62 rows=1 width=34)
                                               ->  Index Scan using users_pkey on users u  (cost=0.29..8.30 rows=1 width=26)
                                                     Index Cond: (id = 1)
                                               ->  Index Scan using user_stats_pkey on user_stats us  (cost=0.29..8.30 rows=1 width=12)
                                                     Index Cond: (user_id = 1)
                                         ->  Seq Scan on topics  (cost=0.00..10659.02 rows=39609 width=29)
                                               Filter: ((deleted_at IS NULL) AND (deleted_at IS NULL) AND ((archetype)::text <> 'private_message'::text))
                       ->  Hash  (cost=36.55..36.55 rows=1 width=8)
                             ->  Seq Scan on category_users cu  (cost=0.00..36.55 rows=1 width=8)
                                   Filter: ((user_id = 1) AND (notification_level = 0))
                 ->  Hash  (cost=18.84..18.84 rows=84 width=5)
                       ->  Seq Scan on categories c  (cost=0.00..18.84 rows=84 width=5)
                 SubPlan 1
                   ->  Nested Loop  (cost=0.14..425.54 rows=1 width=4)
                         ->  Nested Loop  (cost=0.00..425.10 rows=1 width=4)
                               Join Filter: (cg.group_id = gu.group_id)
                               ->  Seq Scan on category_groups cg  (cost=0.00..9.97 rows=297 width=8)
                               ->  Materialize  (cost=0.00..406.22 rows=2 width=4)
                                      ->  Seq Scan on group_users gu  (cost=0.00..406.21 rows=2 width=4)

(Benjamin Kampmann) #45

We currently have a big release coming up that has priority. And as we running a discourse fork (because we needed patches in core to allow us to build certain features), which ran out of sync, my performances patches aren’t 100% compatible at the moment. I hope to have some time to back port them to DC main next month…


(Sol) #46

Great idea with the question project :slight_smile:


(Dean Taylor) #47

Excellent stuff @lightyear - thanks for providing your feedback on the performance issues here.

And highlighting your project, I’m looking forward to seeing more updates regarding your fork.

Looking forward to these slow queries being resolved.


(Kane York) #48

Number 4 is a TopicQuery, just like Number 1 and 2. Same query, so same optimizations to make.

Number 5 (banner) should be able to be converted into a simple lookup instead of a table scan, though – there’s only ever one banner topic!
Maybe a partial index…?


(Benjamin Kampmann) #49

solving it here with redis-caching of the serialized JSON that gets deleted when ever a banner-topic is saved. Making it essentially a one-time query lookup and then a 3ms redis. Works more than fine.


(Sam Saffron) #50

FYI, latest query in the OP is now resolved.

Suggested topics query is hard to solve but we will get there.


(Sam Saffron) #51

…and … suggested topics was resolved a month ago or so.

closing