"Executing action: latest" Taking a Long Time

slow-sql

(Kirupa Chinnathambi) #1

Hi, everyone!
A discourse forum I help run takes a really long time to load initially. When I click on the timestamps, this is what I see:

The “Executing action: latest” action takes about 3 seconds. The overall forum load time on http://forum.kirupa.com is about 5-6 seconds from entering the URL. I actually migrated the server to a 4GB DigitalOcean droplet, but that doesn’t seem to have fixed anything.

I noticed that other Discourse instances such as this one are much MUCH faster, so is there something that I need to configure differently? I read through the Initial Load Time post as well, but that thread seems to suggest using a CDN is a good enhancement. Is the delay I am seeing related to the lack of me using a CDN?

Thanks,
Kirupa


(Régis Hanol) #2

Would you mind clicking the 24 sql link and identifying the slowest query(ies)?


(Kirupa Chinnathambi) #3

This one took 890ms:

        WITH x AS (
        SELECT u.id AS user_id,
               topics.id AS topic_id,
               topics.created_at,
               highest_post_number,
               last_read_post_number,
               c.id AS category_id,
               tu.notification_level
        FROM users u
        INNER JOIN user_stats AS us ON us.user_id = u.id
        FULL OUTER JOIN topics ON 1=1
        LEFT JOIN topic_users tu ON tu.topic_id = topics.id AND tu.user_id = u.id
        LEFT JOIN categories c ON c.id = topics.category_id
        WHERE u.id IN (1) AND
              topics.archetype <> 'private_message' AND
              (("topics"."deleted_at" IS NULL AND tu.last_read_post_number < topics.highest_post_number AND COALESCE(tu.notification_level, 1) >= 2) OR ("topics"."deleted_at" IS NULL AND topics.created_at >= GREATEST(CASE
                      WHEN COALESCE(u.new_topic_duration_minutes, 2880) = -1 THEN u.created_at
                      WHEN COALESCE(u.new_topic_duration_minutes, 2880) = -2 THEN COALESCE(u.previous_visit_at,u.created_at)
                      ELSE ('2015-01-21 18:36:50.083141'::timestamp - INTERVAL '1 MINUTE' * COALESCE(u.new_topic_duration_minutes, 2880))
                   END, us.new_since) AND tu.last_read_post_number IS NULL AND COALESCE(tu.notification_level, 2) >= 2)) AND
              (topics.visible OR u.admin OR u.moderator) AND
              topics.deleted_at IS NULL AND
              ( category_id IS NULL OR NOT c.read_restricted OR category_id IN (
                  SELECT c2.id FROM categories c2
                  JOIN category_groups cg ON cg.category_id = c2.id
                  JOIN group_users gu ON gu.user_id = u.id AND cg.group_id = gu.group_id
                  WHERE c2.read_restricted )
              )
              AND NOT EXISTS( SELECT 1 FROM category_users cu
                              WHERE cu.user_id = u.id AND
                                   cu.category_id = topics.category_id AND
                                   cu.notification_level = 0)
    
     ORDER BY topics.bumped_at DESC ) SELECT * FROM x LIMIT 500 

This one now takes 2500ms:

SELECT  "topics"."id" AS t0_r0, "topics"."title" AS t0_r1, "topics"."last_posted_at" AS t0_r2, "topics"."created_at" AS t0_r3, "topics"."updated_at" AS t0_r4, "topics"."views" AS t0_r5, "topics"."posts_count" AS t0_r6, "topics"."user_id" AS t0_r7, "topics"."last_post_user_id" AS t0_r8, "topics"."reply_count" AS t0_r9, "topics"."featured_user1_id" AS t0_r10, "topics"."featured_user2_id" AS t0_r11, "topics"."featured_user3_id" AS t0_r12, "topics"."avg_time" AS t0_r13, "topics"."deleted_at" AS t0_r14, "topics"."highest_post_number" AS t0_r15, "topics"."image_url" AS t0_r16, "topics"."off_topic_count" AS t0_r17, "topics"."like_count" AS t0_r18, "topics"."incoming_link_count" AS t0_r19, "topics"."bookmark_count" AS t0_r20, "topics"."category_id" AS t0_r21, "topics"."visible" AS t0_r22, "topics"."moderator_posts_count" AS t0_r23, "topics"."closed" AS t0_r24, "topics"."archived" AS t0_r25, "topics"."bumped_at" AS t0_r26, "topics"."has_summary" AS t0_r27, "topics"."vote_count" AS t0_r28, "topics"."archetype" AS t0_r29, "topics"."featured_user4_id" AS t0_r30, "topics"."notify_moderators_count" AS t0_r31, "topics"."spam_count" AS t0_r32, "topics"."illegal_count" AS t0_r33, "topics"."inappropriate_count" AS t0_r34, "topics"."pinned_at" AS t0_r35, "topics"."score" AS t0_r36, "topics"."percent_rank" AS t0_r37, "topics"."notify_user_count" AS t0_r38, "topics"."subtype" AS t0_r39, "topics"."slug" AS t0_r40, "topics"."auto_close_at" AS t0_r41, "topics"."auto_close_user_id" AS t0_r42, "topics"."auto_close_started_at" AS t0_r43, "topics"."deleted_by_id" AS t0_r44, "topics"."participant_count" AS t0_r45, "topics"."word_count" AS t0_r46, "topics"."excerpt" AS t0_r47, "topics"."pinned_globally" AS t0_r48, "topics"."auto_close_based_on_last_post" AS t0_r49, "topics"."auto_close_hours" AS t0_r50, "categories"."id" AS t1_r0, "categories"."name" AS t1_r1, "categories"."color" AS t1_r2, "categories"."topic_id" AS t1_r3, "categories"."topic_count" AS t1_r4, "categories"."created_at" AS t1_r5, "categories"."updated_at" AS t1_r6, "categories"."user_id" AS t1_r7, "categories"."topics_year" AS t1_r8, "categories"."topics_month" AS t1_r9, "categories"."topics_week" AS t1_r10, "categories"."slug" AS t1_r11, "categories"."description" AS t1_r12, "categories"."text_color" AS t1_r13, "categories"."read_restricted" AS t1_r14, "categories"."auto_close_hours" AS t1_r15, "categories"."post_count" AS t1_r16, "categories"."latest_post_id" AS t1_r17, "categories"."latest_topic_id" AS t1_r18, "categories"."position" AS t1_r19, "categories"."parent_category_id" AS t1_r20, "categories"."posts_year" AS t1_r21, "categories"."posts_month" AS t1_r22, "categories"."posts_week" AS t1_r23, "categories"."email_in" AS t1_r24, "categories"."email_in_allow_strangers" AS t1_r25, "categories"."topics_day" AS t1_r26, "categories"."posts_day" AS t1_r27, "categories"."logo_url" AS t1_r28, "categories"."background_url" AS t1_r29, "categories"."allow_badges" AS t1_r30, "categories"."name_lower" AS t1_r31, "categories"."auto_close_based_on_last_post" AS t1_r32 FROM "topics" LEFT OUTER JOIN "categories" ON "categories"."id" = "topics"."category_id" LEFT OUTER JOIN topic_users AS tu ON (topics.id = tu.topic_id AND tu.user_id = 1) WHERE (topics.archetype <> 'private_message') AND (COALESCE(categories.topic_id, 0) <> topics.id) AND (topics.deleted_at IS NULL) AND (NOT EXISTS(
                         SELECT 1 FROM category_users cu
                         WHERE cu.user_id = 1 AND
                               cu.category_id = topics.category_id AND
                               cu.notification_level = 0 AND
                               cu.category_id <> -1
                         ))  ORDER BY CASE
        WHEN topics.pinned_globally
         AND (COALESCE(topics.pinned_at, '1900-01-01') > COALESCE(tu.cleared_pinned_at, '1900-01-01'))
          THEN topics.pinned_at + interval '9999 years'
          ELSE topics.bumped_at
       END DESC LIMIT 30   

I think the other queries are OK in how long they took. These are the biggest culprits.

:smile:


(Kane York) #4

Some numbers on the forum to help with diagnosis:

323,400 Topics
218,200 Users
91 Likes
52 Moderators + 2 Admins


(Sam Saffron) #5

This is a really hard problem for us to solve without a dataset that exhibits the issue. We would need a db backup to look at it.


(Kirupa Chinnathambi) #6

Sam - happy to provide it for you! What is the best way for me to provide it. Should I make you an admin on the forum? Or should I transfer the DB backup to some location that you can access?

:slight_smile:


(Sam Saffron) #7

A db backup would be perfect, pm me a link


(Kirupa Chinnathambi) #8

Thanks - just sent you a PM with a link :slight_smile:


(Sam Saffron) #9

thanks will download and have a look tomorrow, see if there are any tricks we take to make it faster.


(Sam Saffron) #10

Update:

I downloaded the dataset and can confirm the issue with latest query. Thinking through a creative way of sorting it.

The biggest issue is that our fancy (and even non-fancy) pinning logic makes the query super slow and table scan.

For example, for anon we ask the db:

Give us a list of topics ordered by “is it pinned” first and “bumped at date second” the problem is that pinning is so rare that the statistics are not being helpful here and pg decides to table scan to answer this question.

A possible workaround would be store pinning state in another table or perhaps add another index to bypass the issue, thinking about it now.

cc @eviltrout


(Jeff Atwood) #11

I suggest storing pin state elsewhere, we do not want people pinning hundreds of topics pretty much ever. So a pin will always be extremely rare.


(Sam Saffron) #12

That still does not quite solve the issue

This is super fast:

select * from topics 
order by bumped_at desc 
limit 30

This is slow:

select * from topics 
order by case when id = 100 then 0 else 1 end, bumped_at desc 
limit 30

I need to figure out how to engineer the second query so it is fast. Alternatively we can change it so we query twice (once for pinned and once for regular) and merge in code but its very very messy.


(Mittineague) #13

Isn’t asking for * everything always a bit slower than explicitly asking for only what is wanted?


(Sam Saffron) #14

The * there is the least of the problems, its just coming up with a terrible plan.

Posted on pg mailing list as well to see if anyone has ideas:

http://www.postgresql.org/message-id/CAAtdryPRK9Fb-a4-JPNB=p77jLrW7Mvs4j8VQ8-Hf5o09E49pg@mail.gmail.com


(Jeff Atwood) #15

Does not seem all that messy to me. Pins are always at the top (of the category or topic list), so provided the user had not personally unpinned, then just stick it at the top – so merging is more like “always insert at top”.


(Kane York) #16

I agree, and having less joins on the main topic query seems like the opposite of a problem to me. (And way more relevant than selecting * on just ONE table…)


(Jeff Atwood) #17

And if there are no pins, it is an empty query result, so presumably fast.


(Sam Saffron) #18

We are talking active record here, relations are leaking in and out of objects and built upon, this is very complicated to add in and would be very messy to say the least.


(Robin Ward) #19

Another vote for just do a second query and merge. It seems like it should be easy to me, but I’ll take your word for it that it’s messy :smile:


(Dean Taylor) #20

Not sure if it helps but the WordPress code kinda works like this:
(slight language translation here from “WordPress posts” to “Discourse topics” and “sticky” to “pinned”).

  1. SQL: Submit the ordered query as usual for the topics (but only select ID’s)
  2. CODE: If we have sticky topics in the results (loaded from static settings object) then order those to the top.
  3. CODE: Any missing sticky topic ID’s are added in.
  4. SQL: All topics SELECTs are then usually done in WordPress by individual topic ID for caching benefits. But obviously this isn’t the most important aspect. You could just then use the ID’s to make a 2nd query for data in one go, selecting based on your built ordered ID list.

In WordPress if there are pinned items in the result set it’s possible to get more items than you requested. So say when requesting 10 items between 2010 and 2011 but the pinned items are a lot older than than this an exist in 2009. Obviously you could filter any which way.