Slow Page Loads on User Profiles

We’ve completed an import of users posts from our current software to Discourse, and one thing we’ve noticed is that some of the user pages are very slow to load (several seconds) and this looks to be due to slow queries. How can we troubleshoot this and identify the bottleneck?

Here’s some more information about our setup:

Server Specs: 8 cores, 32 GB RAM, NVMe SSD storage with ZFS

The board has about 25 million posts. Most of these are private conversations, with about 10.5 million being public.
We do have a number of megatopics (we’re a writing site so individual stories tend to evolve slowly over time, stacking up lots of posts) with thousands of posts, so this might be contributing, but I don’t know how to confirm this.

Here is one query that appears to be trouble - I have seen it at over 5000ms in the built-in load analyzer:

SELECT "topic_links"."id" AS t0_r0, "topic_links"."topic_id" AS t0_r1, "topic_links"."post_id" AS t0_r2, "topic_links"."user_id" AS t0_r3, "topic_links"."url" AS t0_r4, "topic_links"."domain" AS t0_r5, "topic_links"."internal" AS t0_r6, "topic_links"."link_topic_id" AS t0_r7, "topic_links"."created_at" AS t0_r8, "topic_links"."updated_at" AS t0_r9, "topic_links"."reflection" AS t0_r10, "topic_links"."clicks" AS t0_r11, "topic_links"."link_post_id" AS t0_r12, "topic_links"."title" AS t0_r13, "topic_links"."crawled_at" AS t0_r14, "topic_links"."quote" AS t0_r15, "topic_links"."extension" AS t0_r16, "topics"."id" AS t1_r0, "topics"."title" AS t1_r1, "topics"."last_posted_at" AS t1_r2, "topics"."created_at" AS t1_r3, "topics"."updated_at" AS t1_r4, "topics"."views" AS t1_r5, "topics"."posts_count" AS t1_r6, "topics"."user_id" AS t1_r7, "topics"."last_post_user_id" AS t1_r8, "topics"."reply_count" AS t1_r9, "topics"."featured_user1_id" AS t1_r10, "topics"."featured_user2_id" AS t1_r11, "topics"."featured_user3_id" AS t1_r12, "topics"."deleted_at" AS t1_r13, "topics"."highest_post_number" AS t1_r14, "topics"."like_count" AS t1_r15, "topics"."incoming_link_count" AS t1_r16, "topics"."category_id" AS t1_r17, "topics"."visible" AS t1_r18, "topics"."moderator_posts_count" AS t1_r19, "topics"."closed" AS t1_r20, "topics"."archived" AS t1_r21, "topics"."bumped_at" AS t1_r22, "topics"."has_summary" AS t1_r23, "topics"."archetype" AS t1_r24, "topics"."featured_user4_id" AS t1_r25, "topics"."notify_moderators_count" AS t1_r26, "topics"."spam_count" AS t1_r27, "topics"."pinned_at" AS t1_r28, "topics"."score" AS t1_r29, "topics"."percent_rank" AS t1_r30, "topics"."subtype" AS t1_r31, "topics"."slug" AS t1_r32, "topics"."deleted_by_id" AS t1_r33, "topics"."participant_count" AS t1_r34, "topics"."word_count" AS t1_r35, "topics"."excerpt" AS t1_r36, "topics"."pinned_globally" AS t1_r37, "topics"."pinned_until" AS t1_r38, "topics"."fancy_title" AS t1_r39, "topics"."highest_staff_post_number" AS t1_r40, "topics"."featured_link" AS t1_r41, "topics"."reviewable_score" AS t1_r42, "topics"."image_upload_id" AS t1_r43, "posts"."id" AS t2_r0, "posts"."user_id" AS t2_r1, "posts"."topic_id" AS t2_r2, "posts"."post_number" AS t2_r3, "posts"."raw" AS t2_r4, "posts"."cooked" AS t2_r5, "posts"."created_at" AS t2_r6, "posts"."updated_at" AS t2_r7, "posts"."reply_to_post_number" AS t2_r8, "posts"."reply_count" AS t2_r9, "posts"."quote_count" AS t2_r10, "posts"."deleted_at" AS t2_r11, "posts"."off_topic_count" AS t2_r12, "posts"."like_count" AS t2_r13, "posts"."incoming_link_count" AS t2_r14, "posts"."bookmark_count" AS t2_r15, "posts"."score" AS t2_r16, "posts"."reads" AS t2_r17, "posts"."post_type" AS t2_r18, "posts"."sort_order" AS t2_r19, "posts"."last_editor_id" AS t2_r20, "posts"."hidden" AS t2_r21, "posts"."hidden_reason_id" AS t2_r22, "posts"."notify_moderators_count" AS t2_r23, "posts"."spam_count" AS t2_r24, "posts"."illegal_count" AS t2_r25, "posts"."inappropriate_count" AS t2_r26, "posts"."last_version_at" AS t2_r27, "posts"."user_deleted" AS t2_r28, "posts"."reply_to_user_id" AS t2_r29, "posts"."percent_rank" AS t2_r30, "posts"."notify_user_count" AS t2_r31, "posts"."like_score" AS t2_r32, "posts"."deleted_by_id" AS t2_r33, "posts"."edit_reason" AS t2_r34, "posts"."word_count" AS t2_r35, "posts"."version" AS t2_r36, "posts"."cook_method" AS t2_r37, "posts"."wiki" AS t2_r38, "posts"."baked_at" AS t2_r39, "posts"."baked_version" AS t2_r40, "posts"."hidden_at" AS t2_r41, "posts"."self_edits" AS t2_r42, "posts"."reply_quoted" AS t2_r43, "posts"."via_email" AS t2_r44, "posts"."raw_email" AS t2_r45, "posts"."public_version" AS t2_r46, "posts"."action_code" AS t2_r47, "posts"."locked_by_id" AS t2_r48, "posts"."image_upload_id" AS t2_r49 FROM "topic_links" INNER JOIN "topics" ON "topics"."id" = "topic_links"."topic_id" AND ("topics"."deleted_at" IS NULL) INNER JOIN "posts" ON "posts"."id" = "topic_links"."post_id" AND ("posts"."deleted_at" IS NULL) WHERE (posts.post_type IN (1,2,3,4)) AND ("topics"."deleted_at" IS NULL) AND (topics.archetype <> 'private_message') AND "topics"."visible" = TRUE AND (topics.category_id IS NULL OR topics.category_id IN (SELECT id FROM categories WHERE NOT read_restricted OR id IN (3,4))) AND "topic_links"."user_id" = 7237 AND "topic_links"."internal" = FALSE AND "topic_links"."reflection" = FALSE AND "topic_links"."quote" = FALSE ORDER BY clicks DESC, topic_links.created_at DESC LIMIT 6

When watching the server while loading a user page, it seems to just be spiking CPU usage on the postgresql processes while the query executes. I have not made any configuration changes to the install other than to add MySQL support for the import process. Any advice on next steps would be appreciated!

7 Likes

Putting this here for reference. I’ve added 4 indexes to the database, and that has greatly improved performance, though I think it could still be better. This is what I’ve added:

CREATE INDEX index_topic_links_on_clicks_and_created ON public.topic_links USING btree (clicks, created_at);
CREATE INDEX index_posts_on_like_count_and_created ON public.posts USING btree (like_count, created_at);
CREATE INDEX index_topic_links_on_clicks_and_created_desc ON public.topic_links USING btree (clicks DESC, created_at DESC);
CREATE INDEX index_posts_on_like_count_and_created_desc ON public.posts USING btree (like_count DESC, created_at DESC, user_id) WHERE deleted_at IS NULL AND post_number > 1 AND (post_type = ANY ('{1,2,3,4}'::integer[]));
6 Likes

Hmm are we missing some indexes here @tgxworld?

2 Likes

Hi @Ghan,

thank you very much for these indexes. We have the same issue for a while now and couldn’t solve this.
After applying these indexes loading user profiles improved drastically. There were some “worst case” profiles which took up to 30s to load. Now they show up in ~5s.

So thanks again and I hope there is some more potential to speed this up.

Kind Regards
Sascha

P.S. Even the /about - page takes up to ~20s to load. May this could be speed up as well.

1 Like

Asking again. Are we missing these indexes @tgxworld @sam?

2 Likes

Let me check the plans for those queries in Meta.

Queries are quite fast on Meta, but I can see how they can perform badly on under powered databases. We are using ActiveRecord and the code is quite clean, but the generated SQL is quite messy.

Doing the SQL manually we can make the query use existing indexes and optimize it a lot:

Before

After

In the optimized query we only retrieve posts from the user, limiting the loop of posts for the join happening too late.

So I believe we can get better performance without adding index overhead.

7 Likes

Can the rewritten version be done in the ActiveRecord code or would you need to pass in a manual query string here?

I hope so because the current Ruby code is beautiful. But if not we have a very simple pattern for doing SQL in the app that is already widely used.

1 Like

It does seem like we’re missing indexes on alot of these queries running on the user summary page. Will need to optimize these queries one by one.

@Falco I see you’ve assigned yourself to this, do you still want to take it? The tricky part here is that these queries are all mostly untested so we have to be careful not to break stuff.

3 Likes

I added a draft commit for testing the serializer output here so we don’t fly blind:

https://github.com/discourse/discourse/pull/10175/commits/791a1cfa1822096e9d870f1373c91ad143c0c2d9

After that, I tried re-applying what I did here but turns out I was adding an extra clause that was making it fast but incorrect.

The fact that we pass all this (even the most liked/ liked by) through the guardian means the info is always very correct and tailored to the viewer, but makes caching it not helping against multiple users.

I guess we can at least add a longer cache to this endpoint? In my opinion a 1 hour cache would be adequate, as the information in the summary doesn’t change often.

4 Likes

Sounds good… make it so!

1 Like

I pushed the cache to my PR branch:

https://github.com/discourse/discourse/pull/10175

While debugging further, I noticed that we also have some N+1 in there. Thumbnails and Solved are triggering a N+1 for the topic lists. Any ideas for it @david @tgxworld ?

2 Likes

Merged the cache work.

Before After
image

Still working on N+1.

7 Likes

Which serializer does the user summary page use for topic lists? For the full page search, I recently fixed the N+1 by dropping the thumbnails attributes from the search serializer because it isn’t used at all.

3 Likes

Fixed in https://github.com/discourse/discourse/pull/11238/files and https://github.com/discourse/discourse/pull/11236

4 Likes