Improve performance of Topic load - ORDER BY RANDOM

Continuing the discussion from Slow sql queries:

## The Problem
Clicking between topics is slow for a few reasons, one of which is the generation of suggested topics.

I currently consider this a bug, because the suggested topics query adds roughly 640ms of additional delay to the initial display of a topic, even though it is not critical to that topic's display.

## What’s the impact?
With 40K topics and 430K posts on a 2GB Digital Ocean instance, I’m seeing almost 500ms added to each topic load just from the “random” query for suggested topics.

## Problem Detail
Currently the function TopicQuery::list_suggested_for generates a list of random topics to augment the unread and new topics listed at the bottom of each topic page.

The random element of this uses ORDER BY RANDOM() in the SQL query. That is an easy but slow way of getting a random list of rows, because the database has to generate a random value for every candidate row and sort them all just to return a handful.

Here is the function:
https://github.com/discourse/discourse/blob/738f2cb72c295c63c3581ae69d990344d9b947a8/lib/topic_query.rb#L55-L63
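For context, the random part of that query boils down to something like the following (a rough ActiveRecord paraphrase of the linked code, not the exact source; category_id stands in for the current topic's category):

    # Rough paraphrase only - see the linked function for the real code.
    # category_id is assumed to be the current topic's category.
    Topic.where(visible: true, deleted_at: nil)
         .where("archetype <> 'private_message'")
         .order("CASE WHEN topics.category_id = #{category_id} THEN 0 ELSE 1 END, RANDOM()")
         .limit(5)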

## The Suggestion Queries
As an example, here are the numbers for a slightly faster-than-typical load, perhaps due to caching.

### New Topic Suggestions
The new topic suggestions query is sorted as follows:

ORDER BY CASE WHEN topics.category_id = 79 THEN 0 ELSE 1 END, topics.bumped_at DESC LIMIT 5 

This “new” query takes 120.1 ms.

:straight_ruler: Screen capture of "new" query times

### Random Suggestions

ORDER BY CASE WHEN topics.category_id = 79 THEN 0 ELSE 1 END, RANDOM() LIMIT 5 

This “random” query takes 479.3 ms, and this was a relatively fast example.

:straight_ruler: Screen capture of "random" query times

## Solution / Alternative to ORDER BY RANDOM
There are several alternatives to ORDER BY RANDOM, one of which is this:

Linked from here:

1 Like

I believe @sam talked about adding some intermediate cache tables here. It is more difficult for us because we are doing random within a category, not random across all topics.

The proposed solution is something like this:

  • have a Redis key named after the category, for example suggested:support/wordpress (category.url)
  • if we have N suggested topic slots to fill, try to RPOP N times
  • if a RPOP returns nil (the list ran out), refill it:
    • Query for a thousand topics

       SELECT id FROM topics WHERE category_id = ? ORDER BY RANDOM() LIMIT 1000
      
    • LPUSH them all into the list 100 at a time (LPUSH suggested:support/wordpress 123 54 87 35 78 15 64 79 ...)

    • Expire the key in 2 days or so

  • Continue RPOPing until either we have N unique topic ids or it returns nil again.
  • If RPOP returned nil again, and the category was specified, repeat the fill attempt, but use suggested:@all instead of a category-specific one
  • Pass our topic ID list to Topic.find and then on to the serializer

This is also known as a “tetris random”.
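Here is a minimal Ruby sketch of the list above, assuming a plain redis-rb client and a caller-supplied refill query; the method and key handling are illustrative, not Discourse's actual implementation:

    require "redis"

    # Pop up to n random topic ids for a category, refilling the Redis list
    # from the database once if it runs dry, roughly as described above.
    def random_topic_ids(redis, category_key, n, refill)
      key = "suggested:#{category_key}"
      ids = []
      refilled = false

      while ids.length < n
        id = redis.rpop(key)
        if id
          ids << id.to_i
          ids.uniq!
        elsif refilled
          break  # list ran out twice; caller can fall back to the suggested:@all key
        else
          # Refill with a fresh random batch, 100 ids per LPUSH, expiring in ~2 days.
          fresh = refill.call  # e.g. SELECT id FROM topics WHERE category_id = ? ORDER BY RANDOM() LIMIT 1000
          fresh.each_slice(100) { |batch| redis.lpush(key, batch) }
          redis.expire(key, 2 * 86_400)
          refilled = true
        end
      end

      ids
    end

The fallback described in the last bullets would simply be a second call to the same helper using the suggested:@all key.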

6 Likes

That is actually quite efficient and simple; I like it.

This is now implemented per @riking’s suggestion:

https://github.com/discourse/discourse/commit/b760d22460ef5eef4c81b4ba4fbf1a8cdd384558

We backfill up to 3,000 random items into the list and trigger a background refresh if the list drops below 500 items.

This technique works very efficiently for categories with a large number of topics. If a category is short we round-trip some extra data to Redis, but it’s still more efficient because we stabilize on issuing the backfill in the background.
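A rough sketch of that threshold logic, with illustrative constants and a hypothetical enqueue_backfill job (not the actual names from the linked commit):

    BACKFILL_SIZE  = 3000   # ids pushed into the Redis list by a backfill
    LOW_WATER_MARK = 500    # refresh in the background before the list runs dry

    def pop_suggested_id(redis, key)
      id = redis.rpop(key)
      # Top the list up asynchronously so the request itself never waits on
      # the expensive ORDER BY RANDOM() backfill query.
      # enqueue_backfill is a stand-in for a background job (e.g. Sidekiq),
      # not the actual Discourse method.
      enqueue_backfill(key, BACKFILL_SIZE) if redis.llen(key) < LOW_WATER_MARK
      id && id.to_i
    end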

8 Likes

Thanks so much @sam - this update is a great improvement!

Here is a little more detail about what I get now.

### Initial Random Backfill

Executed in ~64.5ms

### Random Selection

Executed in ~6.2ms

### Total Topic Load Time
The total topic load time is now far lower, at ~390ms to ~450ms.


### In-category ORDER BY topics.bumped_at
This is now the slowest query in a topic load, taking somewhere between ~112ms and ~140ms.

Executed in ~112ms

### Question
Is there any chance of the topics.bumped_at query being cached?

1 Like

Remind me again what the topic / post / user count is for your Discourse dataset? Visiting /about should be fine.

Hmm, surprising; that does not seem that much higher than other sites we have, though all Discourse sites (that are not imports) cannot be too large, as they have not had sufficient years to grow.

I know @sam was testing with a dataset that had 1 million topics (but not so many posts, proportionally) just to experience maximum pain. Surprised it is an issue with 40k…

Does the count on the about page include PMs?

There are also 101K PMs - but those are posts, right?

That’s a lot of PMs…

Why are there 101,000 PMs? Yes, that means your topic count is effectively 140k. Still nowhere near a million.

Users like their PMs - over 11 years of users being active.

Stuff was way worse with a million: on my desktop it was 800ms for that query, which is about 2 seconds on Digital Ocean.

Out of interest in improving the topics.bumped_at list_suggested_for query, I tested adding an index.

### Without Index

Executed in ~112ms

Executed in ~141.3ms

### With Index

Executed in ~13.2ms

Executed in ~18.7ms

### Index tested

CREATE INDEX idx_topic_load_suggested_for
    ON topics
    USING btree
    (deleted_at, visible, archetype COLLATE pg_catalog."default", created_at);

### Tested table / index sizes
topics row count: 66,373
topics table size: 29MB
idx_topic_load_suggested_for index size: 3,488KB

### Thoughts and questions
To me it seems far better to scan only 3.5MB of data instead of 29MB per topic page load, considering how frequently this query runs.

This was just a quick pass at what I thought was a matching index for the query - I’m sure someone with more knowledge of the details could do better.

Is this an index the Discourse Team can get behind and perhaps include?
Or is there a more cached solution that’s needed?
Perhaps just a good stop-gap?

1 Like

Totally open to improving indexing, but in this case I think a filtered index (WHERE deleted_at IS NULL AND visible) should work better.

I assume you mean this?

CREATE INDEX idx_topic_load_not_deleted_and_visible
    ON topics
    USING btree
    (deleted_at, visible)
    WHERE (deleted_at IS NULL AND visible)

I tested this and the index doesn’t get used; it still does a Seq Scan of the table.

Edit: I tried the original index with the addition of the WHERE (deleted_at IS NULL AND visible) constraint; it had no performance or index-size impact as far as I could tell, but it did prevent the table scan.

I think @sam meant this:

CREATE INDEX idx_topic_load_suggested_for
    ON topics
    USING btree
    (archetype COLLATE pg_catalog."default", created_at)
    WHERE (deleted_at IS NULL AND visible);

By default, indexes in Postgres use btree, so we don’t need that; the collation should also be fine, so we don’t need that either. What I want you to test is:

create index idx_topics_suggested_for
on topics (visible, created_at) 
where archetype <> 'private_message' and deleted_at is null

This is a bit odd because we are ordering on bumped_at, so I would assume

create index idx_topics_suggested_for
on topics (bumped_at) 
where archetype <> 'private_message' and deleted_at is null and visible

should work best; however, admins see non-visible topics, so perhaps adding it in makes sense.

What I am really struggling with here is why you would add deleted_at to the actual index columns when the only valid value for it on selection is NULL; we are never ordering on deleted_at.

### Best Results
Yes, this gives the best results:

create index idx_topics_suggested_for
on topics (visible, created_at) 
where archetype <> 'private_message' and deleted_at is null

Index size: 1,248KB
Method: uses an Index Scan
Query time: reduced to ~5.8-6.7ms from ~112-141.3ms

### Other Notes
#### Note 1
The following index is not used for this query, because the join reduces the result set before the bumped_at ordering becomes useful:

create index idx_topics_suggested_for
on topics (bumped_at) 
where archetype <> 'private_message' and deleted_at is null and visible

Index size: 896KB

#### Note 2
The following index causes a “recheck condition” on the topics table via a Bitmap Heap Scan after initially reducing the result set, which makes the total query time slower.

create index idx_topics_suggested_for
on topics (created_at) 
where archetype <> 'private_message' and deleted_at is null and visible

Index size: 896KB

#### Note 3

Yeah, that would be because I’m not used to working with indexes that have WHERE conditions; you don’t need to be ordering on a column for it to help with index lookup performance in those cases.

3 Likes