Slow tags query when alphabetical sort is disabled

Hi,

I found a slow query during the test.

SELECT  COUNT(topics.id) AS count_topics_id, tags.id, tags.name AS tags_id_tags_name
FROM "tags"
LEFT JOIN topic_tags ON tags.id = topic_tags.tag_id
LEFT JOIN topics ON topics.id = topic_tags.topic_id AND topics.deleted_at IS NULL
WHERE (topics.category_id in (3,5,12,11,9,6,8,1,7,10,2,4))
GROUP BY tags.id, tags.name
ORDER BY count_topics_id DESC
LIMIT 30

This query will be effective by enabling Show a dropdown a filter a topic list by tag. and disabling Show tags in alphabetical order. Configuration as below:
image

And the table topic_tags has more than 3 million records. The query will run 4 seconds.

Regards,

1 Like

What does EXPLAIN show for the query, on your database?

Explain

1 Like

@neil is working on improving tagging here so he may have already addressed this in his dev env.

3 Likes

Improving this performance is a bit tricky. I think we’ll need to change the functionality of that filter. This case is easy and can be optimized now that we normalize the number of times a tag has been used:

51 PM

But when scoped to a category, we currently re-count how many times every tag has been used in topics in that category and sub-categories and then sort.

10 PM

Also included in the query is which categories you have permission to see. Tags allowed only in private categories shouldn’t be exposed in this filter.

One option is to only show tags used in the category, but sort by their global counts instead of category-specific counts.

2 Likes

I just question the entire premise of this, is 3 million tags even a real use case?

I guess we could create a “category_tag” table and keep it up to date then sort on that. But I really want to deal with some real world data here before going crazy.

4 Likes

3 million tags isn’t remotely a target in our software. We’re looking at more like 100,000 max certainly FAR under a million.

1 Like

I think 3 million in this case is the number of topic-tag relations, rather than the number of unique tags.

So that could be 100,000 tags, with 30 topics attached to each tag. That’s still very large, but not quite as unbelievable as 3 million unique tags.

5 Likes

Well I guess we are going to need a table to track counts of tag per category, then we can cheaply figure this out.

2 Likes

@david , you’re right. It’s 20 unique tags. And 1~5 tags per topic.
@sam, please refer to Tags - Stack Overflow. You will see the real world data could be way more than 3 million records in topic_tags table.

1 Like

I know that page, I remember hacking on that

7 Likes

Great to hear :slight_smile:

I committed a fix for this, so please give it a try.

5 Likes

Hi Neil. I try the latest bits. But I don’t find the tag dropdown after turning on show filter by tag.
But /tags page looks good.

1 Like

Do you see any errors in your logs that mention “category_tag_stats”? There’s a job that runs which will initialize the stats needed by the dropdown. It might take a while to run on your database, but should have completed.

1 Like

Yes, I can see it now. It will take a while since my forum has over 5M topics.

1 Like

Hi All,

Is there a way to sort this king of page by the title ?

Many thanks for your help