我在 Discourse 中运行测试,并在查询日志中发现了一些低效的查询,具体如下:
- 不必要的 DISTINCT 导致查询变慢 site_settings_controller.rb#L165,示例如下:
SELECT
DISTINCT users.id
FROM
"users" CROSS
JOIN categories c
LEFT JOIN category_users cu ON users.id = cu.user_id
AND c.id = cu.category_id
WHERE
(
c.id = '3613'
AND cu.notification_level IS NULL
)
当 categories.id 和 notification_level 指定具体值时,由于 categories_users 表存在 UNIQUE (category_id, user_id) 约束,且 categories 表存在 PRIMARY KEY(id) 约束,CROSS JOIN 和 LEFT JOIN 都不会产生重复记录。因此,我们可以移除 DISTINCT 以加速查询,优化后的查询如下:
优化后的查询耗时为 4532166 纳秒(性能提升 30%)。
- 子查询中不必要的 DISTINCT 导致查询变慢 search.rb#L523:
SELECT
"posts".*
FROM
"posts"
JOIN (
SELECT
*,
row_number() over() row_number
FROM
(
SELECT
topics.id,
min(posts.post_number) post_number
FROM
"posts"
INNER JOIN "post_search_data" ON "post_search_data"."post_id" = "posts"."id"
INNER JOIN "topics" ON "topics"."id" = "posts"."topic_id"
AND ("topics"."deleted_at" IS NULL)
LEFT JOIN categories ON categories.id = topics.category_id
WHERE
("posts"."deleted_at" IS NULL)
AND "posts"."post_type" IN (1, 2, 3)
AND (topics.visible)
AND (
topics.archetype <> 'private_message'
)
AND (
topics.id IN (
SELECT
DISTINCT(tt.topic_id)
FROM
topic_tags tt
WHERE
tt.tag_id in (
SELECT
tag_id
FROM
tag_group_memberships
WHERE
tag_group_id = 504
)
)
)
AND (
categories.id NOT IN (
SELECT
categories.id
WHERE
categories.search_priority = 1
)
)
AND (
(categories.id IS NULL)
OR (NOT categories.read_restricted)
)
GROUP BY
topics.id
ORDER BY
MAX(posts.created_at) DESC
LIMIT
6 OFFSET 0
) xxx
) x ON x.id = posts.topic_id
AND x.post_number = posts.post_number
WHERE
("posts"."deleted_at" IS NULL)
ORDER BY
row_number
DISTINCT(tt.topic_id) 是冗余的,我们可以将其移除以加速查询,优化后的查询如下:
该优化可将查询性能从 12655768 纳秒提升至 5005154 纳秒(性能提升 60%)。
- 子查询中不必要的 DISTINCT 导致查询变慢 [search.rb#L642]:
SELECT
"posts".*
FROM
"posts"
JOIN (
SELECT
*,
row_number() over() row_number
FROM
(
SELECT
topics.id,
posts.post_number
FROM
"posts"
INNER JOIN "post_search_data" ON "post_search_data"."post_id" = "posts"."id"
INNER JOIN "topics" ON "topics"."id" = "posts"."topic_id"
AND ("topics"."deleted_at" IS NULL)
LEFT JOIN categories ON categories.id = topics.category_id
WHERE
("posts"."deleted_at" IS NULL)
AND "posts"."post_type" IN (1, 2, 3)
AND (topics.visible)
AND (
topics.archetype <> 'private_message'
)
AND (
topics.category_id IN (3715)
)
AND (
topics.id IN (
SELECT
DISTINCT(tt.topic_id)
FROM
topic_tags tt,
tags
WHERE
tt.tag_id = tags.id
AND lower(tags.name) IN ('lunch')
)
)
AND (
(categories.id IS NULL)
OR (NOT categories.read_restricted)
)
ORDER BY
posts.like_count DESC
LIMIT
6 OFFSET 0
) xxx
) x ON x.id = posts.topic_id
AND x.post_number = posts.post_number
WHERE
("posts"."deleted_at" IS NULL)
ORDER BY
row_number
与上一个案例类似(源代码不同),DISTINCT(tt.topic_id) 是冗余的,我们可以将其移除以加速查询:
该优化可将查询性能从 23659762 纳秒提升至 21030593 纳秒(性能提升 10%)。