Eseguo gli spec in Discourse e rilevo alcune query inefficienti nel log delle query, come segue:
- Un DISTINCT non necessario rende la query lenta site_settings_controller.rb#L165, come mostrato di seguito:
SELECT
DISTINCT users.id
FROM
"users" CROSS
JOIN categories c
LEFT JOIN category_users cu ON users.id = cu.user_id
AND c.id = cu.category_id
WHERE
(
c.id = '3613'
AND cu.notification_level IS NULL
)
Quando categories.id e notification_level hanno valori specifici, grazie all’indice UNIQUE (category_id, user_id) in category_users e alla PRIMARY KEY (id) in categories, né il CROSS JOIN né il LEFT JOIN generano record duplicati. Questo significa che possiamo rimuovere DISTINCT per accelerare la query, come mostrato di seguito:
Questa query ottimizzata impiega 4.532.166 nanosecondi (miglioramento del 30%).
- Un DISTINCT non necessario in una subquery rende la query lenta search.rb#L523:
SELECT
"posts".*
FROM
"posts"
JOIN (
SELECT
*,
row_number() over() row_number
FROM
(
SELECT
topics.id,
min(posts.post_number) post_number
FROM
"posts"
INNER JOIN "post_search_data" ON "post_search_data"."post_id" = "posts"."id"
INNER JOIN "topics" ON "topics"."id" = "posts"."topic_id"
AND ("topics"."deleted_at" IS NULL)
LEFT JOIN categories ON categories.id = topics.category_id
WHERE
("posts"."deleted_at" IS NULL)
AND "posts"."post_type" IN (1, 2, 3)
AND (topics.visible)
AND (
topics.archetype <> 'private_message'
)
AND (
topics.id IN (
SELECT
DISTINCT(tt.topic_id)
FROM
topic_tags tt
WHERE
tt.tag_id in (
SELECT
tag_id
FROM
tag_group_memberships
WHERE
tag_group_id = 504
)
)
)
AND (
categories.id NOT IN (
SELECT
categories.id
WHERE
categories.search_priority = 1
)
)
AND (
(categories.id IS NULL)
OR (NOT categories.read_restricted)
)
GROUP BY
topics.id
ORDER BY
MAX(posts.created_at) DESC
LIMIT
6 OFFSET 0
) xxx
) x ON x.id = posts.topic_id
AND x.post_number = posts.post_number
WHERE
("posts"."deleted_at" IS NULL)
ORDER BY
row_number
DISTINCT(tt.topic_id) è ridondante e possiamo rimuoverlo per accelerare la query, come mostrato di seguito:
Questo migliora le prestazioni della query da 12.655.768 a 5.005.154 nanosecondi (miglioramento del 60%).
- Un DISTINCT non necessario in una subquery rende la query lenta [search.rb#L642]:
SELECT
"posts".*
FROM
"posts"
JOIN (
SELECT
*,
row_number() over() row_number
FROM
(
SELECT
topics.id,
posts.post_number
FROM
"posts"
INNER JOIN "post_search_data" ON "post_search_data"."post_id" = "posts"."id"
INNER JOIN "topics" ON "topics"."id" = "posts"."topic_id"
AND ("topics"."deleted_at" IS NULL)
LEFT JOIN categories ON categories.id = topics.category_id
WHERE
("posts"."deleted_at" IS NULL)
AND "posts"."post_type" IN (1, 2, 3)
AND (topics.visible)
AND (
topics.archetype <> 'private_message'
)
AND (
topics.category_id IN (3715)
)
AND (
topics.id IN (
SELECT
DISTINCT(tt.topic_id)
FROM
topic_tags tt,
tags
WHERE
tt.tag_id = tags.id
AND lower(tags.name) IN ('lunch')
)
)
AND (
(categories.id IS NULL)
OR (NOT categories.read_restricted)
)
ORDER BY
posts.like_count DESC
LIMIT
6 OFFSET 0
) xxx
) x ON x.id = posts.topic_id
AND x.post_number = posts.post_number
WHERE
("posts"."deleted_at" IS NULL)
ORDER BY
row_number
Simile al caso precedente (codice sorgente diverso), DISTINCT(tt.topic_id) è ridondante e possiamo rimuoverlo per accelerare la query:
Questo migliora le prestazioni della query da 23.659.762 a 21.030.593 nanosecondi (miglioramento del 10%).