From Database Log we find some queries with unnecessary ORDER BY in subquery produced by user.rb#L379.One of them is as follows:
SELECT
"group_users"."group_id"
FROM
"group_users"
WHERE
"group_users"."group_id" IN (
SELECT
"groups"."id"
FROM
"groups"
WHERE
(groups.id > 0)
ORDER BY
name ASC
)
AND "group_users"."user_id" = 762
ORDER BY in subquery is meaningless and we can remove it to accelerate query. According to our test it can improve performance of query by 18%-90%(depend on the time wateing on sort in subquery)
We run specs in discourse, and from Database we find a slow query((6480259 nanosecond) compared with similar query) with unnecessary DISTINCT produced by site_settings_controller.rb#L141as follows:
SELECT
DISTINCT users.id
FROM
"users" CROSS
JOIN tags t
LEFT JOIN tag_users tu ON users.id = tu.user_id
AND t.id = tu.tag_id
WHERE
(
t.id IN (1825)
AND tu.notification_level IS NULL
)
When tags.id and notification_level are specify value, because of UNIQUE (tag_id, user_id, notification_level) in tag_users and PRIMARY KEY(id) in tags, both cross join and left join will not create duplicate
records which means that we can remove DISTINCT to accelerate query.
This opt query take 4538891 nanosecond(improve 30%)
From Database Log we find a slow query(6064379 NanoSecond) with many subquery and union produced by group.rb#L112 as follows:
SELECT
"groups"."id",
"groups"."name"
FROM
"groups"
INNER JOIN "group_users" ON "groups"."id" = "group_users"."group_id"
WHERE
"group_users"."user_id" = 296
AND (groups.id > 0)
AND (
groups.id IN (
SELECT
id
FROM
groups
WHERE
visibility_level = 0
UNION ALL
SELECT
id
FROM
groups
WHERE
visibility_level = 1
AND 296 IS NOT NULL
UNION ALL
SELECT
g.id
FROM
groups g
JOIN group_users gu ON gu.group_id = g.id
AND gu.user_id = 296
WHERE
g.visibility_level = 2
UNION ALL
SELECT
g.id
FROM
groups g
LEFT JOIN group_users gu ON gu.group_id = g.id
AND gu.user_id = 296
AND gu.owner
WHERE
g.visibility_level = 3
AND (
gu.id IS NOT NULL
OR FALSE
)
UNION ALL
SELECT
g.id
FROM
groups g
JOIN group_users gu ON gu.group_id = g.id
AND gu.user_id = 296
AND gu.owner
WHERE
g.visibility_level = 4
)
)
ORDER BY
name ASC
Obviously it is equal to query as below:
SELECT
"groups"."id",
"groups"."name"
FROM
"groups"
INNER JOIN "group_users" ON "groups"."id" = "group_users"."group_id"
WHERE
"group_users"."user_id" = 296 AND
groups.visibility_level IN (0, 1, 2, 3, 4)
ORDER BY
name ASC
This opt query takes 378062 Nanosecond(improve 93%).
SELECT
"groups"."id",
"groups"."name"
FROM
"groups"
INNER JOIN "group_users" gu ON "groups"."id" = gu."group_id"
WHERE
"group_users"."user_id" = 296
AND (groups.id > 0)
AND (
groups.visibility_level IN (0, 1, 2)
OR (groups.visibility_level IN (3,4) and gu.owner)
)
ORDER BY
name ASC
subquery with predicate visibility_level = 0,1,2 can be replaced by groups.visibility_level IN (0, 1, 2) , and subquery with predicate visibility_level = 3,4 can be replaced by or (groups.visibility_level IN (3,4) and gu.owner)
From Database we find a slow query(1141257 Nanosecond) with unnecessary JOIN and DISTINCT produced by directory_items_controller.rb as follows
SELECT
COUNT(DISTINCT "directory_items"."id")
FROM
"directory_items"
LEFT OUTER JOIN "users" ON "users"."id" = "directory_items"."user_id"
LEFT OUTER JOIN "group_users" ON "group_users"."user_id" = "users"."id"
LEFT OUTER JOIN "groups" ON "groups"."id" = "group_users"."group_id"
LEFT OUTER JOIN "user_stats" ON "user_stats"."user_id" = "directory_items"."user_id"
WHERE
"directory_items"."period_type" = 1
AND "groups"."id" = 2898
It retrieved nothing from user_stats and groups.id = 2898 can be replcaed by group_users.group_id = 2898 which means we can remove table groups and user_stats from query. At the same time, because of UNIQUE (user_id, group_id) in group_users and PRIMARY KEY (id) in users, JOIN won’t create duplicate records, so we can also remove DISTINCT from query as shown below:
SELECT
COUNT("directory_items"."id")
FROM
"directory_items"
LEFT OUTER JOIN "users" ON "users"."id" = "directory_items"."user_id"
LEFT OUTER JOIN "group_users" ON "group_users"."user_id" = "users"."id"
WHERE
"directory_items"."period_type" = 1
AND "group_users"."group_id" = 2898
This opt query take 941700 Nanosecond(improve 17.49%)
We run the specs in discourse, and from Database we find a slow query((331729 nanosecond)compared with similar query) with unnecessary JOIN produced by ser_badge.rb#L18as follows:
SELECT
COUNT(*)
FROM
"badges"
INNER JOIN "user_badges" ON "badges"."id" = "user_badges"."badge_id"
WHERE
"user_badges"."user_id" = 2112
AND (
user_badges.badge_id IN (
SELECT
id
FROM
badges
WHERE
enabled
)
)
AND "badges"."id" = 1
This subquery can be replaced by a simple prediacte badges.enabled as shown below:
SELECT
COUNT(*)
FROM
"badges"
INNER JOIN "user_badges" ON "badges"."id" = "user_badges"."badge_id"
WHERE
"user_badges"."user_id" = 2112
AND badges.enabled
AND "badges"."id" = 1
This opt query take 267383 nanosecond(improve 19%)