I just merged another try to improve performance of this job:
discourse:main
← lis2:improve_post_alerter_job
opened 03:25AM - 07 Jul 23 UTC
Simplified query based on SiteSettings to join only relevant user_options rows. …
In addition, index was added to 'watched_precedence_over_muted` column in `user_options` table to speed up query
Original
```
SELECT "users".* FROM "users" LEFT JOIN user_options ON user_options.user_id = users.id WHERE (id IN (SELECT user_id FROM category_users WHERE category_id = 231 AND notification_level = 0
UNION
SELECT user_id FROM tag_users tu JOIN topic_tags tt ON tt.tag_id = tu.tag_id AND tt.topic_id = 18799 AND tu.notification_level = 0
EXCEPT
SELECT user_id FROM topic_users tus WHERE tus.topic_id = 18799 AND tus.notification_level = 3
)) AND (user_options.watched_precedence_over_muted IS false OR (user_options.watched_precedence_over_muted IS NULL AND true))
Nested Loop Left Join (cost=42034.83..117264.12 rows=21226 width=356) (actual time=39.473..39.550 rows=1 loops=1)
Filter: ((user_options.watched_precedence_over_muted IS FALSE) OR (user_options.watched_precedence_over_muted IS NULL))
-> Nested Loop (cost=42034.40..95792.76 rows=42240 width=356) (actual time=39.455..39.531 rows=1 loops=1)
-> Subquery Scan on "ANY_subquery" (cost=42033.97..43307.96 rows=42240 width=4) (actual time=39.435..39.511 rows=1 loops=1)
-> HashSetOp Except (cost=42033.97..42885.56 rows=42240 width=8) (actual time=39.433..39.508 rows=1 loops=1)
-> Append (cost=42033.97..42779.90 rows=42264 width=8) (actual time=39.155..39.278 rows=2 loops=1)
-> HashAggregate (cost=42033.97..42456.37 rows=42240 width=8) (actual time=39.154..39.238 rows=1 loops=1)
Group Key: category_users.user_id
Batches: 1 Memory Usage: 1561kB
-> Append (cost=0.44..41928.37 rows=42240 width=4) (actual time=38.968..38.996 rows=1 loops=1)
-> Index Scan using idx_category_users_category_id_user_id on category_users (cost=0.44..41290.25 rows=42239 width=4) (actual time=38.967..38.969 rows=1 loops=1)
Index Cond: (category_id = 231)
Filter: (notification_level = 0)
Rows Removed by Filter: 2
-> Nested Loop (cost=0.15..4.52 rows=1 width=4) (actual time=0.025..0.025 rows=0 loops=1)
-> Seq Scan on topic_tags tt (cost=0.00..1.04 rows=2 width=4) (actual time=0.007..0.007 rows=2 loops=1)
Filter: (topic_id = 18799)
Rows Removed by Filter: 1
-> Index Only Scan using idx_tag_users_ix2 on tag_users tu (cost=0.15..1.73 rows=1 width=8) (actual time=0.007..0.007 rows=0 loops=2)
Index Cond: ((tag_id = tt.tag_id) AND (notification_level = 0))
Heap Fetches: 0
-> Subquery Scan on "*SELECT* 3" (cost=0.43..112.21 rows=24 width=8) (actual time=0.023..0.038 rows=1 loops=1)
-> Index Scan using index_topic_users_on_topic_id_and_user_id on topic_users tus (cost=0.43..111.97 rows=24 width=4) (actual time=0.022..0.036 rows=1 loops=1)
Index Cond: (topic_id = 18799)
Filter: (notification_level = 3)
Rows Removed by Filter: 13
-> Index Scan using users_pkey on users (cost=0.43..1.24 rows=1 width=356) (actual time=0.014..0.014 rows=1 loops=1)
Index Cond: (id = "ANY_subquery".user_id)
-> Index Scan using index_user_options_on_user_id_and_default_calendar on user_options (cost=0.43..0.50 rows=1 width=5) (actual time=0.013..0.013 rows=1 loops=1)
Index Cond: (user_id = users.id)
Planning Time: 0.869 ms
Execution Time: 46.053 ms
```
Tweaked
```
EXPLAIN ANALYZE SELECT "users".* FROM "users" WHERE (id IN (SELECT uo.user_id FROM user_options uo
LEFT JOIN topic_users tus ON tus.user_id = uo.user_id AND tus.topic_id = 18799
LEFT JOIN category_users cu ON cu.user_id = uo.user_id AND cu.category_id = 231
LEFT JOIN tag_users tu ON tu.user_id = uo.user_id
JOIN topic_tags tt ON tt.tag_id = tu.tag_id AND tt.topic_id = 18799
WHERE
(tus.id IS NULL OR tus.notification_level != 3)
AND (cu.notification_level = 0
OR
tu.notification_level = 0)
AND (uo.watched_precedence_over_muted IS NULL OR uo.watched_precedence_over_muted IS false)
));
Nested Loop (cost=56.73..59.06 rows=5 width=356) (actual time=0.160..0.161 rows=0 loops=1)
-> HashAggregate (cost=56.30..56.35 rows=5 width=8) (actual time=0.160..0.161 rows=0 loops=1)
Group Key: uo.user_id
Batches: 1 Memory Usage: 24kB
-> Nested Loop Left Join (cost=1.45..56.29 rows=5 width=8) (actual time=0.157..0.158 rows=0 loops=1)
Filter: ((tus.id IS NULL) OR (tus.notification_level <> 3))
-> Nested Loop Left Join (cost=1.01..50.45 rows=6 width=8) (actual time=0.157..0.158 rows=0 loops=1)
Filter: ((cu.notification_level = 0) OR (tu.notification_level = 0))
-> Nested Loop (cost=0.58..44.40 rows=7 width=12) (actual time=0.157..0.157 rows=0 loops=1)
-> Nested Loop (cost=0.15..11.73 rows=14 width=8) (actual time=0.157..0.157 rows=0 loops=1)
-> Seq Scan on topic_tags tt (cost=0.00..1.04 rows=2 width=4) (actual time=0.147..0.147 rows=2 loops=1)
Filter: (topic_id = 18799)
Rows Removed by Filter: 1
-> Index Only Scan using idx_tag_users_ix2 on tag_users tu (cost=0.15..5.28 rows=7 width=12) (actual time=0.003..0.003 rows=0 loops=2)
Index Cond: (tag_id = tt.tag_id)
Heap Fetches: 0
-> Index Scan using index_user_options_on_user_id_and_default_calendar on user_options uo (cost=0.43..2.33 rows=1 width=4) (never executed)
Index Cond: (user_id = tu.user_id)
Filter: ((watched_precedence_over_muted IS NULL) OR (watched_precedence_over_muted IS FALSE))
-> Index Scan using idx_category_users_user_id_category_id on category_users cu (cost=0.43..0.85 rows=1 width=8) (never executed)
Index Cond: ((user_id = uo.user_id) AND (category_id = 231))
-> Index Scan using index_topic_users_on_topic_id_and_user_id on topic_users tus (cost=0.43..0.96 rows=1 width=12) (never executed)
Index Cond: ((topic_id = 18799) AND (user_id = uo.user_id))
-> Index Scan using users_pkey on users (cost=0.43..0.54 rows=1 width=356) (never executed)
Index Cond: (id = uo.user_id)
Planning Time: 3.636 ms
Execution Time: 0.253 ms
```
I tested in on few instances and it was fine, but they were smaller than yours.
If it is still failing with this commit, could you provide me EXPLAIN ANALYSE
report? Script to generate it:
topic = Topic.last
user_option_sql_fragment =
if SiteSetting.watched_precedence_over_muted
<<~SQL
INTERSECT
SELECT user_id FROM user_options WHERE user_options.watched_precedence_over_muted IS false
SQL
else
<<~SQL
EXCEPT
SELECT user_id FROM user_options WHERE user_options.watched_precedence_over_muted IS true
SQL
end
user_ids_sql = <<~SQL
(
SELECT user_id FROM category_users WHERE category_id = #{topic.category_id.to_i} AND notification_level = #{CategoryUser.notification_levels[:muted]}
UNION
SELECT user_id FROM tag_users tu JOIN topic_tags tt ON tt.tag_id = tu.tag_id AND tt.topic_id = #{topic.id} WHERE tu.notification_level = #{TagUser.notification_levels[:muted]}
EXCEPT
SELECT user_id FROM topic_users tus WHERE tus.topic_id = #{topic.id} AND tus.notification_level = #{TopicUser.notification_levels[:watching]}
)
#{user_option_sql_fragment}
SQL
sql = User.where("id IN (#{user_ids_sql})").to_sql
sql_with_index = <<SQL
EXPLAIN ANALYZE #{sql};
SQL
result = ActiveRecord::Base.connection.execute("#{sql_with_index}")
puts sql_with_index
result.each do |r|
puts r.values
end
It would help me to find missing index or which part of this query is so slow.
3 Likes