Recherche de requêtes lentes dans le journal de la base de données

J’exécute des tests dans Discourse et j’ai détecté des requêtes inefficaces dans le journal des requêtes, comme suit :

  1. Un DISTINCT inutile ralentit la requête site_settings_controller.rb#L165 :
SELECT 
  DISTINCT users.id 
FROM 
  "users" CROSS 
  JOIN categories c 
  LEFT JOIN category_users cu ON users.id = cu.user_id 
  AND c.id = cu.category_id 
WHERE 
  (
    c.id = '3613'
    AND cu.notification_level IS NULL
  )

Lorsque categories.id et notification_level sont spécifiés avec des valeurs, la contrainte UNIQUE (category_id, user_id) dans categories_users et la PRIMARY KEY (id) dans categories garantissent que ni le CROSS JOIN ni le LEFT JOIN ne créent de doublons. Cela signifie que nous pouvons supprimer DISTINCT pour accélérer la requête, comme illustré ci-dessous :
Cette requête optimisée s’exécute en 4 532 166 nanosecondes (amélioration de 30 %).

  1. Un DISTINCT inutile dans une sous-requête ralentit la requête search.rb#L523 :
SELECT 
  "posts".* 
FROM 
  "posts" 
  JOIN (
    SELECT 
      *, 
      row_number() over() row_number 
    FROM 
      (
        SELECT 
          topics.id, 
          min(posts.post_number) post_number 
        FROM 
          "posts" 
          INNER JOIN "post_search_data" ON "post_search_data"."post_id" = "posts"."id" 
          INNER JOIN "topics" ON "topics"."id" = "posts"."topic_id" 
          AND ("topics"."deleted_at" IS NULL) 
          LEFT JOIN categories ON categories.id = topics.category_id 
        WHERE 
          ("posts"."deleted_at" IS NULL) 
          AND "posts"."post_type" IN (1, 2, 3) 
          AND (topics.visible) 
          AND (
            topics.archetype <> 'private_message'
          ) 
          AND (
            topics.id IN (
              SELECT 
                DISTINCT(tt.topic_id) 
              FROM 
                topic_tags tt 
              WHERE 
                tt.tag_id in (
                  SELECT 
                    tag_id 
                  FROM 
                    tag_group_memberships 
                  WHERE 
                    tag_group_id = 504
                )
            )
          ) 
          AND (
            categories.id NOT IN (
              SELECT 
                categories.id 
              WHERE 
                categories.search_priority = 1
            )
          ) 
          AND (
            (categories.id IS NULL) 
            OR (NOT categories.read_restricted)
          ) 
        GROUP BY 
          topics.id 
        ORDER BY 
          MAX(posts.created_at) DESC 
        LIMIT 
          6 OFFSET 0
      ) xxx
  ) x ON x.id = posts.topic_id 
  AND x.post_number = posts.post_number 
WHERE 
  ("posts"."deleted_at" IS NULL) 
ORDER BY 
  row_number

DISTINCT(tt.topic_id) est redondant et peut être supprimé pour accélérer la requête, comme illustré ci-dessous :
Cela permet d’améliorer les performances de cette requête de 12 655 768 à 5 005 154 nanosecondes (amélioration de 60 %).

  1. Un DISTINCT inutile dans une sous-requête ralentit la requête [search.rb#L642] :
SELECT 
  "posts".* 
FROM 
  "posts" 
  JOIN (
    SELECT 
      *, 
      row_number() over() row_number 
    FROM 
      (
        SELECT 
          topics.id, 
          posts.post_number 
        FROM 
          "posts" 
          INNER JOIN "post_search_data" ON "post_search_data"."post_id" = "posts"."id" 
          INNER JOIN "topics" ON "topics"."id" = "posts"."topic_id" 
          AND ("topics"."deleted_at" IS NULL) 
          LEFT JOIN categories ON categories.id = topics.category_id 
        WHERE 
          ("posts"."deleted_at" IS NULL) 
          AND "posts"."post_type" IN (1, 2, 3) 
          AND (topics.visible) 
          AND (
            topics.archetype <> 'private_message'
          ) 
          AND (
            topics.category_id IN (3715)
          ) 
          AND (
            topics.id IN (
              SELECT 
                DISTINCT(tt.topic_id) 
              FROM 
                topic_tags tt, 
                tags 
              WHERE 
                tt.tag_id = tags.id 
                AND lower(tags.name) IN ('lunch')
            )
          ) 
          AND (
            (categories.id IS NULL) 
            OR (NOT categories.read_restricted)
          ) 
        ORDER BY 
          posts.like_count DESC 
        LIMIT 
          6 OFFSET 0
      ) xxx
  ) x ON x.id = posts.topic_id 
  AND x.post_number = posts.post_number 
WHERE 
  ("posts"."deleted_at" IS NULL) 
ORDER BY 
  row_number

Similaire au cas précédent (code source différent), DISTINCT(tt.topic_id) est redondant et peut être supprimé pour accélérer la requête :
Cela permet d’améliorer les performances de cette requête de 23 659 762 à 21 030 593 nanosecondes (amélioration de 10 %).