Rilevate query lente nel log del database

Eseguo gli spec in Discourse e rilevo alcune query inefficienti nel log delle query, come segue:

  1. Un DISTINCT non necessario rende la query lenta site_settings_controller.rb#L165, come mostrato di seguito:
SELECT 
  DISTINCT users.id 
FROM 
  "users" CROSS 
  JOIN categories c 
  LEFT JOIN category_users cu ON users.id = cu.user_id 
  AND c.id = cu.category_id 
WHERE 
  (
    c.id = '3613'
    AND cu.notification_level IS NULL
  )

Quando categories.id e notification_level hanno valori specifici, grazie all’indice UNIQUE (category_id, user_id) in category_users e alla PRIMARY KEY (id) in categories, né il CROSS JOIN né il LEFT JOIN generano record duplicati. Questo significa che possiamo rimuovere DISTINCT per accelerare la query, come mostrato di seguito:

Questa query ottimizzata impiega 4.532.166 nanosecondi (miglioramento del 30%).

  1. Un DISTINCT non necessario in una subquery rende la query lenta search.rb#L523:
SELECT 
  "posts".* 
FROM 
  "posts" 
  JOIN (
    SELECT 
      *, 
      row_number() over() row_number 
    FROM 
      (
        SELECT 
          topics.id, 
          min(posts.post_number) post_number 
        FROM 
          "posts" 
          INNER JOIN "post_search_data" ON "post_search_data"."post_id" = "posts"."id" 
          INNER JOIN "topics" ON "topics"."id" = "posts"."topic_id" 
          AND ("topics"."deleted_at" IS NULL) 
          LEFT JOIN categories ON categories.id = topics.category_id 
        WHERE 
          ("posts"."deleted_at" IS NULL) 
          AND "posts"."post_type" IN (1, 2, 3) 
          AND (topics.visible) 
          AND (
            topics.archetype <> 'private_message'
          ) 
          AND (
            topics.id IN (
              SELECT 
                DISTINCT(tt.topic_id) 
              FROM 
                topic_tags tt 
              WHERE 
                tt.tag_id in (
                  SELECT 
                    tag_id 
                  FROM 
                    tag_group_memberships 
                  WHERE 
                    tag_group_id = 504
                )
            )
          ) 
          AND (
            categories.id NOT IN (
              SELECT 
                categories.id 
              WHERE 
                categories.search_priority = 1
            )
          ) 
          AND (
            (categories.id IS NULL) 
            OR (NOT categories.read_restricted)
          ) 
        GROUP BY 
          topics.id 
        ORDER BY 
          MAX(posts.created_at) DESC 
        LIMIT 
          6 OFFSET 0
      ) xxx
  ) x ON x.id = posts.topic_id 
  AND x.post_number = posts.post_number 
WHERE 
  ("posts"."deleted_at" IS NULL) 
ORDER BY 
  row_number

DISTINCT(tt.topic_id) è ridondante e possiamo rimuoverlo per accelerare la query, come mostrato di seguito:

Questo migliora le prestazioni della query da 12.655.768 a 5.005.154 nanosecondi (miglioramento del 60%).

  1. Un DISTINCT non necessario in una subquery rende la query lenta [search.rb#L642]:
SELECT 
  "posts".* 
FROM 
  "posts" 
  JOIN (
    SELECT 
      *, 
      row_number() over() row_number 
    FROM 
      (
        SELECT 
          topics.id, 
          posts.post_number 
        FROM 
          "posts" 
          INNER JOIN "post_search_data" ON "post_search_data"."post_id" = "posts"."id" 
          INNER JOIN "topics" ON "topics"."id" = "posts"."topic_id" 
          AND ("topics"."deleted_at" IS NULL) 
          LEFT JOIN categories ON categories.id = topics.category_id 
        WHERE 
          ("posts"."deleted_at" IS NULL) 
          AND "posts"."post_type" IN (1, 2, 3) 
          AND (topics.visible) 
          AND (
            topics.archetype <> 'private_message'
          ) 
          AND (
            topics.category_id IN (3715)
          ) 
          AND (
            topics.id IN (
              SELECT 
                DISTINCT(tt.topic_id) 
              FROM 
                topic_tags tt, 
                tags 
              WHERE 
                tt.tag_id = tags.id 
                AND lower(tags.name) IN ('lunch')
            )
          ) 
          AND (
            (categories.id IS NULL) 
            OR (NOT categories.read_restricted)
          ) 
        ORDER BY 
          posts.like_count DESC 
        LIMIT 
          6 OFFSET 0
      ) xxx
  ) x ON x.id = posts.topic_id 
  AND x.post_number = posts.post_number 
WHERE 
  ("posts"."deleted_at" IS NULL) 
ORDER BY 
  row_number

Simile al caso precedente (codice sorgente diverso), DISTINCT(tt.topic_id) è ridondante e possiamo rimuoverlo per accelerare la query:

Questo migliora le prestazioni della query da 23.659.762 a 21.030.593 nanosecondi (miglioramento del 10%).