Chargements de profil lents avec une base de données de plus de 100 Go

Il existe quelques autres sujets sur le réglage des grandes instances. Les principes d’exécution d’une grande base de données PostgreSQL sont les mêmes que pour MySQL, mais le diable se cache dans les détails.

Déjà en train de chercher sur le sujet sur Internet et ici sur Meta. Auriez-vous des sujets à nous recommander qui pourraient être utiles ?

Votre fichier app.yml actuel contient ces lignes commentées :

Décommentez ces lignes et augmentez les valeurs en conséquence. Ensuite, vous devez reconstruire.

3 « J'aime »

J’ai fait cela et reconstruit le conteneur. J’essaie de relancer la réindexation, mais il semble qu’elle soit à nouveau bloquée. Voici ce que j’observe pour ces deux éléments :

Y a-t-il un moyen de savoir si work_mem doit être augmenté ? Je vois que le disque est sollicité assez intensément, donc je pense qu’il pourrait s’agir d’un tri au niveau du système de fichiers ou similaire, mais je ne suis pas sûr de savoir confirmer si c’est le cas ou si autre chose se produit.

1 « J'aime »

Utilisez les requêtes que vous avez listées dans le message d’origine. Exécutez-les dans un shell psql en les préfixant par EXPLAIN ANALYZE et collez la sortie ici.

2 « J'aime »

Looks like there’s some caching at work so subsequent loads tend to be faster, but > 5 seconds is still brutal when the cache expires.

                                                                                             QUERY PLAN                          
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=26396.41..26397.10 rows=6 width=1278) (actual time=90.879..114.499 rows=6 loops=1)
   ->  Gather Merge  (cost=26396.41..26448.85 rows=456 width=1278) (actual time=90.877..114.496 rows=6 loops=1)
         Workers Planned: 1
         Workers Launched: 1
         ->  Sort  (cost=25396.40..25397.54 rows=456 width=1278) (actual time=87.195..87.197 rows=5 loops=2)
               Sort Key: posts.like_count DESC, posts.created_at DESC
               Sort Method: top-N heapsort  Memory: 38kB
               Worker 0:  Sort Method: top-N heapsort  Memory: 39kB
               ->  Nested Loop  (cost=103.30..25388.23 rows=456 width=1278) (actual time=3.200..83.497 rows=3610 loops=2)
                     ->  Parallel Bitmap Heap Scan on posts  (cost=99.06..9509.95 rows=1356 width=783) (actual time=3.075..56.880 rows=5932 loops=2)
                           Recheck Cond: (user_id = 27510)
                           Filter: ((deleted_at IS NULL) AND (post_number > 1) AND (post_type = ANY ('{1,2,3,4}'::integer[])))
                           Rows Removed by Filter: 1071
                           Heap Blocks: exact=6272
                           ->  Bitmap Index Scan on index_posts_on_user_id_and_created_at  (cost=0.00..98.48 rows=2389 width=0) (actual time=3.916..3.917 rows=20157 loops=1)
                                 Index Cond: (user_id = 27510)
                     ->  Index Scan using topics_pkey on topics  (cost=4.24..11.71 rows=1 width=495) (actual time=0.004..0.004 rows=1 loops=11864)
                           Index Cond: (id = posts.topic_id)
                           Filter: ((deleted_at IS NULL) AND (deleted_at IS NULL) AND visible AND ((archetype)::text <> 'private_message'::text) AND ((category_id IS NULL) OR (hashed SubPlan 1)))
                           Rows Removed by Filter: 0
                           SubPlan 1
                             ->  Seq Scan on categories  (cost=0.00..3.74 rows=28 width=4) (actual time=0.013..0.029 rows=35 loops=2)
                                   Filter: ((NOT read_restricted) OR (id = ANY ('{3,4,5,17,19,25,26,27,28}'::integer[])))
 Planning Time: 2.535 ms
 Execution Time: 114.657 ms
(25 rows)
                                                                                          QUERY PLAN                             
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=25004.87..25004.87 rows=1 width=1464) (actual time=98.136..121.987 rows=6 loops=1)
   ->  Sort  (cost=25004.87..25004.87 rows=1 width=1464) (actual time=98.134..121.984 rows=6 loops=1)
         Sort Key: topic_links.clicks DESC, topic_links.created_at DESC
         Sort Method: top-N heapsort  Memory: 55kB
         ->  Nested Loop  (cost=1103.75..25004.86 rows=1 width=1464) (actual time=6.763..118.114 rows=3443 loops=1)
               ->  Gather  (cost=1099.51..24993.34 rows=1 width=969) (actual time=6.464..88.294 rows=9130 loops=1)
                     Workers Planned: 1
                     Workers Launched: 1
                     ->  Nested Loop  (cost=99.51..23993.24 rows=1 width=969) (actual time=3.151..73.939 rows=4565 loops=2)
                           ->  Parallel Bitmap Heap Scan on posts  (cost=99.08..9506.46 rows=1405 width=783) (actual time=3.032..43.325 rows=7003 loops=2)
                                 Recheck Cond: (user_id = 27510)
                                 Filter: ((deleted_at IS NULL) AND (post_type = ANY ('{1,2,3,4}'::integer[])))
                                 Heap Blocks: exact=5953
                                 ->  Bitmap Index Scan on index_posts_on_user_id_and_created_at  (cost=0.00..98.48 rows=2389 width=0) (actual time=3.790..3.791 rows=20157 loops=1)
                                       Index Cond: (user_id = 27510)
                           ->  Index Scan using index_topic_links_on_post_id on topic_links  (cost=0.43..10.30 rows=1 width=186) (actual time=0.003..0.004 rows=1 loops=14006)
                                 Index Cond: (post_id = posts.id)
                                 Filter: ((NOT internal) AND (NOT reflection) AND (NOT quote) AND (user_id = 27510))
                                 Rows Removed by Filter: 0
               ->  Index Scan using topics_pkey on topics  (cost=4.24..11.52 rows=1 width=495) (actual time=0.003..0.003 rows=0 loops=9130)
                     Index Cond: (id = topic_links.topic_id)
                     Filter: ((deleted_at IS NULL) AND (deleted_at IS NULL) AND visible AND ((archetype)::text <> 'private_message'::text) AND ((category_id IS NULL) OR (hashed SubPlan 1)))
                     Rows Removed by Filter: 1
                     SubPlan 1
                       ->  Seq Scan on categories  (cost=0.00..3.74 rows=28 width=4) (actual time=0.009..0.022 rows=35 loops=1)
                             Filter: ((NOT read_restricted) OR (id = ANY ('{3,4,5,17,19,25,26,27,28}'::integer[])))
 Planning Time: 1.102 ms
 Execution Time: 122.098 ms
(28 rows)
SELECT "posts"."id" AS t0_r0, "posts"."user_id" AS t0_r1, "posts"."topic_id" AS t0_r2, "posts"."post_number" AS t0_r3, "posts"."raw" AS t0_r4, "posts"."cooked" AS t0_r5, "posts"."created_at" AS t0_r6, "posts"."updated_at" AS t0_r7, "posts"."reply_to_post_number" AS t0_r8, "posts"."reply_count" AS t0_r9, "posts"."quote_count" AS t0_r10, "posts"."deleted_at" AS t0_r11, "posts"."off_topic_count" AS t0_r12, "posts"."like_count" AS t0_r13, "posts"."incoming_link_count" AS t0_r14, "posts"."bookmark_count" AS t0_r15, "posts"."score" AS t0_r16, "posts"."reads" AS t0_r17, "posts"."post_type" AS t0_r18, "posts"."sort_order" AS t0_r19, "posts"."last_editor_id" AS t0_r20, "posts"."hidden" AS t0_r21, "posts"."hidden_reason_id" AS t0_r22, "posts"."notify_moderators_count" AS t0_r23, "posts"."spam_count" AS t0_r24, "posts"."illegal_count" AS t0_r25, "posts"."inappropriate_count" AS t0_r26, "posts"."last_version_at" AS t0_r27, "posts"."user_deleted" AS t0_r28, "posts"."reply_to_user_id" AS t0_r29, "posts"."percent_rank" AS t0_r30, "posts"."notify_user_count" AS t0_r31, "posts"."like_score" AS t0_r32, "posts"."deleted_by_id" AS t0_r33, "posts"."edit_reason" AS t0_r34, "posts"."word_count" AS t0_r35, "posts"."version" AS t0_r36, "posts"."cook_method" AS t0_r37, "posts"."wiki" AS t0_r38, "posts"."baked_at" AS t0_r39, "posts"."baked_version" AS t0_r40, "posts"."hidden_at" AS t0_r41, "posts"."self_edits" AS t0_r42, "posts"."reply_quoted" AS t0_r43, "posts"."via_email" AS t0_r44, "posts"."raw_email" AS t0_r45, "posts"."public_version" AS t0_r46, "posts"."action_code" AS t0_r47, "posts"."locked_by_id" AS t0_r48, "posts"."image_upload_id" AS t0_r49, "topics"."id" AS t1_r0, "topics"."title" AS t1_r1, "topics"."last_posted_at" AS t1_r2, "topics"."created_at" AS t1_r3, "topics"."updated_at" AS t1_r4, "topics"."views" AS t1_r5, "topics"."posts_count" AS t1_r6, "topics"."user_id" AS t1_r7, "topics"."last_post_user_id" AS t1_r8, "topics"."reply_count" AS t1_r9, "topics"."featured_user1_id" AS t1_r10, "topics"."featured_user2_id" AS t1_r11, "topics"."featured_user3_id" AS t1_r12, "topics"."deleted_at" AS t1_r13, "topics"."highest_post_number" AS t1_r14, "topics"."like_count" AS t1_r15, "topics"."incoming_link_count" AS t1_r16, "topics"."category_id" AS t1_r17, "topics"."visible" AS t1_r18, "topics"."moderator_posts_count" AS t1_r19, "topics"."closed" AS t1_r20, "topics"."archived" AS t1_r21, "topics"."bumped_at" AS t1_r22, "topics"."has_summary" AS t1_r23, "topics"."archetype" AS t1_r24, "topics"."featured_user4_id" AS t1_r25, "topics"."notify_moderators_count" AS t1_r26, "topics"."spam_count" AS t1_r27, "topics"."pinned_at" AS t1_r28, "topics"."score" AS t1_r29, "topics"."percent_rank" AS t1_r30, "topics"."subtype" AS t1_r31, "topics"."slug" AS t1_r32, "topics"."deleted_by_id" AS t1_r33, "topics"."participant_count" AS t1_r34, "topics"."word_count" AS t1_r35, "topics"."excerpt" AS t1_r36, "topics"."pinned_globally" AS t1_r37, "topics"."pinned_until" AS t1_r38, "topics"."fancy_title" AS t1_r39, "topics"."highest_staff_post_number" AS t1_r40, "topics"."featured_link" AS t1_r41, "topics"."reviewable_score" AS t1_r42, "topics"."image_upload_id" AS t1_r43, "topics"."slow_mode_seconds" AS t1_r44 FROM "posts" INNER JOIN "topics" ON "topics"."id" = "posts"."topic_id" AND ("topics"."deleted_at" IS NULL) WHERE ("posts"."deleted_at" IS NULL) AND (posts.post_type IN (1,2,3,4)) AND ("topics"."deleted_at" IS NULL) AND (topics.archetype <> 'private_message') AND "topics"."visible" = TRUE AND (topics.category_id IS NULL OR topics.category_id IN (SELECT id FROM categories WHERE NOT read_restricted OR id IN (3,4,5,17,19,25,26,27,28))) AND "posts"."user_id" = 27510 AND (post_number > 1) ORDER BY posts.like_count DESC, posts.created_at DESC LIMIT 6;
SELECT "topic_links"."id" AS t0_r0, "topic_links"."topic_id" AS t0_r1, "topic_links"."post_id" AS t0_r2, "topic_links"."user_id" AS t0_r3, "topic_links"."url" AS t0_r4, "topic_links"."domain" AS t0_r5, "topic_links"."internal" AS t0_r6, "topic_links"."link_topic_id" AS t0_r7, "topic_links"."created_at" AS t0_r8, "topic_links"."updated_at" AS t0_r9, "topic_links"."reflection" AS t0_r10, "topic_links"."clicks" AS t0_r11, "topic_links"."link_post_id" AS t0_r12, "topic_links"."title" AS t0_r13, "topic_links"."crawled_at" AS t0_r14, "topic_links"."quote" AS t0_r15, "topic_links"."extension" AS t0_r16, "topics"."id" AS t1_r0, "topics"."title" AS t1_r1, "topics"."last_posted_at" AS t1_r2, "topics"."created_at" AS t1_r3, "topics"."updated_at" AS t1_r4, "topics"."views" AS t1_r5, "topics"."posts_count" AS t1_r6, "topics"."user_id" AS t1_r7, "topics"."last_post_user_id" AS t1_r8, "topics"."reply_count" AS t1_r9, "topics"."featured_user1_id" AS t1_r10, "topics"."featured_user2_id" AS t1_r11, "topics"."featured_user3_id" AS t1_r12, "topics"."deleted_at" AS t1_r13, "topics"."highest_post_number" AS t1_r14, "topics"."like_count" AS t1_r15, "topics"."incoming_link_count" AS t1_r16, "topics"."category_id" AS t1_r17, "topics"."visible" AS t1_r18, "topics"."moderator_posts_count" AS t1_r19, "topics"."closed" AS t1_r20, "topics"."archived" AS t1_r21, "topics"."bumped_at" AS t1_r22, "topics"."has_summary" AS t1_r23, "topics"."archetype" AS t1_r24, "topics"."featured_user4_id" AS t1_r25, "topics"."notify_moderators_count" AS t1_r26, "topics"."spam_count" AS t1_r27, "topics"."pinned_at" AS t1_r28, "topics"."score" AS t1_r29, "topics"."percent_rank" AS t1_r30, "topics"."subtype" AS t1_r31, "topics"."slug" AS t1_r32, "topics"."deleted_by_id" AS t1_r33, "topics"."participant_count" AS t1_r34, "topics"."word_count" AS t1_r35, "topics"."excerpt" AS t1_r36, "topics"."pinned_globally" AS t1_r37, "topics"."pinned_until" AS t1_r38, "topics"."fancy_title" AS t1_r39, "topics"."highest_staff_post_number" AS t1_r40, "topics"."featured_link" AS t1_r41, "topics"."reviewable_score" AS t1_r42, "topics"."image_upload_id" AS t1_r43, "topics"."slow_mode_seconds" AS t1_r44, "posts"."id" AS t2_r0, "posts"."user_id" AS t2_r1, "posts"."topic_id" AS t2_r2, "posts"."post_number" AS t2_r3, "posts"."raw" AS t2_r4, "posts"."cooked" AS t2_r5, "posts"."created_at" AS t2_r6, "posts"."updated_at" AS t2_r7, "posts"."reply_to_post_number" AS t2_r8, "posts"."reply_count" AS t2_r9, "posts"."quote_count" AS t2_r10, "posts"."deleted_at" AS t2_r11, "posts"."off_topic_count" AS t2_r12, "posts"."like_count" AS t2_r13, "posts"."incoming_link_count" AS t2_r14, "posts"."bookmark_count" AS t2_r15, "posts"."score" AS t2_r16, "posts"."reads" AS t2_r17, "posts"."post_type" AS t2_r18, "posts"."sort_order" AS t2_r19, "posts"."last_editor_id" AS t2_r20, "posts"."hidden" AS t2_r21, "posts"."hidden_reason_id" AS t2_r22, "posts"."notify_moderators_count" AS t2_r23, "posts"."spam_count" AS t2_r24, "posts"."illegal_count" AS t2_r25, "posts"."inappropriate_count" AS t2_r26, "posts"."last_version_at" AS t2_r27, "posts"."user_deleted" AS t2_r28, "posts"."reply_to_user_id" AS t2_r29, "posts"."percent_rank" AS t2_r30, "posts"."notify_user_count" AS t2_r31, "posts"."like_score" AS t2_r32, "posts"."deleted_by_id" AS t2_r33, "posts"."edit_reason" AS t2_r34, "posts"."word_count" AS t2_r35, "posts"."version" AS t2_r36, "posts"."cook_method" AS t2_r37, "posts"."wiki" AS t2_r38, "posts"."baked_at" AS t2_r39, "posts"."baked_version" AS t2_r40, "posts"."hidden_at" AS t2_r41, "posts"."self_edits" AS t2_r42, "posts"."reply_quoted" AS t2_r43, "posts"."via_email" AS t2_r44, "posts"."raw_email" AS t2_r45, "posts"."public_version" AS t2_r46, "posts"."action_code" AS t2_r47, "posts"."locked_by_id" AS t2_r48, "posts"."image_upload_id" AS t2_r49 FROM "topic_links" INNER JOIN "topics" ON "topics"."id" = "topic_links"."topic_id" AND ("topics"."deleted_at" IS NULL) INNER JOIN "posts" ON "posts"."id" = "topic_links"."post_id" AND ("posts"."deleted_at" IS NULL) WHERE "posts"."user_id" = 27510 AND (posts.post_type IN (1,2,3,4)) AND ("topics"."deleted_at" IS NULL) AND (topics.archetype <> 'private_message') AND "topics"."visible" = TRUE AND (topics.category_id IS NULL OR topics.category_id IN (SELECT id FROM categories WHERE NOT read_restricted OR id IN (3,4,5,17,19,25,26,27,28))) AND "topic_links"."user_id" = 27510 AND "topic_links"."internal" = FALSE AND "topic_links"."reflection" = FALSE AND "topic_links"."quote" = FALSE ORDER BY clicks DESC, topic_links.created_at DESC LIMIT 6; 

Les deux utilisent de la mémoire pour des fichiers temporaires, et les temps d’exécution sont suffisamment bons. Peut-être que cela se stabilise ?

Je pense que cette analyse pourrait récupérer des données mises en cache. La page de résumé du site affiche des temps d’exécution beaucoup plus longs :

Je vais essayer de relancer l’analyse plus tard, une fois que le cache aura peut-être expiré.

La réindexation ne devrait être nécessaire qu’une seule fois : une fois les index au format de disque v13, ils restent dans ce format.

Oui, si votre commande VACUUM n’incluait pas ANALYZE, vous ne profitez pas de l’utilisation des index.

2 « J'aime »

D’accord, donc cette requête récupère les publications les plus aimées pour un utilisateur donné.

L’estimation ici était complètement fausse : le planificateur a estimé 2 389 lignes mais en a obtenu 20 157.

Un niveau au-dessus, un autre écart de facteur 5.

Ce sont ce genre de mauvaises estimations qui entraînent une exécution de requête terriblement lente.

2 « J'aime »

En fait, à ma connaissance, ils n’auraient jamais dû être sur un autre format que le v13. Nous avons importé les données d’un autre logiciel de forum vers une installation fraîche de Discourse qui tournait déjà sous la version 13. Pour l’instant, je cherche simplement à m’assurer que tous les index sont correctement reconstruits après l’import.

Ça a du sens. J’essaie de relancer la réindexation après avoir augmenté maintenance_work_mem. Il semble que postmaster utilise désormais plus de mémoire qu’auparavant, mais je ne sais toujours pas si le processus se terminera dans un délai raisonnable.

2 « J'aime »

Il semble qu’un travail de badge soit en cours et bloque la réindexation.

Cela tourne depuis plus de deux heures ? Cela semble étrange.

Cela pourrait aussi être lié à ce travail hebdomadaire qui est toujours en cours :

Vous pouvez exécuter une commande

sv stop unicorn

pour arrêter le serveur web pendant la réindexation de la base de données.

1 « J'aime »

Ok, donc la réindexation a finalement été terminée dans un délai raisonnable. Certains index n’ont pas pu être reconstruits et ont été ignorés, mais aucune précision supplémentaire n’a été fournie sur les raisons. Voici les statistiques des tables à présent :

Le problème de chargement du profil semble s’être aggravé :

Voici les EXPLAIN des requêtes concernées :

                                                                                                QUERY PLAN                               
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=625609.73..625610.43 rows=6 width=1278) (actual time=2058.211..2154.991 rows=6 loops=1)
   ->  Gather Merge  (cost=625609.73..630928.70 rows=45588 width=1278) (actual time=1654.881..1751.658 rows=6 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Sort  (cost=624609.70..624666.69 rows=22794 width=1278) (actual time=1619.464..1619.469 rows=5 loops=3)
               Sort Key: posts.like_count DESC, posts.created_at DESC
               Sort Method: top-N heapsort  Memory: 37kB
               Worker 0:  Sort Method: top-N heapsort  Memory: 37kB
               Worker 1:  Sort Method: top-N heapsort  Memory: 38kB
               ->  Parallel Hash Join  (cost=63000.65..624201.13 rows=22794 width=1278) (actual time=1383.820..1600.913 rows=20578 loops=3)
                     Hash Cond: (posts.topic_id = topics.id)
                     ->  Parallel Bitmap Heap Scan on posts  (cost=1875.97..562899.73 rows=67322 width=783) (actual time=63.310..264.627 rows=20579 loops=3)
                           Recheck Cond: ((user_id = 7237) AND (deleted_at IS NULL))
                           Filter: ((post_number > 1) AND (post_type = ANY ('{1,2,3,4}'::integer[])))
                           Rows Removed by Filter: 39566
                           Heap Blocks: exact=50390
                           ->  Bitmap Index Scan on idx_posts_user_id_deleted_at  (cost=0.00..1835.58 rows=167352 width=0) (actual time=36.507..36.508 rows=180435 loops=1)
                                 Index Cond: (user_id = 7237)
                     ->  Parallel Hash  (cost=59504.20..59504.20 rows=129638 width=495) (actual time=1319.362..1319.364 rows=131785 loops=3)
                           Buckets: 524288  Batches: 1  Memory Usage: 190560kB
                           ->  Parallel Seq Scan on topics  (cost=3.81..59504.20 rows=129638 width=495) (actual time=316.007..1204.917 rows=131785 loops=3)
                                 Filter: ((deleted_at IS NULL) AND (deleted_at IS NULL) AND visible AND ((archetype)::text <> 'private_message'::text) AND ((category_id IS NULL) OR (hashed SubPlan 1)))
                                 Rows Removed by Filter: 174529
                                 SubPlan 1
                                   ->  Seq Scan on categories  (cost=0.00..3.74 rows=28 width=4) (actual time=17.841..17.887 rows=35 loops=3)
                                         Filter: ((NOT read_restricted) OR (id = ANY ('{3,4,5,17,19,25,26,27,28}'::integer[])))
 Planning Time: 0.751 ms
 JIT:
   Functions: 91
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 11.057 ms, Inlining 134.342 ms, Optimization 762.277 ms, Emission 451.708 ms, Total 1359.384 ms
 Execution Time: 2204.845 ms
(32 rows)
                                                                                             QUERY PLAN                                  
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=726143.96..726144.66 rows=6 width=1464) (actual time=5250.325..5305.144 rows=6 loops=1)
   ->  Gather Merge  (cost=726143.96..726351.17 rows=1776 width=1464) (actual time=4800.064..4854.881 rows=6 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Sort  (cost=725143.93..725146.15 rows=888 width=1464) (actual time=4762.610..4762.615 rows=5 loops=3)
               Sort Key: topic_links.clicks DESC, topic_links.created_at DESC
               Sort Method: top-N heapsort  Memory: 36kB
               Worker 0:  Sort Method: top-N heapsort  Memory: 37kB
               Worker 1:  Sort Method: top-N heapsort  Memory: 32kB
               ->  Nested Loop  (cost=574851.64..725128.02 rows=888 width=1464) (actual time=649.373..4710.853 rows=39385 loops=3)
                     ->  Parallel Hash Join  (cost=574847.40..695445.64 rows=2624 width=969) (actual time=630.974..3847.821 rows=336541 loops=3)
                           Hash Cond: (topic_links.post_id = posts.id)
                           ->  Parallel Bitmap Heap Scan on topic_links  (cost=11248.93..130753.84 rows=416505 width=186) (actual time=58.964..3095.540 rows=336541 loops=3)
                                 Recheck Cond: (user_id = 7237)
                                 Filter: ((NOT internal) AND (NOT reflection) AND (NOT quote))
                                 Rows Removed by Filter: 8
                                 Heap Blocks: exact=23544
                                 ->  Bitmap Index Scan on index_topic_links_on_user_id  (cost=0.00..10999.02 rows=1000879 width=0) (actual time=45.320..45.322 rows=1009648 loops=1)
                                       Index Cond: (user_id = 7237)
                           ->  Parallel Hash  (cost=562726.85..562726.85 rows=69730 width=783) (actual time=571.264..571.266 rows=60145 loops=3)
                                 Buckets: 262144  Batches: 1  Memory Usage: 71264kB
                                 ->  Parallel Bitmap Heap Scan on posts  (cost=1877.42..562726.85 rows=69730 width=783) (actual time=377.440..481.561 rows=60145 loops=3)
                                       Recheck Cond: ((user_id = 7237) AND (deleted_at IS NULL))
                                       Filter: (post_type = ANY ('{1,2,3,4}'::integer[]))
                                       Heap Blocks: exact=141396
                                       ->  Bitmap Index Scan on idx_posts_user_id_deleted_at  (cost=0.00..1835.58 rows=167352 width=0) (actual time=29.225..29.226 rows=180435 loops=1)
                                             Index Cond: (user_id = 7237)
                     ->  Index Scan using topics_pkey on topics  (cost=4.24..11.31 rows=1 width=495) (actual time=0.002..0.002 rows=0 loops=1009624)
                           Index Cond: (id = topic_links.topic_id)
                           Filter: ((deleted_at IS NULL) AND (deleted_at IS NULL) AND visible AND ((archetype)::text <> 'private_message'::text) AND ((category_id IS NULL) OR (hashed SubPlan 1)))
                           Rows Removed by Filter: 1
                           SubPlan 1
                             ->  Seq Scan on categories  (cost=0.00..3.74 rows=28 width=4) (actual time=17.793..17.835 rows=35 loops=3)
                                   Filter: ((NOT read_restricted) OR (id = ANY ('{3,4,5,17,19,25,26,27,28}'::integer[])))
 Planning Time: 4.526 ms
 JIT:
   Functions: 118
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 14.724 ms, Inlining 94.921 ms, Optimization 934.359 ms, Emission 525.383 ms, Total 1569.387 ms
 Execution Time: 5309.478 ms
(40 rows)

J’ai ajouté les index que j’avais proposés précédemment pour aider à résoudre ce problème (Slow Page Loads on User Profiles - #2 by Ghan), et cela a considérablement réduit les temps de chargement.

Plan d’exécution de la requête après l’ajout des index :

                                                                                                 QUERY PLAN                                                    
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1004.13..25136.02 rows=6 width=1473) (actual time=84.592..96.171 rows=6 loops=1)
   ->  Gather Merge  (cost=1004.13..9649737.42 rows=2399 width=1473) (actual time=84.591..96.168 rows=6 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Nested Loop  (cost=4.11..9648460.49 rows=1000 width=1473) (actual time=60.452..60.471 rows=4 loops=3)
               ->  Nested Loop  (cost=0.87..9619828.27 rows=2943 width=977) (actual time=0.115..47.798 rows=6058 loops=3)
                     ->  Parallel Index Scan using index_topic_links_on_clicks_and_created_desc on topic_links  (cost=0.43..6728256.06 rows=403188 width=186) (actual time=0.051..30.428 rows=6058 loops=3)
                           Filter: ((NOT internal) AND (NOT reflection) AND (NOT quote) AND (user_id = 7237))
                           Rows Removed by Filter: 54345
                     ->  Index Scan using posts_pkey on posts  (cost=0.44..7.17 rows=1 width=791) (actual time=0.002..0.002 rows=1 loops=18173)
                           Index Cond: (id = topic_links.post_id)
                           Filter: ((deleted_at IS NULL) AND (user_id = 7237) AND (post_type = ANY ('{1,2,3,4}'::integer[])))
               ->  Index Scan using topics_pkey on topics  (cost=3.24..9.73 rows=1 width=496) (actual time=0.002..0.002 rows=0 loops=18173)
                     Index Cond: (id = topic_links.topic_id)
                     Filter: ((deleted_at IS NULL) AND (deleted_at IS NULL) AND visible AND ((archetype)::text <> 'private_message'::text) AND ((category_id IS NULL) OR (hashed SubPlan 1)))
                     Rows Removed by Filter: 1
                     SubPlan 1
                       ->  Seq Scan on categories  (cost=0.00..2.74 rows=28 width=4) (actual time=0.031..0.049 rows=35 loops=3)
                             Filter: ((NOT read_restricted) OR (id = ANY ('{3,4,5,17,19,25,26,27,28}'::integer[])))
 Planning Time: 1.205 ms
 Execution Time: 96.258 ms
(21 rows)

Voici les index au cas où quelqu’un souhaiterait les ajouter :

CREATE INDEX index_topic_links_on_clicks_and_created ON public.topic_links USING btree (clicks, created_at);
CREATE INDEX index_posts_on_like_count_and_created ON public.posts USING btree (like_count, created_at);
CREATE INDEX index_topic_links_on_clicks_and_created_desc ON public.topic_links USING btree (clicks DESC, created_at DESC);
CREATE INDEX index_posts_on_like_count_and_created_desc ON public.posts USING btree (like_count DESC, created_at DESC, user_id) WHERE deleted_at IS NULL AND post_number > 1 AND (post_type = ANY ('{1,2,3,4}'::integer[]));
1 « J'aime »

@codinghorror

ici et ici, vous avez demandé si ces index étaient manquants. Leur simple ajout fait une différence de jour et de nuit en termes de vitesse de chargement des profils. Ces index devraient absolument être envisagés.

Il est vrai que le cache ajouté ici : Slow Page Loads on User Profiles - #12 by codinghorror a fait une énorme différence, mais cela ne concerne que les chargements ultérieurs. Les index améliorent l’expérience initiale dans l’ensemble. Les utilisateurs ne devraient pas être confrontés à des temps de chargement lents, même si ces derniers s’améliorent grâce au cache pendant la période où il est actif. L’objectif était d’améliorer le tout premier chargement (lorsque le cache expire), et cela y parvient largement.

4 « J'aime »

Peut-être après les fêtes, pouvons-nous examiner ces index @sam @falco ?

2 « J'aime »

Je ne comprends pas pourquoi un index sur chaque ligne, trié par nombre de clics ou de likes, est nécessaire pour une page utilisateur limitée à un identifiant utilisateur.

2 « J'aime »

En ce qui concerne les clics, dans notre cas, c’est ici que se situe le gain :

->  Sort  (cost=725143.93..725146.15 rows=888 width=1464) (actual time=4762.610..4762.615 rows=5 loops=3)
               Sort Key: topic_links.clicks DESC, topic_links.created_at DESC

Ce tri prend 4 secondes, ce qui explique en partie pourquoi les profils se chargent si lentement. Il s’agit d’une seule requête parmi de nombreuses autres sur la page de profil (bien que l’une des plus coûteuses en termes de vitesse). Après l’ajout de l’index, le planificateur de requêtes le prend en compte ici, réduisant le temps d’exécution de cette sous-arbre à environ 30 ms :

 ->  Parallel Index Scan using index_topic_links_on_clicks_and_created_desc on topic_links  (cost=0.43..6728256.06 rows=403188 width=186) (actual time=0.051..30.428 rows=6058 loops=3)

Bien sûr, le plan de requête change également de manière, mais l’index est clairement référencé ici et la requête s’exécute beaucoup plus rapidement.

1 « J'aime »

Mais conceptuellement, tout ce qui nous importe, ce sont les liens limités à l’utilisateur. Pourquoi les nouveaux index proposés ne sont-ils pas limités aux utilisateurs ?

1 « J'aime »