Caricamenti lenti del profilo con database da oltre 100 GB

Ci sono alcuni altri argomenti riguardanti l’ottimizzazione di istanze di grandi dimensioni. I principi per eseguire un database PostgreSQL di grandi dimensioni sono gli stessi di MySQL, ma il diavolo si nasconde nei dettagli.

Stiamo già cercando su internet e qui su Meta. Potresti indicarci qualche argomento che ritieni possa essere utile?

Le righe del tuo attuale app.yml sono commentate:

Decommenta queste righe e aumenta i valori di conseguenza. Dopo di ciò, è necessario ricostruire.

3 Mi Piace

L’ho fatto e ho ricostruito il container. Sto cercando di eseguire di nuovo il reindicizzazione, ma sembra che si blocchi di nuovo. Ecco cosa ho per questi due:

C’è un modo per sapere se work_mem potrebbe dover essere aumentato? Vedo che il disco viene utilizzato in modo piuttosto intenso, quindi penso che possa essere in corso un ordinamento del filesystem o qualcosa di simile, ma non sono sicuro di come confermare se questo sia il caso o se stia accadendo qualcos’altro.

1 Mi Piace

Usa le query che hai elencato nel primo post. Esegui quelle in un terminale psql precedute da EXPLAIN ANALYZE e incolla qui l’output.

2 Mi Piace

Looks like there’s some caching at work so subsequent loads tend to be faster, but > 5 seconds is still brutal when the cache expires.

                                                                                             QUERY PLAN                          
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=26396.41..26397.10 rows=6 width=1278) (actual time=90.879..114.499 rows=6 loops=1)
   ->  Gather Merge  (cost=26396.41..26448.85 rows=456 width=1278) (actual time=90.877..114.496 rows=6 loops=1)
         Workers Planned: 1
         Workers Launched: 1
         ->  Sort  (cost=25396.40..25397.54 rows=456 width=1278) (actual time=87.195..87.197 rows=5 loops=2)
               Sort Key: posts.like_count DESC, posts.created_at DESC
               Sort Method: top-N heapsort  Memory: 38kB
               Worker 0:  Sort Method: top-N heapsort  Memory: 39kB
               ->  Nested Loop  (cost=103.30..25388.23 rows=456 width=1278) (actual time=3.200..83.497 rows=3610 loops=2)
                     ->  Parallel Bitmap Heap Scan on posts  (cost=99.06..9509.95 rows=1356 width=783) (actual time=3.075..56.880 rows=5932 loops=2)
                           Recheck Cond: (user_id = 27510)
                           Filter: ((deleted_at IS NULL) AND (post_number > 1) AND (post_type = ANY ('{1,2,3,4}'::integer[])))
                           Rows Removed by Filter: 1071
                           Heap Blocks: exact=6272
                           ->  Bitmap Index Scan on index_posts_on_user_id_and_created_at  (cost=0.00..98.48 rows=2389 width=0) (actual time=3.916..3.917 rows=20157 loops=1)
                                 Index Cond: (user_id = 27510)
                     ->  Index Scan using topics_pkey on topics  (cost=4.24..11.71 rows=1 width=495) (actual time=0.004..0.004 rows=1 loops=11864)
                           Index Cond: (id = posts.topic_id)
                           Filter: ((deleted_at IS NULL) AND (deleted_at IS NULL) AND visible AND ((archetype)::text <> 'private_message'::text) AND ((category_id IS NULL) OR (hashed SubPlan 1)))
                           Rows Removed by Filter: 0
                           SubPlan 1
                             ->  Seq Scan on categories  (cost=0.00..3.74 rows=28 width=4) (actual time=0.013..0.029 rows=35 loops=2)
                                   Filter: ((NOT read_restricted) OR (id = ANY ('{3,4,5,17,19,25,26,27,28}'::integer[])))
 Planning Time: 2.535 ms
 Execution Time: 114.657 ms
(25 rows)
                                                                                          QUERY PLAN                             
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=25004.87..25004.87 rows=1 width=1464) (actual time=98.136..121.987 rows=6 loops=1)
   ->  Sort  (cost=25004.87..25004.87 rows=1 width=1464) (actual time=98.134..121.984 rows=6 loops=1)
         Sort Key: topic_links.clicks DESC, topic_links.created_at DESC
         Sort Method: top-N heapsort  Memory: 55kB
         ->  Nested Loop  (cost=1103.75..25004.86 rows=1 width=1464) (actual time=6.763..118.114 rows=3443 loops=1)
               ->  Gather  (cost=1099.51..24993.34 rows=1 width=969) (actual time=6.464..88.294 rows=9130 loops=1)
                     Workers Planned: 1
                     Workers Launched: 1
                     ->  Nested Loop  (cost=99.51..23993.24 rows=1 width=969) (actual time=3.151..73.939 rows=4565 loops=2)
                           ->  Parallel Bitmap Heap Scan on posts  (cost=99.08..9506.46 rows=1405 width=783) (actual time=3.032..43.325 rows=7003 loops=2)
                                 Recheck Cond: (user_id = 27510)
                                 Filter: ((deleted_at IS NULL) AND (post_type = ANY ('{1,2,3,4}'::integer[])))
                                 Heap Blocks: exact=5953
                                 ->  Bitmap Index Scan on index_posts_on_user_id_and_created_at  (cost=0.00..98.48 rows=2389 width=0) (actual time=3.790..3.791 rows=20157 loops=1)
                                       Index Cond: (user_id = 27510)
                           ->  Index Scan using index_topic_links_on_post_id on topic_links  (cost=0.43..10.30 rows=1 width=186) (actual time=0.003..0.004 rows=1 loops=14006)
                                 Index Cond: (post_id = posts.id)
                                 Filter: ((NOT internal) AND (NOT reflection) AND (NOT quote) AND (user_id = 27510))
                                 Rows Removed by Filter: 0
               ->  Index Scan using topics_pkey on topics  (cost=4.24..11.52 rows=1 width=495) (actual time=0.003..0.003 rows=0 loops=9130)
                     Index Cond: (id = topic_links.topic_id)
                     Filter: ((deleted_at IS NULL) AND (deleted_at IS NULL) AND visible AND ((archetype)::text <> 'private_message'::text) AND ((category_id IS NULL) OR (hashed SubPlan 1)))
                     Rows Removed by Filter: 1
                     SubPlan 1
                       ->  Seq Scan on categories  (cost=0.00..3.74 rows=28 width=4) (actual time=0.009..0.022 rows=35 loops=1)
                             Filter: ((NOT read_restricted) OR (id = ANY ('{3,4,5,17,19,25,26,27,28}'::integer[])))
 Planning Time: 1.102 ms
 Execution Time: 122.098 ms
(28 rows)
SELECT "posts"."id" AS t0_r0, "posts"."user_id" AS t0_r1, "posts"."topic_id" AS t0_r2, "posts"."post_number" AS t0_r3, "posts"."raw" AS t0_r4, "posts"."cooked" AS t0_r5, "posts"."created_at" AS t0_r6, "posts"."updated_at" AS t0_r7, "posts"."reply_to_post_number" AS t0_r8, "posts"."reply_count" AS t0_r9, "posts"."quote_count" AS t0_r10, "posts"."deleted_at" AS t0_r11, "posts"."off_topic_count" AS t0_r12, "posts"."like_count" AS t0_r13, "posts"."incoming_link_count" AS t0_r14, "posts"."bookmark_count" AS t0_r15, "posts"."score" AS t0_r16, "posts"."reads" AS t0_r17, "posts"."post_type" AS t0_r18, "posts"."sort_order" AS t0_r19, "posts"."last_editor_id" AS t0_r20, "posts"."hidden" AS t0_r21, "posts"."hidden_reason_id" AS t0_r22, "posts"."notify_moderators_count" AS t0_r23, "posts"."spam_count" AS t0_r24, "posts"."illegal_count" AS t0_r25, "posts"."inappropriate_count" AS t0_r26, "posts"."last_version_at" AS t0_r27, "posts"."user_deleted" AS t0_r28, "posts"."reply_to_user_id" AS t0_r29, "posts"."percent_rank" AS t0_r30, "posts"."notify_user_count" AS t0_r31, "posts"."like_score" AS t0_r32, "posts"."deleted_by_id" AS t0_r33, "posts"."edit_reason" AS t0_r34, "posts"."word_count" AS t0_r35, "posts"."version" AS t0_r36, "posts"."cook_method" AS t0_r37, "posts"."wiki" AS t0_r38, "posts"."baked_at" AS t0_r39, "posts"."baked_version" AS t0_r40, "posts"."hidden_at" AS t0_r41, "posts"."self_edits" AS t0_r42, "posts"."reply_quoted" AS t0_r43, "posts"."via_email" AS t0_r44, "posts"."raw_email" AS t0_r45, "posts"."public_version" AS t0_r46, "posts"."action_code" AS t0_r47, "posts"."locked_by_id" AS t0_r48, "posts"."image_upload_id" AS t0_r49, "topics"."id" AS t1_r0, "topics"."title" AS t1_r1, "topics"."last_posted_at" AS t1_r2, "topics"."created_at" AS t1_r3, "topics"."updated_at" AS t1_r4, "topics"."views" AS t1_r5, "topics"."posts_count" AS t1_r6, "topics"."user_id" AS t1_r7, "topics"."last_post_user_id" AS t1_r8, "topics"."reply_count" AS t1_r9, "topics"."featured_user1_id" AS t1_r10, "topics"."featured_user2_id" AS t1_r11, "topics"."featured_user3_id" AS t1_r12, "topics"."deleted_at" AS t1_r13, "topics"."highest_post_number" AS t1_r14, "topics"."like_count" AS t1_r15, "topics"."incoming_link_count" AS t1_r16, "topics"."category_id" AS t1_r17, "topics"."visible" AS t1_r18, "topics"."moderator_posts_count" AS t1_r19, "topics"."closed" AS t1_r20, "topics"."archived" AS t1_r21, "topics"."bumped_at" AS t1_r22, "topics"."has_summary" AS t1_r23, "topics"."archetype" AS t1_r24, "topics"."featured_user4_id" AS t1_r25, "topics"."notify_moderators_count" AS t1_r26, "topics"."spam_count" AS t1_r27, "topics"."pinned_at" AS t1_r28, "topics"."score" AS t1_r29, "topics"."percent_rank" AS t1_r30, "topics"."subtype" AS t1_r31, "topics"."slug" AS t1_r32, "topics"."deleted_by_id" AS t1_r33, "topics"."participant_count" AS t1_r34, "topics"."word_count" AS t1_r35, "topics"."excerpt" AS t1_r36, "topics"."pinned_globally" AS t1_r37, "topics"."pinned_until" AS t1_r38, "topics"."fancy_title" AS t1_r39, "topics"."highest_staff_post_number" AS t1_r40, "topics"."featured_link" AS t1_r41, "topics"."reviewable_score" AS t1_r42, "topics"."image_upload_id" AS t1_r43, "topics"."slow_mode_seconds" AS t1_r44 FROM "posts" INNER JOIN "topics" ON "topics"."id" = "posts"."topic_id" AND ("topics"."deleted_at" IS NULL) WHERE ("posts"."deleted_at" IS NULL) AND (posts.post_type IN (1,2,3,4)) AND ("topics"."deleted_at" IS NULL) AND (topics.archetype <> 'private_message') AND "topics"."visible" = TRUE AND (topics.category_id IS NULL OR topics.category_id IN (SELECT id FROM categories WHERE NOT read_restricted OR id IN (3,4,5,17,19,25,26,27,28))) AND "posts"."user_id" = 27510 AND (post_number > 1) ORDER BY posts.like_count DESC, posts.created_at DESC LIMIT 6;
SELECT "topic_links"."id" AS t0_r0, "topic_links"."topic_id" AS t0_r1, "topic_links"."post_id" AS t0_r2, "topic_links"."user_id" AS t0_r3, "topic_links"."url" AS t0_r4, "topic_links"."domain" AS t0_r5, "topic_links"."internal" AS t0_r6, "topic_links"."link_topic_id" AS t0_r7, "topic_links"."created_at" AS t0_r8, "topic_links"."updated_at" AS t0_r9, "topic_links"."reflection" AS t0_r10, "topic_links"."clicks" AS t0_r11, "topic_links"."link_post_id" AS t0_r12, "topic_links"."title" AS t0_r13, "topic_links"."crawled_at" AS t0_r14, "topic_links"."quote" AS t0_r15, "topic_links"."extension" AS t0_r16, "topics"."id" AS t1_r0, "topics"."title" AS t1_r1, "topics"."last_posted_at" AS t1_r2, "topics"."created_at" AS t1_r3, "topics"."updated_at" AS t1_r4, "topics"."views" AS t1_r5, "topics"."posts_count" AS t1_r6, "topics"."user_id" AS t1_r7, "topics"."last_post_user_id" AS t1_r8, "topics"."reply_count" AS t1_r9, "topics"."featured_user1_id" AS t1_r10, "topics"."featured_user2_id" AS t1_r11, "topics"."featured_user3_id" AS t1_r12, "topics"."deleted_at" AS t1_r13, "topics"."highest_post_number" AS t1_r14, "topics"."like_count" AS t1_r15, "topics"."incoming_link_count" AS t1_r16, "topics"."category_id" AS t1_r17, "topics"."visible" AS t1_r18, "topics"."moderator_posts_count" AS t1_r19, "topics"."closed" AS t1_r20, "topics"."archived" AS t1_r21, "topics"."bumped_at" AS t1_r22, "topics"."has_summary" AS t1_r23, "topics"."archetype" AS t1_r24, "topics"."featured_user4_id" AS t1_r25, "topics"."notify_moderators_count" AS t1_r26, "topics"."spam_count" AS t1_r27, "topics"."pinned_at" AS t1_r28, "topics"."score" AS t1_r29, "topics"."percent_rank" AS t1_r30, "topics"."subtype" AS t1_r31, "topics"."slug" AS t1_r32, "topics"."deleted_by_id" AS t1_r33, "topics"."participant_count" AS t1_r34, "topics"."word_count" AS t1_r35, "topics"."excerpt" AS t1_r36, "topics"."pinned_globally" AS t1_r37, "topics"."pinned_until" AS t1_r38, "topics"."fancy_title" AS t1_r39, "topics"."highest_staff_post_number" AS t1_r40, "topics"."featured_link" AS t1_r41, "topics"."reviewable_score" AS t1_r42, "topics"."image_upload_id" AS t1_r43, "topics"."slow_mode_seconds" AS t1_r44, "posts"."id" AS t2_r0, "posts"."user_id" AS t2_r1, "posts"."topic_id" AS t2_r2, "posts"."post_number" AS t2_r3, "posts"."raw" AS t2_r4, "posts"."cooked" AS t2_r5, "posts"."created_at" AS t2_r6, "posts"."updated_at" AS t2_r7, "posts"."reply_to_post_number" AS t2_r8, "posts"."reply_count" AS t2_r9, "posts"."quote_count" AS t2_r10, "posts"."deleted_at" AS t2_r11, "posts"."off_topic_count" AS t2_r12, "posts"."like_count" AS t2_r13, "posts"."incoming_link_count" AS t2_r14, "posts"."bookmark_count" AS t2_r15, "posts"."score" AS t2_r16, "posts"."reads" AS t2_r17, "posts"."post_type" AS t2_r18, "posts"."sort_order" AS t2_r19, "posts"."last_editor_id" AS t2_r20, "posts"."hidden" AS t2_r21, "posts"."hidden_reason_id" AS t2_r22, "posts"."notify_moderators_count" AS t2_r23, "posts"."spam_count" AS t2_r24, "posts"."illegal_count" AS t2_r25, "posts"."inappropriate_count" AS t2_r26, "posts"."last_version_at" AS t2_r27, "posts"."user_deleted" AS t2_r28, "posts"."reply_to_user_id" AS t2_r29, "posts"."percent_rank" AS t2_r30, "posts"."notify_user_count" AS t2_r31, "posts"."like_score" AS t2_r32, "posts"."deleted_by_id" AS t2_r33, "posts"."edit_reason" AS t2_r34, "posts"."word_count" AS t2_r35, "posts"."version" AS t2_r36, "posts"."cook_method" AS t2_r37, "posts"."wiki" AS t2_r38, "posts"."baked_at" AS t2_r39, "posts"."baked_version" AS t2_r40, "posts"."hidden_at" AS t2_r41, "posts"."self_edits" AS t2_r42, "posts"."reply_quoted" AS t2_r43, "posts"."via_email" AS t2_r44, "posts"."raw_email" AS t2_r45, "posts"."public_version" AS t2_r46, "posts"."action_code" AS t2_r47, "posts"."locked_by_id" AS t2_r48, "posts"."image_upload_id" AS t2_r49 FROM "topic_links" INNER JOIN "topics" ON "topics"."id" = "topic_links"."topic_id" AND ("topics"."deleted_at" IS NULL) INNER JOIN "posts" ON "posts"."id" = "topic_links"."post_id" AND ("posts"."deleted_at" IS NULL) WHERE "posts"."user_id" = 27510 AND (posts.post_type IN (1,2,3,4)) AND ("topics"."deleted_at" IS NULL) AND (topics.archetype <> 'private_message') AND "topics"."visible" = TRUE AND (topics.category_id IS NULL OR topics.category_id IN (SELECT id FROM categories WHERE NOT read_restricted OR id IN (3,4,5,17,19,25,26,27,28))) AND "topic_links"."user_id" = 27510 AND "topic_links"."internal" = FALSE AND "topic_links"."reflection" = FALSE AND "topic_links"."quote" = FALSE ORDER BY clicks DESC, topic_links.created_at DESC LIMIT 6; 

Entrambi utilizzano memoria per file temporanei e i tempi di esecuzione sono sufficientemente buoni. Forse si sta stabilizzando?

Credo che questa analisi stia recuperando alcuni dati memorizzati nella cache. La pagina di riepilogo del sito mostra tempi di esecuzione molto più lunghi:

Proverò a rieseguire l’analisi più tardi, una volta che la cache sarà probabilmente scaduta.

Il reindicizzazione dovrebbe essere necessario solo una volta: una volta che gli indici sono nel formato del disco v13, rimangono nel formato del disco v13.

Sì, se il tuo VACUUM non includeva ANALYZE, stai perdendo l’utilizzo degli indici.

2 Mi Piace

Ok, quindi questa query recupera i post più apprezzati per un determinato utente.

La stima qui era completamente sbagliata: il pianificatore ha previsto 2.389 righe ma ne ha ottenute 20.157.

Un livello più su, un’altra discrepanza di 5 volte.

Sono proprio questo tipo di stime errate a causare prestazioni terribili delle query.

2 Mi Piace

In realtà, per quanto ne so, non dovrebbero mai essere stati in nessun formato diverso da quello del disco v13. Abbiamo importato i dati da un altro software di forum su una nuova installazione di Discourse che già eseguiva la versione v13. Al momento sto solo cercando di assicurarmi che tutti gli indici siano stati ricostruiti correttamente dopo l’importazione.

Ha senso. Sto cercando di eseguire di nuovo la reindicizzazione dopo aver aumentato maintenance_work_mem. Sembra che postmaster stia utilizzando più memoria rispetto alla volta precedente, ma non so ancora se l’operazione si completerà in un tempo ragionevole.

2 Mi Piace

Sembra che ci sia un’attività di badge in esecuzione che sta bloccando la reindicizzazione.

Sta funzionando da oltre due ore? Sembra strano.

Potrebbe esserci qualcosa in questo lavoro settimanale che è ancora in esecuzione:

Potresti eseguire

sv stop unicorn

per arrestare il server web durante la ricreazione dell’indice del database.

1 Mi Piace

Ok, quindi la reindicizzazione è finalmente completata in un tempo ragionevole. Alcuni indici non sono stati ricostruiti e sono stati saltati, ma non sono state fornite ulteriori dettagli sul motivo. Ecco le statistiche delle tabelle ora:

Il problema del caricamento del profilo sembra essere peggiorato:

Ecco gli EXPLAIN di queste query:

                                                                                                QUERY PLAN                               
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=625609.73..625610.43 rows=6 width=1278) (actual time=2058.211..2154.991 rows=6 loops=1)
   ->  Gather Merge  (cost=625609.73..630928.70 rows=45588 width=1278) (actual time=1654.881..1751.658 rows=6 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Sort  (cost=624609.70..624666.69 rows=22794 width=1278) (actual time=1619.464..1619.469 rows=5 loops=3)
               Sort Key: posts.like_count DESC, posts.created_at DESC
               Sort Method: top-N heapsort  Memory: 37kB
               Worker 0:  Sort Method: top-N heapsort  Memory: 37kB
               Worker 1:  Sort Method: top-N heapsort  Memory: 38kB
               ->  Parallel Hash Join  (cost=63000.65..624201.13 rows=22794 width=1278) (actual time=1383.820..1600.913 rows=20578 loops=3)
                     Hash Cond: (posts.topic_id = topics.id)
                     ->  Parallel Bitmap Heap Scan on posts  (cost=1875.97..562899.73 rows=67322 width=783) (actual time=63.310..264.627 rows=20579 loops=3)
                           Recheck Cond: ((user_id = 7237) AND (deleted_at IS NULL))
                           Filter: ((post_number > 1) AND (post_type = ANY ('{1,2,3,4}'::integer[])))
                           Rows Removed by Filter: 39566
                           Heap Blocks: exact=50390
                           ->  Bitmap Index Scan on idx_posts_user_id_deleted_at  (cost=0.00..1835.58 rows=167352 width=0) (actual time=36.507..36.508 rows=180435 loops=1)
                                 Index Cond: (user_id = 7237)
                     ->  Parallel Hash  (cost=59504.20..59504.20 rows=129638 width=495) (actual time=1319.362..1319.364 rows=131785 loops=3)
                           Buckets: 524288  Batches: 1  Memory Usage: 190560kB
                           ->  Parallel Seq Scan on topics  (cost=3.81..59504.20 rows=129638 width=495) (actual time=316.007..1204.917 rows=131785 loops=3)
                                 Filter: ((deleted_at IS NULL) AND (deleted_at IS NULL) AND visible AND ((archetype)::text <> 'private_message'::text) AND ((category_id IS NULL) OR (hashed SubPlan 1)))
                                 Rows Removed by Filter: 174529
                                 SubPlan 1
                                   ->  Seq Scan on categories  (cost=0.00..3.74 rows=28 width=4) (actual time=17.841..17.887 rows=35 loops=3)
                                         Filter: ((NOT read_restricted) OR (id = ANY ('{3,4,5,17,19,25,26,27,28}'::integer[])))
 Planning Time: 0.751 ms
 JIT:
   Functions: 91
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 11.057 ms, Inlining 134.342 ms, Optimization 762.277 ms, Emission 451.708 ms, Total 1359.384 ms
 Execution Time: 2204.845 ms
(32 rows)
                                                                                             QUERY PLAN                                  
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=726143.96..726144.66 rows=6 width=1464) (actual time=5250.325..5305.144 rows=6 loops=1)
   ->  Gather Merge  (cost=726143.96..726351.17 rows=1776 width=1464) (actual time=4800.064..4854.881 rows=6 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Sort  (cost=725143.93..725146.15 rows=888 width=1464) (actual time=4762.610..4762.615 rows=5 loops=3)
               Sort Key: topic_links.clicks DESC, topic_links.created_at DESC
               Sort Method: top-N heapsort  Memory: 36kB
               Worker 0:  Sort Method: top-N heapsort  Memory: 37kB
               Worker 1:  Sort Method: top-N heapsort  Memory: 32kB
               ->  Nested Loop  (cost=574851.64..725128.02 rows=888 width=1464) (actual time=649.373..4710.853 rows=39385 loops=3)
                     ->  Parallel Hash Join  (cost=574847.40..695445.64 rows=2624 width=969) (actual time=630.974..3847.821 rows=336541 loops=3)
                           Hash Cond: (topic_links.post_id = posts.id)
                           ->  Parallel Bitmap Heap Scan on topic_links  (cost=11248.93..130753.84 rows=416505 width=186) (actual time=58.964..3095.540 rows=336541 loops=3)
                                 Recheck Cond: (user_id = 7237)
                                 Filter: ((NOT internal) AND (NOT reflection) AND (NOT quote))
                                 Rows Removed by Filter: 8
                                 Heap Blocks: exact=23544
                                 ->  Bitmap Index Scan on index_topic_links_on_user_id  (cost=0.00..10999.02 rows=1000879 width=0) (actual time=45.320..45.322 rows=1009648 loops=1)
                                       Index Cond: (user_id = 7237)
                           ->  Parallel Hash  (cost=562726.85..562726.85 rows=69730 width=783) (actual time=571.264..571.266 rows=60145 loops=3)
                                 Buckets: 262144  Batches: 1  Memory Usage: 71264kB
                                 ->  Parallel Bitmap Heap Scan on posts  (cost=1877.42..562726.85 rows=69730 width=783) (actual time=377.440..481.561 rows=60145 loops=3)
                                       Recheck Cond: ((user_id = 7237) AND (deleted_at IS NULL))
                                       Filter: (post_type = ANY ('{1,2,3,4}'::integer[]))
                                       Heap Blocks: exact=141396
                                       ->  Bitmap Index Scan on idx_posts_user_id_deleted_at  (cost=0.00..1835.58 rows=167352 width=0) (actual time=29.225..29.226 rows=180435 loops=1)
                                             Index Cond: (user_id = 7237)
                     ->  Index Scan using topics_pkey on topics  (cost=4.24..11.31 rows=1 width=495) (actual time=0.002..0.002 rows=0 loops=1009624)
                           Index Cond: (id = topic_links.topic_id)
                           Filter: ((deleted_at IS NULL) AND (deleted_at IS NULL) AND visible AND ((archetype)::text <> 'private_message'::text) AND ((category_id IS NULL) OR (hashed SubPlan 1)))
                           Rows Removed by Filter: 1
                           SubPlan 1
                             ->  Seq Scan on categories  (cost=0.00..3.74 rows=28 width=4) (actual time=17.793..17.835 rows=35 loops=3)
                                   Filter: ((NOT read_restricted) OR (id = ANY ('{3,4,5,17,19,25,26,27,28}'::integer[])))
 Planning Time: 4.526 ms
 JIT:
   Functions: 118
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 14.724 ms, Inlining 94.921 ms, Optimization 934.359 ms, Emission 525.383 ms, Total 1569.387 ms
 Execution Time: 5309.478 ms
(40 rows)

Ho aggiunto gli indici che avevo precedentemente ideato per aiutare in questo caso (Slow Page Loads on User Profiles - #2 by Ghan) e questo ha ridotto notevolmente i tempi di caricamento.

Piano di query dopo l’aggiunta degli indici:

                                                                                                 QUERY PLAN                                                    
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1004.13..25136.02 rows=6 width=1473) (actual time=84.592..96.171 rows=6 loops=1)
   ->  Gather Merge  (cost=1004.13..9649737.42 rows=2399 width=1473) (actual time=84.591..96.168 rows=6 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Nested Loop  (cost=4.11..9648460.49 rows=1000 width=1473) (actual time=60.452..60.471 rows=4 loops=3)
               ->  Nested Loop  (cost=0.87..9619828.27 rows=2943 width=977) (actual time=0.115..47.798 rows=6058 loops=3)
                     ->  Parallel Index Scan using index_topic_links_on_clicks_and_created_desc on topic_links  (cost=0.43..6728256.06 rows=403188 width=186) (actual time=0.051..30.428 rows=6058 loops=3)
                           Filter: ((NOT internal) AND (NOT reflection) AND (NOT quote) AND (user_id = 7237))
                           Rows Removed by Filter: 54345
                     ->  Index Scan using posts_pkey on posts  (cost=0.44..7.17 rows=1 width=791) (actual time=0.002..0.002 rows=1 loops=18173)
                           Index Cond: (id = topic_links.post_id)
                           Filter: ((deleted_at IS NULL) AND (user_id = 7237) AND (post_type = ANY ('{1,2,3,4}'::integer[])))
               ->  Index Scan using topics_pkey on topics  (cost=3.24..9.73 rows=1 width=496) (actual time=0.002..0.002 rows=0 loops=18173)
                     Index Cond: (id = topic_links.topic_id)
                     Filter: ((deleted_at IS NULL) AND (deleted_at IS NULL) AND visible AND ((archetype)::text <> 'private_message'::text) AND ((category_id IS NULL) OR (hashed SubPlan 1)))
                     Rows Removed by Filter: 1
                     SubPlan 1
                       ->  Seq Scan on categories  (cost=0.00..2.74 rows=28 width=4) (actual time=0.031..0.049 rows=35 loops=3)
                             Filter: ((NOT read_restricted) OR (id = ANY ('{3,4,5,17,19,25,26,27,28}'::integer[])))
 Planning Time: 1.205 ms
 Execution Time: 96.258 ms
(21 rows)

Ecco gli indici, nel caso qualcuno volesse aggiungerli:

CREATE INDEX index_topic_links_on_clicks_and_created ON public.topic_links USING btree (clicks, created_at);
CREATE INDEX index_posts_on_like_count_and_created ON public.posts USING btree (like_count, created_at);
CREATE INDEX index_topic_links_on_clicks_and_created_desc ON public.topic_links USING btree (clicks DESC, created_at DESC);
CREATE INDEX index_posts_on_like_count_and_created_desc ON public.posts USING btree (like_count DESC, created_at DESC, user_id) WHERE deleted_at IS NULL AND post_number > 1 AND (post_type = ANY ('{1,2,3,4}'::integer[]));
1 Mi Piace

@codinghorror

Qui e qui hai chiesto se quegli indici mancavano. Aggiungerli fa una differenza enorme, quasi notte e giorno, per la velocità di caricamento dei profili. Questi indici dovrebbero essere assolutamente presi in considerazione.

È vero che la cache aggiunta qui: Slow Page Loads on User Profiles - #12 by codinghorror ha fatto una differenza enorme, ma solo nei caricamenti successivi; gli indici migliorano l’esperienza iniziale in generale. Gli utenti non dovrebbero essere costretti ad attendere tempi di caricamento lenti, anche se questi migliorano grazie alla cache per il periodo in cui è attiva. L’obiettivo era migliorare il primo caricamento iniziale (ogni volta che la cache scade) e questo lo realizza in modo eccellente.

4 Mi Piace

Forse dopo le feste possiamo dare un’occhiata a questi indici @sam @falco?

2 Mi Piace

Non è chiaro perché sia necessario un indice su ogni singola riga, ordinata in base al numero di clic o al conteggio dei like, per una pagina utente limitata a un ID utente.

2 Mi Piace

Ad esempio, per quanto riguarda i click, nel nostro caso sembra che il guadagno sia qui:

->  Sort  (cost=725143.93..725146.15 rows=888 width=1464) (actual time=4762.610..4762.615 rows=5 loops=3)
               Sort Key: topic_links.clicks DESC, topic_links.created_at DESC

Questo ordinamento richiede 4 secondi, il che contribuisce al motivo per cui i profili vengono caricati così lentamente. Questa è solo una query tra molte presenti nella pagina del profilo (anche se una delle più costose in termini di velocità). Dopo aver aggiunto l’indice, il pianificatore delle query lo rileva qui, riducendo il tempo di esecuzione della query per questo sottoalbero a circa 30 ms:

 ->  Parallel Index Scan using index_topic_links_on_clicks_and_created_desc on topic_links  (cost=0.43..6728256.06 rows=403188 width=186) (actual time=0.051..30.428 rows=6058 loops=3)

Ovviamente il piano di esecuzione cambia anche in altri modi, ma l’indice viene sicuramente utilizzato qui e la query viene eseguita molto più velocemente.

1 Mi Piace

Ma concettualmente ci interessa solo che i link siano limitati all’utente: perché i nuovi indici proposti non sono limitati agli utenti?

1 Mi Piace