100GB以上のデータベースでプロファイル読み込みが遅い

大規模インスタンスのチューニングに関する他のトピックもいくつかあります。大規模な PostgreSQL を運用する原則は MySQL と同じですが、細部にこそ落とし穴があります。

すでにインターネットやメタ上でこの件について検索しています。役立つと思われるトピックがあれば、ぜひ教えてください。

現在の app.yml には、以下の行がコメントアウトされています:

これらの行のコメントを解除し、値を適切に増やしてください。その後、再ビルドが必要です。

「いいね!」 3

これを実行し、コンテナを再構築しました。再インデックスを再度実行しようとしていますが、またフリーズしているようです。以下がその二つの出力です:

work_mem を上げる必要があるかどうかを確認する方法はありますか?ディスクへのアクセスがかなり激しい様子が見られるため、ファイルシステムでのソートなどが行われているのではないかと考えていますが、これが正しいのか、それとも別の問題が起きているのかを確認する方法がわかりません。

「いいね!」 1

OP で挙げたクエリを使用してください。それらを psql シェルで実行する際に先頭に EXPLAIN ANALYZE を付けて実行し、その出力結果をここに貼り付けてください。

「いいね!」 2

Looks like there’s some caching at work so subsequent loads tend to be faster, but > 5 seconds is still brutal when the cache expires.

                                                                                             QUERY PLAN                          
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=26396.41..26397.10 rows=6 width=1278) (actual time=90.879..114.499 rows=6 loops=1)
   ->  Gather Merge  (cost=26396.41..26448.85 rows=456 width=1278) (actual time=90.877..114.496 rows=6 loops=1)
         Workers Planned: 1
         Workers Launched: 1
         ->  Sort  (cost=25396.40..25397.54 rows=456 width=1278) (actual time=87.195..87.197 rows=5 loops=2)
               Sort Key: posts.like_count DESC, posts.created_at DESC
               Sort Method: top-N heapsort  Memory: 38kB
               Worker 0:  Sort Method: top-N heapsort  Memory: 39kB
               ->  Nested Loop  (cost=103.30..25388.23 rows=456 width=1278) (actual time=3.200..83.497 rows=3610 loops=2)
                     ->  Parallel Bitmap Heap Scan on posts  (cost=99.06..9509.95 rows=1356 width=783) (actual time=3.075..56.880 rows=5932 loops=2)
                           Recheck Cond: (user_id = 27510)
                           Filter: ((deleted_at IS NULL) AND (post_number > 1) AND (post_type = ANY ('{1,2,3,4}'::integer[])))
                           Rows Removed by Filter: 1071
                           Heap Blocks: exact=6272
                           ->  Bitmap Index Scan on index_posts_on_user_id_and_created_at  (cost=0.00..98.48 rows=2389 width=0) (actual time=3.916..3.917 rows=20157 loops=1)
                                 Index Cond: (user_id = 27510)
                     ->  Index Scan using topics_pkey on topics  (cost=4.24..11.71 rows=1 width=495) (actual time=0.004..0.004 rows=1 loops=11864)
                           Index Cond: (id = posts.topic_id)
                           Filter: ((deleted_at IS NULL) AND (deleted_at IS NULL) AND visible AND ((archetype)::text <> 'private_message'::text) AND ((category_id IS NULL) OR (hashed SubPlan 1)))
                           Rows Removed by Filter: 0
                           SubPlan 1
                             ->  Seq Scan on categories  (cost=0.00..3.74 rows=28 width=4) (actual time=0.013..0.029 rows=35 loops=2)
                                   Filter: ((NOT read_restricted) OR (id = ANY ('{3,4,5,17,19,25,26,27,28}'::integer[])))
 Planning Time: 2.535 ms
 Execution Time: 114.657 ms
(25 rows)
                                                                                          QUERY PLAN                             
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=25004.87..25004.87 rows=1 width=1464) (actual time=98.136..121.987 rows=6 loops=1)
   ->  Sort  (cost=25004.87..25004.87 rows=1 width=1464) (actual time=98.134..121.984 rows=6 loops=1)
         Sort Key: topic_links.clicks DESC, topic_links.created_at DESC
         Sort Method: top-N heapsort  Memory: 55kB
         ->  Nested Loop  (cost=1103.75..25004.86 rows=1 width=1464) (actual time=6.763..118.114 rows=3443 loops=1)
               ->  Gather  (cost=1099.51..24993.34 rows=1 width=969) (actual time=6.464..88.294 rows=9130 loops=1)
                     Workers Planned: 1
                     Workers Launched: 1
                     ->  Nested Loop  (cost=99.51..23993.24 rows=1 width=969) (actual time=3.151..73.939 rows=4565 loops=2)
                           ->  Parallel Bitmap Heap Scan on posts  (cost=99.08..9506.46 rows=1405 width=783) (actual time=3.032..43.325 rows=7003 loops=2)
                                 Recheck Cond: (user_id = 27510)
                                 Filter: ((deleted_at IS NULL) AND (post_type = ANY ('{1,2,3,4}'::integer[])))
                                 Heap Blocks: exact=5953
                                 ->  Bitmap Index Scan on index_posts_on_user_id_and_created_at  (cost=0.00..98.48 rows=2389 width=0) (actual time=3.790..3.791 rows=20157 loops=1)
                                       Index Cond: (user_id = 27510)
                           ->  Index Scan using index_topic_links_on_post_id on topic_links  (cost=0.43..10.30 rows=1 width=186) (actual time=0.003..0.004 rows=1 loops=14006)
                                 Index Cond: (post_id = posts.id)
                                 Filter: ((NOT internal) AND (NOT reflection) AND (NOT quote) AND (user_id = 27510))
                                 Rows Removed by Filter: 0
               ->  Index Scan using topics_pkey on topics  (cost=4.24..11.52 rows=1 width=495) (actual time=0.003..0.003 rows=0 loops=9130)
                     Index Cond: (id = topic_links.topic_id)
                     Filter: ((deleted_at IS NULL) AND (deleted_at IS NULL) AND visible AND ((archetype)::text <> 'private_message'::text) AND ((category_id IS NULL) OR (hashed SubPlan 1)))
                     Rows Removed by Filter: 1
                     SubPlan 1
                       ->  Seq Scan on categories  (cost=0.00..3.74 rows=28 width=4) (actual time=0.009..0.022 rows=35 loops=1)
                             Filter: ((NOT read_restricted) OR (id = ANY ('{3,4,5,17,19,25,26,27,28}'::integer[])))
 Planning Time: 1.102 ms
 Execution Time: 122.098 ms
(28 rows)
SELECT "posts"."id" AS t0_r0, "posts"."user_id" AS t0_r1, "posts"."topic_id" AS t0_r2, "posts"."post_number" AS t0_r3, "posts"."raw" AS t0_r4, "posts"."cooked" AS t0_r5, "posts"."created_at" AS t0_r6, "posts"."updated_at" AS t0_r7, "posts"."reply_to_post_number" AS t0_r8, "posts"."reply_count" AS t0_r9, "posts"."quote_count" AS t0_r10, "posts"."deleted_at" AS t0_r11, "posts"."off_topic_count" AS t0_r12, "posts"."like_count" AS t0_r13, "posts"."incoming_link_count" AS t0_r14, "posts"."bookmark_count" AS t0_r15, "posts"."score" AS t0_r16, "posts"."reads" AS t0_r17, "posts"."post_type" AS t0_r18, "posts"."sort_order" AS t0_r19, "posts"."last_editor_id" AS t0_r20, "posts"."hidden" AS t0_r21, "posts"."hidden_reason_id" AS t0_r22, "posts"."notify_moderators_count" AS t0_r23, "posts"."spam_count" AS t0_r24, "posts"."illegal_count" AS t0_r25, "posts"."inappropriate_count" AS t0_r26, "posts"."last_version_at" AS t0_r27, "posts"."user_deleted" AS t0_r28, "posts"."reply_to_user_id" AS t0_r29, "posts"."percent_rank" AS t0_r30, "posts"."notify_user_count" AS t0_r31, "posts"."like_score" AS t0_r32, "posts"."deleted_by_id" AS t0_r33, "posts"."edit_reason" AS t0_r34, "posts"."word_count" AS t0_r35, "posts"."version" AS t0_r36, "posts"."cook_method" AS t0_r37, "posts"."wiki" AS t0_r38, "posts"."baked_at" AS t0_r39, "posts"."baked_version" AS t0_r40, "posts"."hidden_at" AS t0_r41, "posts"."self_edits" AS t0_r42, "posts"."reply_quoted" AS t0_r43, "posts"."via_email" AS t0_r44, "posts"."raw_email" AS t0_r45, "posts"."public_version" AS t0_r46, "posts"."action_code" AS t0_r47, "posts"."locked_by_id" AS t0_r48, "posts"."image_upload_id" AS t0_r49, "topics"."id" AS t1_r0, "topics"."title" AS t1_r1, "topics"."last_posted_at" AS t1_r2, "topics"."created_at" AS t1_r3, "topics"."updated_at" AS t1_r4, "topics"."views" AS t1_r5, "topics"."posts_count" AS t1_r6, "topics"."user_id" AS t1_r7, "topics"."last_post_user_id" AS t1_r8, "topics"."reply_count" AS t1_r9, "topics"."featured_user1_id" AS t1_r10, "topics"."featured_user2_id" AS t1_r11, "topics"."featured_user3_id" AS t1_r12, "topics"."deleted_at" AS t1_r13, "topics"."highest_post_number" AS t1_r14, "topics"."like_count" AS t1_r15, "topics"."incoming_link_count" AS t1_r16, "topics"."category_id" AS t1_r17, "topics"."visible" AS t1_r18, "topics"."moderator_posts_count" AS t1_r19, "topics"."closed" AS t1_r20, "topics"."archived" AS t1_r21, "topics"."bumped_at" AS t1_r22, "topics"."has_summary" AS t1_r23, "topics"."archetype" AS t1_r24, "topics"."featured_user4_id" AS t1_r25, "topics"."notify_moderators_count" AS t1_r26, "topics"."spam_count" AS t1_r27, "topics"."pinned_at" AS t1_r28, "topics"."score" AS t1_r29, "topics"."percent_rank" AS t1_r30, "topics"."subtype" AS t1_r31, "topics"."slug" AS t1_r32, "topics"."deleted_by_id" AS t1_r33, "topics"."participant_count" AS t1_r34, "topics"."word_count" AS t1_r35, "topics"."excerpt" AS t1_r36, "topics"."pinned_globally" AS t1_r37, "topics"."pinned_until" AS t1_r38, "topics"."fancy_title" AS t1_r39, "topics"."highest_staff_post_number" AS t1_r40, "topics"."featured_link" AS t1_r41, "topics"."reviewable_score" AS t1_r42, "topics"."image_upload_id" AS t1_r43, "topics"."slow_mode_seconds" AS t1_r44 FROM "posts" INNER JOIN "topics" ON "topics"."id" = "posts"."topic_id" AND ("topics"."deleted_at" IS NULL) WHERE ("posts"."deleted_at" IS NULL) AND (posts.post_type IN (1,2,3,4)) AND ("topics"."deleted_at" IS NULL) AND (topics.archetype <> 'private_message') AND "topics"."visible" = TRUE AND (topics.category_id IS NULL OR topics.category_id IN (SELECT id FROM categories WHERE NOT read_restricted OR id IN (3,4,5,17,19,25,26,27,28))) AND "posts"."user_id" = 27510 AND (post_number > 1) ORDER BY posts.like_count DESC, posts.created_at DESC LIMIT 6;
SELECT "topic_links"."id" AS t0_r0, "topic_links"."topic_id" AS t0_r1, "topic_links"."post_id" AS t0_r2, "topic_links"."user_id" AS t0_r3, "topic_links"."url" AS t0_r4, "topic_links"."domain" AS t0_r5, "topic_links"."internal" AS t0_r6, "topic_links"."link_topic_id" AS t0_r7, "topic_links"."created_at" AS t0_r8, "topic_links"."updated_at" AS t0_r9, "topic_links"."reflection" AS t0_r10, "topic_links"."clicks" AS t0_r11, "topic_links"."link_post_id" AS t0_r12, "topic_links"."title" AS t0_r13, "topic_links"."crawled_at" AS t0_r14, "topic_links"."quote" AS t0_r15, "topic_links"."extension" AS t0_r16, "topics"."id" AS t1_r0, "topics"."title" AS t1_r1, "topics"."last_posted_at" AS t1_r2, "topics"."created_at" AS t1_r3, "topics"."updated_at" AS t1_r4, "topics"."views" AS t1_r5, "topics"."posts_count" AS t1_r6, "topics"."user_id" AS t1_r7, "topics"."last_post_user_id" AS t1_r8, "topics"."reply_count" AS t1_r9, "topics"."featured_user1_id" AS t1_r10, "topics"."featured_user2_id" AS t1_r11, "topics"."featured_user3_id" AS t1_r12, "topics"."deleted_at" AS t1_r13, "topics"."highest_post_number" AS t1_r14, "topics"."like_count" AS t1_r15, "topics"."incoming_link_count" AS t1_r16, "topics"."category_id" AS t1_r17, "topics"."visible" AS t1_r18, "topics"."moderator_posts_count" AS t1_r19, "topics"."closed" AS t1_r20, "topics"."archived" AS t1_r21, "topics"."bumped_at" AS t1_r22, "topics"."has_summary" AS t1_r23, "topics"."archetype" AS t1_r24, "topics"."featured_user4_id" AS t1_r25, "topics"."notify_moderators_count" AS t1_r26, "topics"."spam_count" AS t1_r27, "topics"."pinned_at" AS t1_r28, "topics"."score" AS t1_r29, "topics"."percent_rank" AS t1_r30, "topics"."subtype" AS t1_r31, "topics"."slug" AS t1_r32, "topics"."deleted_by_id" AS t1_r33, "topics"."participant_count" AS t1_r34, "topics"."word_count" AS t1_r35, "topics"."excerpt" AS t1_r36, "topics"."pinned_globally" AS t1_r37, "topics"."pinned_until" AS t1_r38, "topics"."fancy_title" AS t1_r39, "topics"."highest_staff_post_number" AS t1_r40, "topics"."featured_link" AS t1_r41, "topics"."reviewable_score" AS t1_r42, "topics"."image_upload_id" AS t1_r43, "topics"."slow_mode_seconds" AS t1_r44, "posts"."id" AS t2_r0, "posts"."user_id" AS t2_r1, "posts"."topic_id" AS t2_r2, "posts"."post_number" AS t2_r3, "posts"."raw" AS t2_r4, "posts"."cooked" AS t2_r5, "posts"."created_at" AS t2_r6, "posts"."updated_at" AS t2_r7, "posts"."reply_to_post_number" AS t2_r8, "posts"."reply_count" AS t2_r9, "posts"."quote_count" AS t2_r10, "posts"."deleted_at" AS t2_r11, "posts"."off_topic_count" AS t2_r12, "posts"."like_count" AS t2_r13, "posts"."incoming_link_count" AS t2_r14, "posts"."bookmark_count" AS t2_r15, "posts"."score" AS t2_r16, "posts"."reads" AS t2_r17, "posts"."post_type" AS t2_r18, "posts"."sort_order" AS t2_r19, "posts"."last_editor_id" AS t2_r20, "posts"."hidden" AS t2_r21, "posts"."hidden_reason_id" AS t2_r22, "posts"."notify_moderators_count" AS t2_r23, "posts"."spam_count" AS t2_r24, "posts"."illegal_count" AS t2_r25, "posts"."inappropriate_count" AS t2_r26, "posts"."last_version_at" AS t2_r27, "posts"."user_deleted" AS t2_r28, "posts"."reply_to_user_id" AS t2_r29, "posts"."percent_rank" AS t2_r30, "posts"."notify_user_count" AS t2_r31, "posts"."like_score" AS t2_r32, "posts"."deleted_by_id" AS t2_r33, "posts"."edit_reason" AS t2_r34, "posts"."word_count" AS t2_r35, "posts"."version" AS t2_r36, "posts"."cook_method" AS t2_r37, "posts"."wiki" AS t2_r38, "posts"."baked_at" AS t2_r39, "posts"."baked_version" AS t2_r40, "posts"."hidden_at" AS t2_r41, "posts"."self_edits" AS t2_r42, "posts"."reply_quoted" AS t2_r43, "posts"."via_email" AS t2_r44, "posts"."raw_email" AS t2_r45, "posts"."public_version" AS t2_r46, "posts"."action_code" AS t2_r47, "posts"."locked_by_id" AS t2_r48, "posts"."image_upload_id" AS t2_r49 FROM "topic_links" INNER JOIN "topics" ON "topics"."id" = "topic_links"."topic_id" AND ("topics"."deleted_at" IS NULL) INNER JOIN "posts" ON "posts"."id" = "topic_links"."post_id" AND ("posts"."deleted_at" IS NULL) WHERE "posts"."user_id" = 27510 AND (posts.post_type IN (1,2,3,4)) AND ("topics"."deleted_at" IS NULL) AND (topics.archetype <> 'private_message') AND "topics"."visible" = TRUE AND (topics.category_id IS NULL OR topics.category_id IN (SELECT id FROM categories WHERE NOT read_restricted OR id IN (3,4,5,17,19,25,26,27,28))) AND "topic_links"."user_id" = 27510 AND "topic_links"."internal" = FALSE AND "topic_links"."reflection" = FALSE AND "topic_links"."quote" = FALSE ORDER BY clicks DESC, topic_links.created_at DESC LIMIT 6; 

どちらも一時ファイル用にメモリを使用しており、実行時間は十分良好です。もしかして安定化しているのでしょうか?

この分析はキャッシュされたデータを参照している可能性があります。サイト内のサマリーページでは、はるかに長い実行時間が表示されています:

キャッシュが期限切れになるまで待ってから、後で再度分析を実行してみます。

reindex は 1 回だけ実行すれば十分です。インデックスが v13 のディスク形式になった後は、その形式のまま維持されます。

はい、VACUUM に ANALYZE が含まれていない場合、インデックスの活用ができていません。

「いいね!」 2

なるほど、このクエリは指定されたユーザーの「いいね」数が最も多い投稿を取得しているのですね。

ここでの見積もりは大きく外れていました。プランナーは 2,389 行と見積もっていたのに、実際には 20,157 行取得しています。

その一段上でも、5 倍もの誤差があります。

こうした誤見積もりが、クエリのパフォーマンスを著しく低下させる原因となります。

「いいね!」 2

ええと、私の知る限り、v13 ディスク形式以外の形式になることはなかったはずです。別のフォーラムソフトウェアからデータをインポートし、すでに v13 を実行していた新規の Discourse インストールに読み込みました。現在は、インポート後にすべてのインデックスが適切に再構築されているか確認しているところです。

なるほど。maintenance_work_mem を増やした後に再インデックスを再度実行しようとしています。postmaster が以前よりも多くのメモリを使用しているようですが、合理的な時間内に完了するかどうかはまだわかりません。

「いいね!」 2

再インデックスが停止しているのは、現在実行中のバッジジョブのようです。

2 時間以上も実行され続けているなんて、不自然ですね。

おそらく、これもまだ実行中の週次ジョブが原因かもしれません。

データベースの再インデックス中に Web サーバーを停止するには、以下を実行してください。

sv stop unicorn
「いいね!」 1

さて、インデックスの再構築がようやく合理的な時間内に完了しました。いくつかのインデックスは再構築できずスキップされたとのことで、その理由についての詳細はありませんでした。現在のテーブル統計は以下の通りです:

プロフィールの読み込み問題は悪化しているようです:

それらのクエリからの EXPLAIN 結果は以下の通りです:

                                                                                                QUERY PLAN                               
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=625609.73..625610.43 rows=6 width=1278) (actual time=2058.211..2154.991 rows=6 loops=1)
   ->  Gather Merge  (cost=625609.73..630928.70 rows=45588 width=1278) (actual time=1654.881..1751.658 rows=6 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Sort  (cost=624609.70..624666.69 rows=22794 width=1278) (actual time=1619.464..1619.469 rows=5 loops=3)
               Sort Key: posts.like_count DESC, posts.created_at DESC
               Sort Method: top-N heapsort  Memory: 37kB
               Worker 0:  Sort Method: top-N heapsort  Memory: 37kB
               Worker 1:  Sort Method: top-N heapsort  Memory: 38kB
               ->  Parallel Hash Join  (cost=63000.65..624201.13 rows=22794 width=1278) (actual time=1383.820..1600.913 rows=20578 loops=3)
                     Hash Cond: (posts.topic_id = topics.id)
                     ->  Parallel Bitmap Heap Scan on posts  (cost=1875.97..562899.73 rows=67322 width=783) (actual time=63.310..264.627 rows=20579 loops=3)
                           Recheck Cond: ((user_id = 7237) AND (deleted_at IS NULL))
                           Filter: ((post_number > 1) AND (post_type = ANY ('{1,2,3,4}'::integer[])))
                           Rows Removed by Filter: 39566
                           Heap Blocks: exact=50390
                           ->  Bitmap Index Scan on idx_posts_user_id_deleted_at  (cost=0.00..1835.58 rows=167352 width=0) (actual time=36.507..36.508 rows=180435 loops=1)
                                 Index Cond: (user_id = 7237)
                     ->  Parallel Hash  (cost=59504.20..59504.20 rows=129638 width=495) (actual time=1319.362..1319.364 rows=131785 loops=3)
                           Buckets: 524288  Batches: 1  Memory Usage: 190560kB
                           ->  Parallel Seq Scan on topics  (cost=3.81..59504.20 rows=129638 width=495) (actual time=316.007..1204.917 rows=131785 loops=3)
                                 Filter: ((deleted_at IS NULL) AND (deleted_at IS NULL) AND visible AND ((archetype)::text <> 'private_message'::text) AND ((category_id IS NULL) OR (hashed SubPlan 1)))
                                 Rows Removed by Filter: 174529
                                 SubPlan 1
                                   ->  Seq Scan on categories  (cost=0.00..3.74 rows=28 width=4) (actual time=17.841..17.887 rows=35 loops=3)
                                         Filter: ((NOT read_restricted) OR (id = ANY ('{3,4,5,17,19,25,26,27,28}'::integer[])))
 Planning Time: 0.751 ms
 JIT:
   Functions: 91
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 11.057 ms, Inlining 134.342 ms, Optimization 762.277 ms, Emission 451.708 ms, Total 1359.384 ms
 Execution Time: 2204.845 ms
(32 rows)
                                                                                             QUERY PLAN                                  
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=726143.96..726144.66 rows=6 width=1464) (actual time=5250.325..5305.144 rows=6 loops=1)
   ->  Gather Merge  (cost=726143.96..726351.17 rows=1776 width=1464) (actual time=4800.064..4854.881 rows=6 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Sort  (cost=725143.93..725146.15 rows=888 width=1464) (actual time=4762.610..4762.615 rows=5 loops=3)
               Sort Key: topic_links.clicks DESC, topic_links.created_at DESC
               Sort Method: top-N heapsort  Memory: 36kB
               Worker 0:  Sort Method: top-N heapsort  Memory: 37kB
               Worker 1:  Sort Method: top-N heapsort  Memory: 32kB
               ->  Nested Loop  (cost=574851.64..725128.02 rows=888 width=1464) (actual time=649.373..4710.853 rows=39385 loops=3)
                     ->  Parallel Hash Join  (cost=574847.40..695445.64 rows=2624 width=969) (actual time=630.974..3847.821 rows=336541 loops=3)
                           Hash Cond: (topic_links.post_id = posts.id)
                           ->  Parallel Bitmap Heap Scan on topic_links  (cost=11248.93..130753.84 rows=416505 width=186) (actual time=58.964..3095.540 rows=336541 loops=3)
                                 Recheck Cond: (user_id = 7237)
                                 Filter: ((NOT internal) AND (NOT reflection) AND (NOT quote))
                                 Rows Removed by Filter: 8
                                 Heap Blocks: exact=23544
                                 ->  Bitmap Index Scan on index_topic_links_on_user_id  (cost=0.00..10999.02 rows=1000879 width=0) (actual time=45.320..45.322 rows=1009648 loops=1)
                                       Index Cond: (user_id = 7237)
                           ->  Parallel Hash  (cost=562726.85..562726.85 rows=69730 width=783) (actual time=571.264..571.266 rows=60145 loops=3)
                                 Buckets: 262144  Batches: 1  Memory Usage: 71264kB
                                 ->  Parallel Bitmap Heap Scan on posts  (cost=1877.42..562726.85 rows=69730 width=783) (actual time=377.440..481.561 rows=60145 loops=3)
                                       Recheck Cond: ((user_id = 7237) AND (deleted_at IS NULL))
                                       Filter: (post_type = ANY ('{1,2,3,4}'::integer[]))
                                       Heap Blocks: exact=141396
                                       ->  Bitmap Index Scan on idx_posts_user_id_deleted_at  (cost=0.00..1835.58 rows=167352 width=0) (actual time=29.225..29.226 rows=180435 loops=1)
                                             Index Cond: (user_id = 7237)
                     ->  Index Scan using topics_pkey on topics  (cost=4.24..11.31 rows=1 width=495) (actual time=0.002..0.002 rows=0 loops=1009624)
                           Index Cond: (id = topic_links.topic_id)
                           Filter: ((deleted_at IS NULL) AND (deleted_at IS NULL) AND visible AND ((archetype)::text <> 'private_message'::text) AND ((category_id IS NULL) OR (hashed SubPlan 1)))
                           Rows Removed by Filter: 1
                           SubPlan 1
                             ->  Seq Scan on categories  (cost=0.00..3.74 rows=28 width=4) (actual time=17.793..17.835 rows=35 loops=3)
                                   Filter: ((NOT read_restricted) OR (id = ANY ('{3,4,5,17,19,25,26,27,28}'::integer[])))
 Planning Time: 4.526 ms
 JIT:
   Functions: 118
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 14.724 ms, Inlining 94.921 ms, Optimization 934.359 ms, Emission 525.383 ms, Total 1569.387 ms
 Execution Time: 5309.478 ms
(40 rows)

以前提案したインデックスを追加してこの問題に対処しました(Slow Page Loads on User Profiles - #2 by Ghan

インデックス追加後のクエリプラン:

                                                                                                 QUERY PLAN                                                    
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1004.13..25136.02 rows=6 width=1473) (actual time=84.592..96.171 rows=6 loops=1)
   ->  Gather Merge  (cost=1004.13..9649737.42 rows=2399 width=1473) (actual time=84.591..96.168 rows=6 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Nested Loop  (cost=4.11..9648460.49 rows=1000 width=1473) (actual time=60.452..60.471 rows=4 loops=3)
               ->  Nested Loop  (cost=0.87..9619828.27 rows=2943 width=977) (actual time=0.115..47.798 rows=6058 loops=3)
                     ->  Parallel Index Scan using index_topic_links_on_clicks_and_created_desc on topic_links  (cost=0.43..6728256.06 rows=403188 width=186) (actual time=0.051..30.428 rows=6058 loops=3)
                           Filter: ((NOT internal) AND (NOT reflection) AND (NOT quote) AND (user_id = 7237))
                           Rows Removed by Filter: 54345
                     ->  Index Scan using posts_pkey on posts  (cost=0.44..7.17 rows=1 width=791) (actual time=0.002..0.002 rows=1 loops=18173)
                           Index Cond: (id = topic_links.post_id)
                           Filter: ((deleted_at IS NULL) AND (user_id = 7237) AND (post_type = ANY ('{1,2,3,4}'::integer[])))
               ->  Index Scan using topics_pkey on topics  (cost=3.24..9.73 rows=1 width=496) (actual time=0.002..0.002 rows=0 loops=18173)
                     Index Cond: (id = topic_links.topic_id)
                     Filter: ((deleted_at IS NULL) AND (deleted_at IS NULL) AND visible AND ((archetype)::text <> 'private_message'::text) AND ((category_id IS NULL) OR (hashed SubPlan 1)))
                     Rows Removed by Filter: 1
                     SubPlan 1
                       ->  Seq Scan on categories  (cost=0.00..2.74 rows=28 width=4) (actual time=0.031..0.049 rows=35 loops=3)
                             Filter: ((NOT read_restricted) OR (id = ANY ('{3,4,5,17,19,25,26,27,28}'::integer[])))
 Planning Time: 1.205 ms
 Execution Time: 96.258 ms
(21 rows)

もし追加したい方がいれば、使用したインデックスは以下の通りです:

CREATE INDEX index_topic_links_on_clicks_and_created ON public.topic_links USING btree (clicks, created_at);
CREATE INDEX index_posts_on_like_count_and_created ON public.posts USING btree (like_count, created_at);
CREATE INDEX index_topic_links_on_clicks_and_created_desc ON public.topic_links USING btree (clicks DESC, created_at DESC);
CREATE INDEX index_posts_on_like_count_and_created_desc ON public.posts USING btree (like_count DESC, created_at DESC, user_id) WHERE deleted_at IS NULL AND post_number > 1 AND (post_type = ANY ('{1,2,3,4}'::integer[]));
「いいね!」 1

@codinghorror

こちらこちらで、それらのインデックスが不足しているかどうかを尋ねていました。これらを追加するだけで、プロフィールの読み込み速度が劇的に改善されます。これらのインデックスの導入はぜひ検討すべきです。

確かに、こちらに追加されたキャッシュ(Slow Page Loads on User Profiles - #12 by codinghorror

「いいね!」 4

@sam @falco、年末年始明けにこれらのインデックスを見てみませんか?

「いいね!」 2

ユーザーIDにスコープされたユーザーページにおいて、クリック数やいいね数の順でソートされた各行にインデックスを付けることがなぜ必要なのか、理解できていません。

「いいね!」 2

クリック数に関しては、私たちのケースではここが改善のポイントのようです。

->  Sort  (cost=725143.93..725146.15 rows=888 width=1464) (actual time=4762.610..4762.615 rows=5 loops=3)
               Sort Key: topic_links.clicks DESC, topic_links.created_at DESC

このソート処理には 4 秒を要しており、プロフィールページの読み込みが遅い原因の一つとなっています。これはプロフィールページで実行される多くのクエリのうちの 1 つに過ぎませんが(速度面でのコストが最も高いものの一つではあります)、インデックスを追加したところ、クエリプランナーがこれを検知し、この部分のクエリ実行時間が約 30ms まで短縮されました。

 ->  Parallel Index Scan using index_topic_links_on_clicks_and_created_desc on topic_links  (cost=0.43..6728256.06 rows=403188 width=186) (actual time=0.051..30.428 rows=6058 loops=3)

もちろん、クエリプランは他の点でも変化していますが、インデックスが確実に参照されており、クエリの実行速度が大幅に向上しています。

「いいね!」 1

概念的には、ユーザーにスコープされたリンクだけが重要であるはずです。なぜ、新しく提案されているインデックスはユーザーにスコープされていないのでしょうか?

「いいね!」 1