Temps de chargement longs pour la page de résumé utilisateur avec une base de données lente

Bonjour,

Nous rencontrons d’énormes temps de chargement lors de l’accès à la page de résumé d’un utilisateur. Il semble que le chargement du résumé prenne d’autant plus de temps que l’utilisateur est actif sur notre site. Ce qui, oui, semble assez évident, mais la différence de temps de chargement sur notre site est immense.
Ainsi, charger la page de résumé d’un « nouveau » utilisateur (quelques semaines d’activité) prend environ 1 seconde, tandis que charger la page d’un utilisateur ayant une activité quotidienne depuis des années prend jusqu’à 30-40 secondes.

Nous avons donc examiné la base de données et avons été assez impressionnés par la taille de la table topic_views (incluant les index), qui atteint 62 Go.

Voici nos 7 plus grandes tables :

relation total_size table_size index_size reltuples
topic_views 62 Go 17 Go 45 Go 399 215 000
posts 27 Go 22 Go 5 022 Mo 9 123 860
top_topics 15 Go 708 Mo 15 Go 1 602 180
post_search_data 15 Go 12 Go 2 849 Mo 8 181 010
incoming_links 12 Go 5 330 Mo 6 647 Mo 90 008 900
user_actions 3 184 Mo 877 Mo 2 307 Mo 11 872 800
topics 2 117 Mo 1 035 Mo 1 082 Mo 1 613 070

Pour tester cela, nous avons sélectionné l’ID d’un utilisateur dont nous savons que le chargement du résumé prend beaucoup de temps, et avons exécuté un comptage avec cet ID sur la table topic_views. Cette requête a pris presque le même temps que le chargement du résumé de l’utilisateur. La partie topics viewed est donc responsable de ce délai.

Comme expliqué dans des sujets antérieurs, nous avons migré un ancien site basé sur Discourse (v1.2.4 + modifications) vers la version v2.4.0-beta.2, et utilisons maintenant Discourse v2.4.0-beta6. Il pourrait donc rester certaines données résiduelles. Cependant, nous ne savons pas ce qui peut être supprimé.

À ce stade, nous sommes assez incertains quant à la prochaine étape ou aux ajustements à effectuer. Notre installation fonctionne entièrement sur Azure, avec une instance externe d’Azure Database for PostgreSQL, et nous avons appliqué la configuration décrite dans votre fichier postgres.10.template.yml. Nous ne savons même pas si la taille de la table topic_views (en particulier l’index) est « trop grande » par rapport aux autres tables ou si cela est tout à fait normal.

Toute indication, réflexion ou idée est la bienvenue.

Cordialement,
Sascha Hofmann

P.S.
J’ai oublié de mentionner que, à cause de cela, nous rencontrons plusieurs erreurs 502 (affichées directement sur le site, donc pas de page d’erreur nginx ou de passerelle d’application) lors du chargement de /u/<username>/summary.json.

7 « J'aime »

Are you viewing it as admin / staff or as anon? The staff view is far more expensive.

2 « J'aime »

It makes no difference if I’m viewing /u/<username>/summary as admin/staff or with my normal user account. We also have hide user profiles from public enabled so user profiles are not visible for anon (disabled it for testing but makes no difference as topics viewed is also present in summary).

2 « J'aime »

Have you migrated, modified or rebuilt the same database over and over? It’s possible your database needs a vacuum to update the statistics. 30s queries are way out of the normal range we see in Discourse installs (although admittedly your DB is large.)

VACUUM FULL is worth trying if you can accept some downtime.

4 « J'aime »

Also is your database on fast ssd storage with a LOT of ram? That is critical.

1 « J'aime »

In order to know exactly what is the problem:

  1. Edit the app.yml file and add your admin user email to DISCOURSE_DEVELOPER_EMAILS.

  2. Rebuild

  3. Load the slow page while logged in as the user set in step 1.

  4. Share the result of the mini-profiler

Share the contents from this and the expanded window:

6 « J'aime »

We’ve tested it several times but always with a fresh copy of the old dataset. After a couple of weeks and as soon we we’re sure that we’ve reverted the changes of the previous owner successfully, we’ve run the upgrade and migration once.

That’s definitely worth a try but I’m not quite sure how long this will take. But yes, we should consider a full vacuum. Especially as we had some problems with the older database already.

Yep. All of the data is stored onto a server ssd and backed by 40GB RAM (of which ~60% are used overall without plateaus at 100% on higher loads).

1 « J'aime »

Thanks for this advice, I’ll give it a try.
As we use our own docker-image, where app.yml is missing, we should achieve the same behaviour with adding the mail-address(es) to developer_emails in the discourse.conf.

You can go into the rails console and use Developers.create!(user_id: 123) where 123 is your user id.

4 « J'aime »

Unfortunately it’s not working (yes I’ve replaced 123 before :slight_smile:) :

discourse@2d5c3bf04550:~$ rails console --environment=production
Loading production environment (Rails 6.0.0)
irb(main):001:0> Developers.create!(user_id: 123)
Traceback (most recent call last):
        1: from (irb):1
NameError (uninitialized constant Developers)

Another approach:

discourse@2d5c3bf04550:~$ rails runner --environment=production make_me_dev.rb
Traceback (most recent call last):
/var/www/discourse/bin/rails: Bootsnap::LoadPathCache::FallbackScan
	19: from /var/www/discourse/bin/rails:17:in `<main>'
	18: from /var/www/discourse/vendor/bundle/ruby/2.6.0/gems/activesupport-6.0.0/lib/active_support/dependencies.rb:325:in `require'
	17: from /var/www/discourse/vendor/bundle/ruby/2.6.0/gems/activesupport-6.0.0/lib/active_support/dependencies.rb:291:in `load_dependency'
	16: from /var/www/discourse/vendor/bundle/ruby/2.6.0/gems/activesupport-6.0.0/lib/active_support/dependencies.rb:325:in `block in require'
	15: from /var/www/discourse/vendor/bundle/ruby/2.6.0/gems/bootsnap-1.4.4/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:30:in `require'
	14: from /var/www/discourse/vendor/bundle/ruby/2.6.0/gems/bootsnap-1.4.4/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:21:in `require_with_bootsnap_lfi'
	13: from /var/www/discourse/vendor/bundle/ruby/2.6.0/gems/bootsnap-1.4.4/lib/bootsnap/load_path_cache/loaded_features_index.rb:92:in `register'
	12: from /var/www/discourse/vendor/bundle/ruby/2.6.0/gems/bootsnap-1.4.4/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:22:in `block in require_with_bootsnap_lfi'
	11: from /var/www/discourse/vendor/bundle/ruby/2.6.0/gems/bootsnap-1.4.4/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:22:in `require'
	10: from /var/www/discourse/vendor/bundle/ruby/2.6.0/gems/railties-6.0.0/lib/rails/commands.rb:18:in `<main>'
	 9: from /var/www/discourse/vendor/bundle/ruby/2.6.0/gems/railties-6.0.0/lib/rails/command.rb:46:in `invoke'
	 8: from /var/www/discourse/vendor/bundle/ruby/2.6.0/gems/railties-6.0.0/lib/rails/command/base.rb:65:in `perform'
	 7: from /var/www/discourse/vendor/bundle/ruby/2.6.0/gems/thor-0.20.3/lib/thor.rb:387:in `dispatch'
	 6: from /var/www/discourse/vendor/bundle/ruby/2.6.0/gems/thor-0.20.3/lib/thor/invocation.rb:126:in `invoke_command'
	 5: from /var/www/discourse/vendor/bundle/ruby/2.6.0/gems/thor-0.20.3/lib/thor/command.rb:27:in `run'
	 4: from /var/www/discourse/vendor/bundle/ruby/2.6.0/gems/railties-6.0.0/lib/rails/commands/runner/runner_command.rb:42:in `perform'
	 3: from /var/www/discourse/vendor/bundle/ruby/2.6.0/gems/bootsnap-1.4.4/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:52:in `load'
	 2: from /var/www/discourse/vendor/bundle/ruby/2.6.0/gems/bootsnap-1.4.4/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:69:in `rescue in load'
	 1: from /var/www/discourse/vendor/bundle/ruby/2.6.0/gems/bootsnap-1.4.4/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:69:in `load'
make_me_dev.rb:1:in `<main>': uninitialized constant Developers (NameError)
discourse@2d5c3bf04550:~$ cat make_me_dev.rb
Developers.create!(user_id: 123)

I’ll try the discourse.conf - way and provide the requested data as soon as possible.

Thanks for your advice so far!

1 « J'aime »

Well, that is way we always tell people to use the official supported method of installing Discourse, so when help is needed we can help out using standard tools.

3 « J'aime »

Yeah, may it’s because we’re using supervisord to spawn all processes.

Anyway, here is the output you’ve requested (zipped html-content):
_u_x_strom_summary.json-13498.2ms-ProfilingResults.zip (94.5 KB)

This request took 13secs but I had to load the site twice. Again the first attempt took to long so it resulted in a 502 Error. So maybe some thing were cached already.

1 « J'aime »

It took more than 13 seconds to run 51 SQL queries. Your database is seriously either misconfigured or underpowered.

Big offenders
SELECT "posts"."id" AS t0_r0, "posts"."user_id" AS t0_r1, "posts"."topic_id" AS t0_r2, "posts"."post_number" AS t0_r3, "posts"."raw" AS t0_r4, "posts"."cooked" AS t0_r5, "posts"."created_at" AS t0_r6, "posts"."updated_at" AS t0_r7, "posts"."reply_to_post_number" AS t0_r8, "posts"."reply_count" AS t0_r9, "posts"."quote_count" AS t0_r10, "posts"."deleted_at" AS t0_r11, "posts"."off_topic_count" AS t0_r12, "posts"."like_count" AS t0_r13, "posts"."incoming_link_count" AS t0_r14, "posts"."bookmark_count" AS t0_r15, "posts"."avg_time" AS t0_r16, "posts"."score" AS t0_r17, "posts"."reads" AS t0_r18, "posts"."post_type" AS t0_r19, "posts"."sort_order" AS t0_r20, "posts"."last_editor_id" AS t0_r21, "posts"."hidden" AS t0_r22, "posts"."hidden_reason_id" AS t0_r23, "posts"."notify_moderators_count" AS t0_r24, "posts"."spam_count" AS t0_r25, "posts"."illegal_count" AS t0_r26, "posts"."inappropriate_count" AS t0_r27, "posts"."last_version_at" AS t0_r28, "posts"."user_deleted" AS t0_r29, "posts"."reply_to_user_id" AS t0_r30, "posts"."percent_rank" AS t0_r31, "posts"."notify_user_count" AS t0_r32, "posts"."like_score" AS t0_r33, "posts"."deleted_by_id" AS t0_r34, "posts"."edit_reason" AS t0_r35, "posts"."word_count" AS t0_r36, "posts"."version" AS t0_r37, "posts"."cook_method" AS t0_r38, "posts"."wiki" AS t0_r39, "posts"."baked_at" AS t0_r40, "posts"."baked_version" AS t0_r41, "posts"."hidden_at" AS t0_r42, "posts"."self_edits" AS t0_r43, "posts"."reply_quoted" AS t0_r44, "posts"."via_email" AS t0_r45, "posts"."raw_email" AS t0_r46, "posts"."public_version" AS t0_r47, "posts"."action_code" AS t0_r48, "posts"."image_url" AS t0_r49, "posts"."locked_by_id" AS t0_r50, "topics"."id" AS t1_r0, "topics"."title" AS t1_r1, "topics"."last_posted_at" AS t1_r2, "topics"."created_at" AS t1_r3, "topics"."updated_at" AS t1_r4, "topics"."views" AS t1_r5, "topics"."posts_count" AS t1_r6, "topics"."user_id" AS t1_r7, "topics"."last_post_user_id" AS t1_r8, "topics"."reply_count" AS t1_r9, "topics"."featured_user1_id" AS t1_r10, "topics"."featured_user2_id" AS t1_r11, "topics"."featured_user3_id" AS t1_r12, "topics"."avg_time" AS t1_r13, "topics"."deleted_at" AS t1_r14, "topics"."highest_post_number" AS t1_r15, "topics"."image_url" AS t1_r16, "topics"."like_count" AS t1_r17, "topics"."incoming_link_count" AS t1_r18, "topics"."category_id" AS t1_r19, "topics"."visible" AS t1_r20, "topics"."moderator_posts_count" AS t1_r21, "topics"."closed" AS t1_r22, "topics"."archived" AS t1_r23, "topics"."bumped_at" AS t1_r24, "topics"."has_summary" AS t1_r25, "topics"."archetype" AS t1_r26, "topics"."featured_user4_id" AS t1_r27, "topics"."notify_moderators_count" AS t1_r28, "topics"."spam_count" AS t1_r29, "topics"."pinned_at" AS t1_r30, "topics"."score" AS t1_r31, "topics"."percent_rank" AS t1_r32, "topics"."subtype" AS t1_r33, "topics"."slug" AS t1_r34, "topics"."deleted_by_id" AS t1_r35, "topics"."participant_count" AS t1_r36, "topics"."word_count" AS t1_r37, "topics"."excerpt" AS t1_r38, "topics"."pinned_globally" AS t1_r39, "topics"."pinned_until" AS t1_r40, "topics"."fancy_title" AS t1_r41, "topics"."highest_staff_post_number" AS t1_r42, "topics"."featured_link" AS t1_r43, "topics"."reviewable_score" AS t1_r44 FROM "posts" INNER JOIN "topics" ON ("topics"."deleted_at" IS NULL) AND "topics"."id" = "posts"."topic_id" WHERE ("posts"."deleted_at" IS NULL) AND (posts.post_type IN (1,2,3,4)) AND ("topics"."deleted_at" IS NULL) AND (topics.archetype <> 'private_message') AND "topics"."visible" = TRUE AND (topics.category_id IS NULL OR topics.category_id IN (SELECT id FROM categories WHERE NOT read_restricted OR id IN (1,3,6,100,141,202,229,233,235,248,262,594,595,609))) AND "posts"."user_id" = 643476 AND (post_number > 1) ORDER BY posts.like_count DESC, posts.created_at DESC LIMIT 6; 

Took 6248.3 ms

SELECT "topic_links"."id" AS t0_r0, "topic_links"."topic_id" AS t0_r1, "topic_links"."post_id" AS t0_r2, "topic_links"."user_id" AS t0_r3, "topic_links"."url" AS t0_r4, "topic_links"."domain" AS t0_r5, "topic_links"."internal" AS t0_r6, "topic_links"."link_topic_id" AS t0_r7, "topic_links"."created_at" AS t0_r8, "topic_links"."updated_at" AS t0_r9, "topic_links"."reflection" AS t0_r10, "topic_links"."clicks" AS t0_r11, "topic_links"."link_post_id" AS t0_r12, "topic_links"."title" AS t0_r13, "topic_links"."crawled_at" AS t0_r14, "topic_links"."quote" AS t0_r15, "topic_links"."extension" AS t0_r16, "topics"."id" AS t1_r0, "topics"."title" AS t1_r1, "topics"."last_posted_at" AS t1_r2, "topics"."created_at" AS t1_r3, "topics"."updated_at" AS t1_r4, "topics"."views" AS t1_r5, "topics"."posts_count" AS t1_r6, "topics"."user_id" AS t1_r7, "topics"."last_post_user_id" AS t1_r8, "topics"."reply_count" AS t1_r9, "topics"."featured_user1_id" AS t1_r10, "topics"."featured_user2_id" AS t1_r11, "topics"."featured_user3_id" AS t1_r12, "topics"."avg_time" AS t1_r13, "topics"."deleted_at" AS t1_r14, "topics"."highest_post_number" AS t1_r15, "topics"."image_url" AS t1_r16, "topics"."like_count" AS t1_r17, "topics"."incoming_link_count" AS t1_r18, "topics"."category_id" AS t1_r19, "topics"."visible" AS t1_r20, "topics"."moderator_posts_count" AS t1_r21, "topics"."closed" AS t1_r22, "topics"."archived" AS t1_r23, "topics"."bumped_at" AS t1_r24, "topics"."has_summary" AS t1_r25, "topics"."archetype" AS t1_r26, "topics"."featured_user4_id" AS t1_r27, "topics"."notify_moderators_count" AS t1_r28, "topics"."spam_count" AS t1_r29, "topics"."pinned_at" AS t1_r30, "topics"."score" AS t1_r31, "topics"."percent_rank" AS t1_r32, "topics"."subtype" AS t1_r33, "topics"."slug" AS t1_r34, "topics"."deleted_by_id" AS t1_r35, "topics"."participant_count" AS t1_r36, "topics"."word_count" AS t1_r37, "topics"."excerpt" AS t1_r38, "topics"."pinned_globally" AS t1_r39, "topics"."pinned_until" AS t1_r40, "topics"."fancy_title" AS t1_r41, "topics"."highest_staff_post_number" AS t1_r42, "topics"."featured_link" AS t1_r43, "topics"."reviewable_score" AS t1_r44, "posts"."id" AS t2_r0, "posts"."user_id" AS t2_r1, "posts"."topic_id" AS t2_r2, "posts"."post_number" AS t2_r3, "posts"."raw" AS t2_r4, "posts"."cooked" AS t2_r5, "posts"."created_at" AS t2_r6, "posts"."updated_at" AS t2_r7, "posts"."reply_to_post_number" AS t2_r8, "posts"."reply_count" AS t2_r9, "posts"."quote_count" AS t2_r10, "posts"."deleted_at" AS t2_r11, "posts"."off_topic_count" AS t2_r12, "posts"."like_count" AS t2_r13, "posts"."incoming_link_count" AS t2_r14, "posts"."bookmark_count" AS t2_r15, "posts"."avg_time" AS t2_r16, "posts"."score" AS t2_r17, "posts"."reads" AS t2_r18, "posts"."post_type" AS t2_r19, "posts"."sort_order" AS t2_r20, "posts"."last_editor_id" AS t2_r21, "posts"."hidden" AS t2_r22, "posts"."hidden_reason_id" AS t2_r23, "posts"."notify_moderators_count" AS t2_r24, "posts"."spam_count" AS t2_r25, "posts"."illegal_count" AS t2_r26, "posts"."inappropriate_count" AS t2_r27, "posts"."last_version_at" AS t2_r28, "posts"."user_deleted" AS t2_r29, "posts"."reply_to_user_id" AS t2_r30, "posts"."percent_rank" AS t2_r31, "posts"."notify_user_count" AS t2_r32, "posts"."like_score" AS t2_r33, "posts"."deleted_by_id" AS t2_r34, "posts"."edit_reason" AS t2_r35, "posts"."word_count" AS t2_r36, "posts"."version" AS t2_r37, "posts"."cook_method" AS t2_r38, "posts"."wiki" AS t2_r39, "posts"."baked_at" AS t2_r40, "posts"."baked_version" AS t2_r41, "posts"."hidden_at" AS t2_r42, "posts"."self_edits" AS t2_r43, "posts"."reply_quoted" AS t2_r44, "posts"."via_email" AS t2_r45, "posts"."raw_email" AS t2_r46, "posts"."public_version" AS t2_r47, "posts"."action_code" AS t2_r48, "posts"."image_url" AS t2_r49, "posts"."locked_by_id" AS t2_r50 FROM "topic_links" INNER JOIN "topics" ON ("topics"."deleted_at" IS NULL) AND "topics"."id" = "topic_links"."topic_id" INNER JOIN "posts" ON ("posts"."deleted_at" IS NULL) AND "posts"."id" = "topic_links"."post_id" WHERE (posts.post_type IN (1,2,3,4)) AND ("topics"."deleted_at" IS NULL) AND (topics.archetype <> 'private_message') AND "topics"."visible" = TRUE AND (topics.category_id IS NULL OR topics.category_id IN (SELECT id FROM categories WHERE NOT read_restricted OR id IN (1,3,6,100,141,202,229,233,235,248,262,594,595,609))) AND "topic_links"."user_id" = 643476 AND "topic_links"."internal" = FALSE AND "topic_links"."reflection" = FALSE AND "topic_links"."quote" = FALSE ORDER BY clicks DESC, topic_links.created_at DESC LIMIT 6; 

Took 258.3 ms

SELECT acting_user_id, COUNT(*) FROM "user_actions" INNER JOIN "topics" ON ("topics"."deleted_at" IS NULL) AND "topics"."id" = "user_actions"."target_topic_id" INNER JOIN "posts" ON ("posts"."deleted_at" IS NULL) AND "posts"."id" = "user_actions"."target_post_id" WHERE ("topics"."deleted_at" IS NULL) AND (topics.archetype <> 'private_message') AND "topics"."visible" = TRUE AND (topics.category_id IS NULL OR topics.category_id IN (SELECT id FROM categories WHERE NOT read_restricted OR id IN (1,3,6,100,141,202,229,233,235,248,262,594,595,609))) AND "user_actions"."user_id" = 643476 AND "user_actions"."action_type" = 2 GROUP BY "user_actions"."acting_user_id" ORDER BY COUNT(*) DESC LIMIT 6; 

Took 245.5 ms

SELECT user_actions.user_id, COUNT(*) FROM "user_actions" INNER JOIN "topics" ON ("topics"."deleted_at" IS NULL) AND "topics"."id" = "user_actions"."target_topic_id" INNER JOIN "posts" ON ("posts"."deleted_at" IS NULL) AND "posts"."id" = "user_actions"."target_post_id" WHERE ("topics"."deleted_at" IS NULL) AND (topics.archetype <> 'private_message') AND "topics"."visible" = TRUE AND (topics.category_id IS NULL OR topics.category_id IN (SELECT id FROM categories WHERE NOT read_restricted OR id IN (1,3,6,100,141,202,229,233,235,248,262,594,595,609))) AND "user_actions"."action_type" = 2 AND "user_actions"."acting_user_id" = 643476 GROUP BY "user_actions"."user_id" ORDER BY COUNT(*) DESC LIMIT 6; 

Took 277.2 ms

SELECT replies.user_id, COUNT(*) FROM "posts" INNER JOIN "topics" ON ("topics"."deleted_at" IS NULL) AND "topics"."id" = "posts"."topic_id" JOIN posts replies ON posts.topic_id = replies.topic_id AND posts.reply_to_post_number = replies.post_number WHERE ("posts"."deleted_at" IS NULL) AND (posts.post_type IN (1,2,3,4)) AND ("topics"."deleted_at" IS NULL) AND (topics.archetype <> 'private_message') AND "topics"."visible" = TRUE AND (topics.category_id IS NULL OR topics.category_id IN (SELECT id FROM categories WHERE NOT read_restricted OR id IN (1,3,6,100,141,202,229,233,235,248,262,594,595,609))) AND "posts"."user_id" = 643476 AND (replies.user_id <> 643476) GROUP BY replies.user_id ORDER BY COUNT(*) DESC LIMIT 6; 

Took 3064.6 ms

SELECT category_id FROM "posts" INNER JOIN "topics" ON ("topics"."deleted_at" IS NULL) AND "topics"."id" = "posts"."topic_id" WHERE ("posts"."deleted_at" IS NULL) AND (posts.post_type IN (1,2,3,4)) AND ("topics"."deleted_at" IS NULL) AND (topics.archetype <> 'private_message') AND "topics"."visible" = TRUE AND (topics.category_id IS NULL OR topics.category_id IN (SELECT id FROM categories WHERE NOT read_restricted OR id IN (1,3,6,100,141,202,229,233,235,248,262,594,595,609))) AND "posts"."user_id" = 643476 GROUP BY topics.category_id ORDER BY COUNT(*) DESC LIMIT 6; 

Took 2283.3 ms

SELECT category_id, COUNT(*) FROM "posts" INNER JOIN "topics" ON ("topics"."deleted_at" IS NULL) AND "topics"."id" = "posts"."topic_id" WHERE ("posts"."deleted_at" IS NULL) AND (posts.post_type IN (1,2,3,4)) AND ("topics"."deleted_at" IS NULL) AND (topics.archetype <> 'private_message') AND "topics"."visible" = TRUE AND (topics.category_id IS NULL OR topics.category_id IN (SELECT id FROM categories WHERE NOT read_restricted OR id IN (1,3,6,100,141,202,229,233,235,248,262,594,595,609))) AND "posts"."user_id" = 643476 AND (post_number > 1) AND (topics.category_id in (64,13,59,124,327,122)) GROUP BY topics.category_id ORDER BY COUNT(*) DESC; 

Took 811.5 ms

I assume Azure has some sort of database statistics for the hosted service, you gotta check those and evaluate what need to change.

Also check the /sidekiq URL on your install to see if there is no queue contention keeping the database busy.

3 « J'aime »

That’s what we’ve checked first. Our instance has

  • 8 cores (~16.18% utilisation, spikes to 60%)
  • 40GB RAM (~60% used, no spiking)
  • 2244 available IOPS (~2.5% utilisation, spikes to 24%)

As said, we’ve used the configuration described in your postgres.10.template.yml . Everything else is at its default. Seems we should take another look into it.

Sidekiq is looking good at the moment. 0 busy, 0 enqueued, 0 retries, 1 scheduled, 0 dead. Job-list is currently empty.

So to come up with a first conclusion. At first we’ll check the postgresql-config as there are some tweaks we could apply. If it won’t work we should consider to upgrade the postgressql-instance and/or perform a full vacuum.

Thanks for your time so far and have a pleasant weekend!
Kind Regards
Sascha

2 « J'aime »

If you have 40GB RAM on the machine, the defaults in the postgres.10.template.yml are no longer appropriate, as samples/standalone.yml points out:

4 « J'aime »

Hi,

thanks for your reply. There are some limitations using a PostgreSQL - Instance on Azure. You can’t set some settings like effective_cache_size, max_connections, maintenance_worker_mem or shared_buffers.
At the time of writing this support-request we’ve used the “General Purpose” - Tier which gave us the 40GB RAM but only 1GB of shared_buffers and 2.5GB of effective_cache_size.
Since yesterday we’ve switched to “Memory Optimized” which doubled the RAM to 80GB, 2GB shared_buffers and 5GB effective_cache_size.
work_mem, as a setting we are able to adjust, is currrently set to 128MB as we have no temporary file entries in our logs.

The interesting thing was that as soon as we switched the tier, CPU and IOPS utilisation dropped without increasing the vCores and storage. So we definitely had a problem caused by caches and buffers which were to small.

We still have long loading times showing summary-pages of some users (especially staff-users/long-term users) or the about - page but it’s way better than before (~50% of the prior loading times).

Kind Regards

P.S.
For those who are facing similar issues and have more control over their postgresql installation may these small guides could be helpful:
https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
Architecture and Tuning of Memory in PostgreSQL Databases | Severalnines

5 « J'aime »

Good to hear. Next steps would be to see if you can get away with using a 4vCPU Memory-Optimized as a price optimization, or switch to running it yourself.

1 « J'aime »

Oh, we’re running 8 vCores which results in 80GB ram. shared_buffers are at 2GB but should be at 16GB for this tier and vcore- count. But we’re in touch with the Azure Support to get that fixed.

P.S.
We’re already on 16GB share_buffer. I’ve misinterpreted the output of SELECT * FROM pg_settings WHERE name = 'shared_buffers';. Unit is 8kb.

So I think this topic can be closed. I thank you all for your effort and hope we can achieve some better performance in the next days together with the azure support.

P.P.S.
Sorry @riking, I got you wrong. I think we are good to go with 8 vCores instead of decreasing it to 4. May we give it a try as soon as we’ve achieved the performance we’re after.

1 « J'aime »