Improving Instance Performance (Megatopics, Database Size and Extreme Load)

Hi Discourse Community!

Recently I have been trying to improve the performance of my Discourse installation and clean up a little since my site has been growing exponentially lately.

I have identified two issues, searched about them here but I don’t see a clear answer for some specifics, so I hope that this doesn’t represent much of a bother to anyone.

The first issue I have is the DB Size, which is pretty big. I’m running a 4GB Mem/80GB Disk instance on Digital Ocean and the DB is choking the disk size already. I do believe that eventually I will have to move it (if so, how?) but maybe I’m doing something wrong, following Sam’s Help I exported the following to make it easier:


table_name                  | row_estimate | table_size | index_size | total_size
--------------------------------------------------------------------------
post_timings                | 155307152    | 8004 MB    | 16 GB      | 24 GB
posts                       | 2257277      | 2432 MB    | 4810 MB    | 7242 MB
post_search_data            | 2279749      | 1992 MB    | 769 MB     | 2761 MB
user_actions                | 6549714      | 570 MB     | 2189 MB    | 2759 MB
topic_views                 | 8843734      | 444 MB     | 1494 MB    | 1937 MB
user_visits                 | 569317       | 33 MB      | 1892 MB    | 1925 MB
notifications               | 1482664      | 465 MB     | 914 MB     | 1379 MB
topic_users                 | 4821392      | 500 MB     | 449 MB     | 949 MB
top_topics                  | 47437        | 28 MB      | 773 MB     | 802 MB
user_auth_token_logs        | 1690555      | 515 MB     | 133 MB     | 648 MB
post_actions                | 1610428      | 145 MB     | 367 MB     | 512 MB
post_revisions              | 113187       | 396 MB     | 9312 kB    | 406 MB
topic_links                 | 605525       | 135 MB     | 254 MB     | 389 MB
topics                      | 56970        | 104 MB     | 227 MB     | 331 MB
web_hook_events             | 107760       | 295 MB     | 10 MB      | 306 MB
post_stats                  | 1955191      | 151 MB     | 97 MB      | 248 MB
directory_items             | 13026        | 1312 kB    | 157 MB     | 158 MB
incoming_links              | 812553       | 73 MB      | 82 MB      | 155 MB
post_replies                | 1111686      | 69 MB      | 71 MB      | 139 MB
topic_link_clicks           | 806821       | 54 MB      | 43 MB      | 97 MB
draft_sequences             | 654989       | 36 MB      | 48 MB      | 84 MB
topic_search_data           | 54056        | 40 MB      | 24 MB      | 65 MB
stylesheet_cache            | 901          | 57 MB      | 200 kB     | 57 MB
user_profile_views          | 204427       | 15 MB      | 30 MB      | 46 MB
quoted_posts                | 223337       | 18 MB      | 23 MB      | 41 MB
poll_votes                  | 142349       | 13 MB      | 20 MB      | 32 MB
users                       | 2211         | 2360 kB    | 29 MB      | 32 MB
given_daily_likes           | 252806       | 12 MB      | 16 MB      | 28 MB
scheduler_stats             | 115081       | 17 MB      | 4736 kB    | 21 MB
user_histories              | 30331        | 8848 kB    | 10040 kB   | 18 MB
reviewables                 | 16263        | 10032 kB   | 8344 kB    | 18 MB
optimized_images            | 34463        | 8088 kB    | 10 MB      | 18 MB
post_uploads                | 73123        | 4104 kB    | 13 MB      | 17 MB
uploads                     | 18897        | 5088 kB    | 9080 kB    | 14 MB
email_logs                  | 23224        | 4024 kB    | 9960 kB    | 14 MB
post_custom_fields          | 11043        | 3192 kB    | 9328 kB    | 12 MB
search_logs                 | 68429        | 7480 kB    | 4776 kB    | 12 MB
user_badges                 | 37176        | 2920 kB    | 5008 kB    | 7928 kB
unsubscribe_keys            | 14820        | 3352 kB    | 4480 kB    | 7832 kB
user_auth_tokens            | 5328         | 2536 kB    | 3608 kB    | 6144 kB
reviewable_scores           | 14681        | 3144 kB    | 2768 kB    | 5912 kB
reviewable_histories        | 31482        | 2976 kB    | 2616 kB    | 5592 kB
poll_options                | 20886        | 2560 kB    | 2552 kB    | 5112 kB
skipped_email_logs          | 11164        | 2528 kB    | 2328 kB    | 4856 kB
topic_allowed_users         | 21933        | 1424 kB    | 1872 kB    | 3296 kB
user_uploads                | 19038        | 1040 kB    | 1688 kB    | 2728 kB
user_stats                  | 2211         | 1888 kB    | 160 kB     | 2048 kB
drafts                      | 1324         | 1424 kB    | 368 kB     | 1792 kB
user_custom_fields          | 7467         | 688 kB     | 1064 kB    | 1752 kB
application_requests        | 11244        | 792 kB     | 528 kB     | 1320 kB
topic_tags                  | 10257        | 696 kB     | 528 kB     | 1224 kB
user_associated_accounts    | 670          | 1032 kB    | 184 kB     | 1216 kB
user_profiles               | 2211         | 424 kB     | 720 kB     | 1144 kB
email_tokens                | 3439         | 480 kB     | 528 kB     | 1008 kB
polls                       | 4030         | 520 kB     | 408 kB     | 928 kB
user_search_data            | 2215         | 376 kB     | 520 kB     | 896 kB
topic_custom_fields         | 2738         | 280 kB     | 568 kB     | 848 kB
group_users                 | 4364         | 344 kB     | 448 kB     | 792 kB
plugin_store_rows           | 2090         | 488 kB     | 296 kB     | 784 kB
incoming_referers           | 3779         | 352 kB     | 424 kB     | 776 kB
user_avatars                | 2210         | 208 kB     | 560 kB     | 768 kB
web_crawler_requests        | 1389         | 264 kB     | 440 kB     | 704 kB
group_histories             | 2210         | 272 kB     | 416 kB     | 688 kB
user_emails                 | 2218         | 224 kB     | 376 kB     | 600 kB
user_archived_messages      | 3019         | 240 kB     | 232 kB     | 472 kB
user_options                | 2218         | 384 kB     | 72 kB      | 456 kB
topic_allowed_groups        | 2098         | 128 kB     | 216 kB     | 344 kB
schema_migration_details    | 994          | 192 kB     | 88 kB      | 280 kB
group_mentions              | 933          | 104 kB     | 152 kB     | 256 kB
categories                  | 23           | 96 kB      | 112 kB     | 208 kB
google_user_infos           | 314          | 136 kB     | 72 kB      | 208 kB
theme_fields                | 24           | 168 kB     | 32 kB      | 200 kB
category_users              | 569          | 64 kB      | 136 kB     | 200 kB
javascript_caches           | 8            | 112 kB     | 64 kB      | 176 kB
incoming_domains            | 701          | 80 kB      | 96 kB      | 176 kB
groups                      | 51           | 120 kB     | 48 kB      | 168 kB
category_tag_stats          | 173          | 48 kB      | 104 kB     | 152 kB
tag_search_data             | 109          | 64 kB      | 72 kB      | 136 kB
schema_migrations           | 994          | 88 kB      | 48 kB      | 136 kB
topic_embeds                | 218          | 80 kB      | 56 kB      | 136 kB
badges                      | 51           | 80 kB      | 48 kB      | 128 kB
translation_overrides       | 170          | 72 kB      | 48 kB      | 120 kB
invites                     | 21           | 56 kB      | 64 kB      | 120 kB
user_api_keys               | 4            | 48 kB      | 64 kB      | 112 kB
category_search_data        | 20           | 48 kB      | 64 kB      | 112 kB
tags                        | 109          | 56 kB      | 48 kB      | 104 kB
screened_ip_addresses       | 9            | 48 kB      | 48 kB      | 96 kB
user_second_factors         | 26           | 48 kB      | 48 kB      | 96 kB
oauth2_user_infos           | 4            | 48 kB      | 48 kB      | 96 kB
site_settings               | 165          | 64 kB      | 32 kB      | 96 kB
api_keys                    | 1            | 48 kB      | 48 kB      | 96 kB
category_featured_topics    | 123          | 48 kB      | 48 kB      | 96 kB
screened_emails             | 4            | 48 kB      | 48 kB      | 96 kB
screened_urls               | 1            | 48 kB      | 48 kB      | 96 kB
topic_groups                | 245          | 56 kB      | 32 kB      | 88 kB
muted_users                 | 103          | 40 kB      | 48 kB      | 88 kB
tag_group_permissions       | 11           | 40 kB      | 48 kB      | 88 kB
tag_users                   | 8            | 40 kB      | 48 kB      | 88 kB
child_themes                | 6            | 40 kB      | 48 kB      | 88 kB
category_tags               | 9            | 40 kB      | 48 kB      | 88 kB
topic_timers                | 15           | 40 kB      | 48 kB      | 88 kB
ignored_users               | 10           | 40 kB      | 48 kB      | 88 kB
group_requests              | 0            | 24 kB      | 64 kB      | 88 kB
user_warnings               | 7            | 40 kB      | 48 kB      | 88 kB
email_change_requests       | 64           | 56 kB      | 32 kB      | 88 kB
web_hooks                   | 1            | 72 kB      | 16 kB      | 88 kB
custom_emojis               | 132          | 56 kB      | 32 kB      | 88 kB
color_scheme_colors         | 110          | 56 kB      | 32 kB      | 88 kB
tag_group_memberships       | 192          | 48 kB      | 32 kB      | 80 kB
category_custom_fields      | 17           | 48 kB      | 32 kB      | 80 kB
themes                      | 10           | 48 kB      | 32 kB      | 80 kB
badge_types                 | 3            | 48 kB      | 32 kB      | 80 kB
onceoff_logs                | 39           | 48 kB      | 32 kB      | 80 kB
category_tag_groups         | 8            | 40 kB      | 32 kB      | 72 kB
group_archived_messages     | 56           | 40 kB      | 32 kB      | 72 kB
category_groups             | 3            | 40 kB      | 32 kB      | 72 kB
push_subscriptions          | 12           | 48 kB      | 16 kB      | 64 kB
tag_groups                  | 10           | 48 kB      | 16 kB      | 64 kB
theme_settings              | 6            | 48 kB      | 16 kB      | 64 kB
ar_internal_metadata        | 1            | 48 kB      | 16 kB      | 64 kB
backup_metadata             | 6            | 48 kB      | 16 kB      | 64 kB
user_fields                 | 9            | 48 kB      | 16 kB      | 64 kB
remote_themes               | 7            | 48 kB      | 16 kB      | 64 kB
badge_groupings             | 5            | 48 kB      | 16 kB      | 64 kB
web_hook_event_types        | 10           | 48 kB      | 16 kB      | 64 kB
user_security_keys          | 0            | 8192 bytes | 56 kB      | 64 kB
color_schemes               | 11           | 48 kB      | 16 kB      | 64 kB
permalinks                  | 0            | 24 kB      | 32 kB      | 56 kB
incoming_emails             | 0            | 8192 bytes | 48 kB      | 56 kB
post_action_types           | 8            | 40 kB      | 16 kB      | 56 kB
web_hook_event_types_hooks  | 1            | 40 kB      | 16 kB      | 56 kB
watched_words               | 0            | 24 kB      | 32 kB      | 56 kB
user_exports                | 0            | 24 kB      | 16 kB      | 40 kB
github_user_infos           | 0            | 8192 bytes | 24 kB      | 32 kB
backup_draft_posts          | 0            | 8192 bytes | 24 kB      | 32 kB
categories_web_hooks        | 0            | 16 kB      | 16 kB      | 32 kB
theme_translation_overrides | 0            | 8192 bytes | 24 kB      | 32 kB
single_sign_on_records      | 0            | 8192 bytes | 24 kB      | 32 kB
post_reply_keys             | 0            | 0 bytes    | 24 kB      | 24 kB
backup_draft_topics         | 0            | 0 bytes    | 24 kB      | 24 kB
user_open_ids               | 0            | 8192 bytes | 16 kB      | 24 kB
post_details                | 0            | 8192 bytes | 16 kB      | 24 kB
message_bus                 | 0            | 8192 bytes | 16 kB      | 24 kB
anonymous_users             | 0            | 0 bytes    | 24 kB      | 24 kB
group_custom_fields         | 0            | 8192 bytes | 16 kB      | 24 kB
topic_invites               | 0            | 0 bytes    | 24 kB      | 24 kB
shared_drafts               | 0            | 0 bytes    | 24 kB      | 24 kB
instagram_user_infos        | 0            | 8192 bytes | 8192 bytes | 16 kB
reviewable_claimed_topics   | 0            | 0 bytes    | 16 kB      | 16 kB
user_field_options          | 0            | 8192 bytes | 8192 bytes | 16 kB
embeddable_hosts            | 0            | 8192 bytes | 8192 bytes | 16 kB
invited_groups              | 0            | 0 bytes    | 8192 bytes | 8192 bytes
developers                  | 0            | 0 bytes    | 8192 bytes | 8192 bytes
tags_web_hooks              | 0            | 0 bytes    | 8192 bytes | 8192 bytes
groups_web_hooks            | 0            | 0 bytes    | 8192 bytes | 8192 bytes
badge_posts                 | 0            | 0 bytes    | 0 bytes    | 0 bytes

(on the other hand, the /var/discourse/shared/standalone/postgres_data/base folder is over 47GB)

Can it be reduced somehow or is it logical given the size of some topics? (more details below).

The other issue maaaay be related, I believe, which is that I keep getting the famous " Due to extreme load, this is temporarily being shown to everyone as a logged out user would see it" I read on a post that increasing Unicorn Workers may be an option (never touched that, honestly havent found how to do it or how viable it is without endangering the whole installation). As a note we do have quite some topics with more than 10k answers, so that may be a correlation between DB Size and Performance issues? (Throwing it out, not sure).

I wonder if there is some way of optimizing it (I found this one), either on the very same instance or recurring to something like a HA installation (which I don’t know if it is supported), but I do believe that I’m jumping the gun in my ignorance. I do appreciate any help I could get.

As a note: I was using a swap file when the instance was smaller, I do believe it is disabled now (is there a way of verifying this?).

Thanks and sorry for the n00bness.

5 Likes

That post_timings table is a monster. Is there anything we can do to truncate or “abbreviate” or “summarize” this table without breaking stuff @sam?

6 Likes

Mini-Update: Tried to tweak around the Unicorn Runners Settings but I don’t really know if there is a core-per-worker relation or is like a shared CPU kind of thing like CPU % x Second, is there a best practice for this?

Also, have been reading about PostgreSQL to know how to get around that size but, even though I’m backed up, don’t know how the app will react in the long run if I just go around snipping.

Quick Edit: Removed the Swap File and I’m now analyzing how to proceed with the 10K+ topics, which I use the opportunity to ask how much of a danger they are to the bigger picture (I assume there is some given the recommendation, but if there is something else I would like to learn it if possible).

1 Like

I also have some 10k+ topics, and am also seeing this message every now and then when the site is really busy.

1 Like

I strongly advise you to not override the Discourse defaults, which auto-close topics with more than 10k replies. There’s a reason we have this setting on by default. :scream:

Beyond that, this problem of massive post_timings table is on our radar :satellite: and we are currently brainstorming ways to deal with it, perhaps even in the current 2.5 release cc @sam @eviltrout

2 Likes

I feel like it needs a triple-confirm with the final OK button reading “I'm ok with bad things happening”.

Is anyone making megatopics really work?

We have around 80 megatopics pumping the whole time (128k posts on the largest one).

Some 502 problems sometimes (maybe related to this, I can’t tell), everything goes smoothly since we’ve adjusted the db_shared_buffers parameter to something above the database size.

This is a profoundly bad idea and you should shut those topics down in favor of smaller yearly or seasonal ones. See

3 Likes

I’ve got a site with a handful of >100K post mega topics. I told them that it was a mistake, but they wanted them anyway. They are complaining about performance issues. I’m hoping that I’ll soon be able to change the value back to the 10K default.

It did cause me to learn some stuff about database optimization, so that’s a bonus. :slight_smile:

5 Likes

I know what you mean, but I also have to admit that it can be awkward to split a topic in years, specially because we have a lot of conversation around people.

Imagine a topic about Trump in general or one about the United States in general. In our forum we have a topic about one of our club’s ex-presidents, one for the president nowadays, one for each football team player. You get the thing. It’s easy to split other topics by seasons, but not these ones. Sure, it’s not impossible, but quite impracticable.

Imagine a category about Trump in general or the United States in general.

No one is going to read >10000 posts in a topic straight through. They just aren’t. At some point, you can just take a breath and start anew. I suspect that these topics are more like chat than discussion anyway, and that few people read what happened yesterday, much less last week, last month, or last year.

3 Likes

OK, but my answer to this is

… you are opting into causing yourself considerable pain here for “reasons”.

5 Likes

Here is a data explorer query that mimics the query done getting a page of posts:

-- [params]
-- int :topic_id = 107216
-- int :offset = 10000

SELECT "posts"."id" FROM "posts" 
WHERE ("posts"."deleted_at" IS NULL) 
AND "posts"."topic_id" = :topic_id
AND "posts"."post_type" IN (1,2,3) ORDER BY "posts"."sort_order" ASC LIMIT 20 
OFFSET :offset

Here’s a normal topic:

Limit  (cost=1911.35..1915.38 rows=1 width=8) 

Here is a mega topic:

Limit  (cost=37475.88..37550.83 rows=20 width=8)
4 Likes

:+1:

I believe you guys, it’s just that our users are used like this and it’ll take a bit of effort to get there… but it’s doable.

Related to this, I was just reading @codinghorror post Natural breakpoints or "chapters" for long topics? and thought that this kind of table of contents but for topics only would be a nice solution to be able to gather and have visible a sequence of topics while browsing and replying only in a single one.

Don’t misunderstand, I did that work not to convince you, but to convince my client with the > 120K post topics who is complaining about performance!

My hope is that it’ll help you too. Good luck with that. :wink:

:clinking_glasses:

3 Likes

Very helpful! Thanks for sharing!

We do have concrete near term plans to tame the post_timings table but megatopics will be a source of considerable ongoing pain for many, many years.

:warning: and remember that is after @tgxworld did great work about a year ago to reduce the overall amount of work we do on megatopics, essentially switching to “low power mode” on these topics so they don’t hurt as much.

But don’t kid yourself: megatopics still hit HARD. :boom::boxing_glove:

5 Likes

Following your recommendations and while I try to figure out the other questions (acknowledging that you guys are working on the post_timings table), I’m proceeding to restore the defaults on topic size (while apologizing for my stupidity, why not).

With this in mind, I have some questions that I hope you could answer:

  1. Once the setting is restored, those topics will be closed. New ones will be created. However, is the existence of those older big topics dangerous to the site as a whole? Meaning, should I “split them” X times into smaller topics or as long as they are not active is ok?

  2. I saw the debate that @Paracelsus raised with the commodity for the users. With this in mind, I’m going to ask if it would be possible to have a setting to “Auto-Continue Topics that are closed due to max-topic size”. That is, when a mega-topic is automatically closed, would it be possible for it to open another with the same author, same title (with a number maybe at the end of the topic?) and the only content being a link to the previous one? (While adding the link to the new one on the closed one).

I know that it is kind of convoluted and maybe not wanted for everyone (that’s why I’m rising it as an optional setting) but for high-traffic sites with relevant topics it could be useful, I guess. wdyt?

1 Like

This is like asking “should we encourage people to smoke cigarettes by auto-buying them a new pack when they run out?” :wink:

Not unless they get a lot of traffic, which they should not if they are closed, I would think? See how it goes but stopping the bleeding of the active mega topics is step zero, so you are on a good path here :+1:

3 Likes

Just a follow-up… We’re trying now to deal with these mega-topics by splitting them in chunks of 10k post each. However, the system doesn’t seem to be willing to cope with our strategy :sweat_smile::

I get a “502 Bad Gateway” when selecting at least 8000 posts at a time and moving them to a new topic (haven’t tried with smaller amounts of posts). Is there a way to increase the capacity or another/better way to accomplish this? @codinghorror @pfaffman

The solution is to do it at the rails console, but I don’t know offhand how to do that.

Something like

old_topic=1
new_topic=2
Posts.where(topic_id: old_topic).where("post_number > 10000 and post_number < 20000").update_all(topic_id: new_topic, post_number=...)

but . . . . I think that you might have to put this in a loop to create the new post numbers. You’ll want to test your solution on a staging site. If you screw something up, you’ll be in a world of hurt.

If that’s not enough for you to figure out how to do it, then you’ll probably need to post in #marketplace. But I bet no one is going to read the old posts anyway and that you should just close them and leave them huge, perhaps even deleting or unlisting them unless you think they have some SEO value.

1 Like