Database seems inexplicably large

tl;dr: it looks like there are about 10GB of data reported by rake db:stats, but postgres_data is taking up 18GB. Does it seem likely that there are 8GB of indexes taking up the rest of the space, or is there some kind of cleanup that needs to be done?

Longer version of the same question

The postgres data directory is 18GB. A dump of the database is about 5GB.

var/discourse# du -hs shared/data/postgres_data/
18G     shared/data/postgres_data/

Here’s what rake db:stats reveals. I truncated the rows reported in kB (about 10GB).

table_name row_estimate size
post_timings 49643076 3011 MB
posts 2289502 990 MB
user_actions 5506773 915 MB
post_search_data 2136003 875 MB
email_logs 2934363 717 MB
topic_views 3498593 507 MB
topic_users 3714046 448 MB
post_stats 1894296 319 MB
notifications 1178136 296 MB
post_replies 1077331 127 MB
topics 146814 107 MB
post_actions 456903 63 MB
draft_sequences 671770 54 MB
post_revisions 74596 51 MB
user_profile_views 465818 49 MB
topic_search_data 124471 46 MB
topic_links 160437 33 MB
scheduler_stats 168878 29 MB
uploads 82100 22 MB
post_custom_fields 35255 22 MB
top_topics 106478 21 MB
unsubscribe_keys 109936 19 MB
optimized_images 67107 18 MB
user_histories 77643 13 MB
topic_link_clicks 124501 12 MB
1 Like

You may have a restore schema with tables, try list those with psql.

Thanks. Is this what I’m looking for? I don’t think I see a restore schema, but I’m not quite sure what I’m looking for.

                List of relations
Schema Name Type Owner
public api_keys table discourse
public application_requests table discourse
public ar_internal_metadata table discourse
public badge_groupings table discourse
public badge_types table discourse
public badges table discourse
public categories table discourse
public categories_web_hooks table discourse
public category_custom_fields table discourse
public category_featured_topics table discourse
public category_featured_users table discourse
public category_groups table discourse
public category_search_data table discourse
public category_tag_groups table discourse
public category_tags table discourse
public category_users table discourse
public child_themes table discourse
public color_scheme_colors table discourse
public color_schemes table discourse
public custom_emojis table discourse
public developers table discourse
public directory_items table discourse
public draft_sequences table discourse
public drafts table discourse
public email_change_requests table discourse
public email_logs table discourse
public email_tokens table discourse
public embeddable_hosts table discourse
public facebook_user_infos table discourse
public github_user_infos table discourse
public given_daily_likes table discourse
public google_user_infos table discourse
public group_archived_messages table discourse
public group_custom_fields table discourse
public group_histories table discourse
public group_mentions table discourse
public group_users table discourse
public groups table discourse
public groups_web_hooks table discourse
public incoming_domains table discourse
public incoming_emails table discourse
public incoming_links table discourse
public incoming_referers table discourse
public instagram_user_infos table discourse
public invited_groups table discourse
public invites table discourse
public message_bus table discourse
public muted_users table discourse
public notifications table discourse
public oauth2_user_infos table discourse
public onceoff_logs table discourse
public optimized_images table discourse
public permalinks table discourse
public plugin_store_rows table discourse
public post_action_types table discourse
public post_actions table discourse
public post_custom_fields table discourse
public post_details table discourse
public post_replies table discourse
public post_revisions table discourse
public post_search_data table discourse
public post_stats table discourse
public post_timings table discourse
public post_uploads table discourse
public posts table discourse
public queued_posts table discourse
public quoted_posts table discourse
public remote_themes table discourse
public scheduler_stats table discourse
public schema_migration_details table discourse
public schema_migrations table discourse
public screened_emails table discourse
public screened_ip_addresses table discourse
public screened_urls table discourse
public search_logs table discourse
public single_sign_on_records table discourse
public site_settings table discourse
public stylesheet_cache table discourse
public tag_group_memberships table discourse
public tag_groups table discourse
public tag_search_data table discourse
public tag_users table discourse
public tags table discourse
public theme_fields table discourse
public themes table discourse
public top_topics table discourse
public topic_allowed_groups table discourse
public topic_allowed_users table discourse
public topic_custom_fields table discourse
public topic_embeds table discourse
public topic_invites table discourse
public topic_link_clicks table discourse
public topic_links table discourse
public topic_search_data table discourse
public topic_tags table discourse
public topic_timers table discourse
public topic_users table discourse
public topic_views table discourse
public topics table discourse
public translation_overrides table discourse
public twitter_user_infos table discourse
public unsubscribe_keys table discourse
public uploads table discourse
public user_actions table discourse
public user_api_keys table discourse
public user_archived_messages table discourse
public user_auth_token_logs table discourse
public user_auth_tokens table discourse
public user_avatars table discourse
public user_badges table discourse
public user_custom_fields table discourse
public user_emails table discourse
public user_exports table discourse
public user_field_options table discourse
public user_fields table discourse
public user_histories table discourse
public user_open_ids table discourse
public user_options table discourse
public user_profile_views table discourse
public user_profiles table discourse
public user_search_data table discourse
public user_stats table discourse
public user_visits table discourse
public user_warnings table discourse
public users table discourse
public versions table discourse
public watched_words table discourse
public web_hook_event_types table discourse
public web_hook_event_types_hooks table discourse
public web_hook_events table discourse
public web_hooks table discourse

(131 rows)

1 Like

Is that giant inline table really necessary?

1 Like

Our Discourse post-timings table size is about 17GB. Is this normal?

table_name row_estimate table_size index_size total_size
post_timings 120967448 6676 MB 10 GB 17 GB
posts 3616224 2413 MB 3219 MB 5632 MB
user_actions 15159571 1215 MB 3699 MB 4914 MB
post_search_data 3531969 3756 MB 1031 MB 4787 MB
notifications 2241273 687 MB 824 MB 1510 MB
topic_views 7696258 369 MB 1110 MB 1478 MB
post_actions 3475097 350 MB 618 MB 968 MB
topic_users 3343623 365 MB 257 MB 622 MB
user_auth_token_logs 1123545 410 MB 110 MB 520 MB
post_stats 2779471 354 MB 155 MB 509 MB
incoming_links 3154654 219 MB 251 MB 469 MB
topics 64301 142 MB 286 MB 429 MB
post_replies 2327812 220 MB 120 MB 340 MB
user_visits 289994 17 MB 229 MB 246 MB
stylesheet_cache 4605 204 MB 840 kB 205 MB
top_topics 40202 19 MB 148 MB 167 MB
user_profile_views 593591 61 MB 97 MB 158 MB
post_revisions 102832 87 MB 8072 kB 95 MB

Yes, that is the largest table in any active Discourse site, with one of the indexes on it taking most of the space.

We will rollout an update to PostgreSQL this year that will make it take half that space, so stay tuned :wink:

8 Likes