Slow loading on private-message-topic-tracking-state.json

Falco · September 8, 2021, 9:55pm

Oh I see, that will struggle with this query. After updating to latest, can you please extract the query from the mini_profiler and run a explain analyze on it and share with us?

On Meta (RDS db.r6g.large 30GB) it looks like this:

https://explain.depesz.com/s/El78

forkythetoy · September 8, 2021, 10:40pm

I’ll try running explain analyze again, but we haven’t been able to get that query to successfully complete.

Hooksmith · September 8, 2021, 10:46pm

Update: this commit didn’t change anything. Database still roaring at >95% constantly after update and killing all old running queries. At beta4 it used to run at a 20-30% stable cadence for this forum. We already ran an auto-vacuum and schema reindex as well.

How do we disable this feature? It doesn’t seem right to get a larger database instance to resolve this, there seems to be something wrong with how this query is computed based on the fact it ran great before this update.

forkythetoy · September 8, 2021, 11:07pm

Query managed to finish after ~21 minutes

https://explain.depesz.com/s/gs84

Also to note we are on PG 10.17 as it seems like our outputs differ quite a bit.

Hooksmith · September 9, 2021, 11:24am

We monkey-patched the issue by re-routing all */private-messages-all/* routes to just */private-messages/* temporarily. The result is that “all inboxes” has the same content as “Personal”, but at least we don’t need to deal with 100% CPU usage constantly this way.

Monkeypatch code:

# name: discourse-private-messages-perf-hotfix
# version: 0.0.1
# authors: 

# Prepend to override existing routes
Discourse::Application.routes.prepend do
  scope path: nil, constraints: { format: /(json|html|\*\/\*)/ } do
    scope "/topics", username: RouteFormat.username do
      # Reroute all */private-messages-all/* routes to go to */private-messages/* instead (personal messages)
      # Former is expensive, latter is cheap, potentially saves significant database CPU usage
      get "private-messages-all/:username" => "list#private_messages", as: "topics_private_messages_override", defaults: { format: :json }
      get "private-messages-all-sent/:username" => "list#private_messages_sent", as: "topics_private_messages_sent_override", defaults: { format: :json }
      get "private-messages-all-new/:username" => "list#private_messages_new", as: "topics_private_messages_new_override", defaults: { format: :json }
      get "private-messages-all-unread/:username" => "list#private_messages_unread", as: "topics_private_messages_unread_override", defaults: { format: :json }
      get "private-messages-all-archive/:username" => "list#private_messages_archive", as: "topics_private_messages_archive_override", defaults: { format: :json }
    end
  end
end

CPU usage of our forum database after deploying the monkeypatch above:

@tgxworld You’ll want to take another look at whatever the */private-messages-all/* routes are doing, there’s clearly something wrong with how it’s implemented, it’s not efficient enough for large forum databases. Either the implementation is not using the right database index or the query generation is resulting in extremely expensive queries (particular on PSQL 10, not sure about 12/13?).

The current implementation basically crippled our forum, increasing CPU from ~15% to constant 100% and causing slow performance on all other forum features. I don’t see any reason why Personal / group inbox queries take <50ms but the “all inboxes” ones take 20+ minutes to complete.

You can use the analyze dump @forkythetoy posted just above to see the 20m+ run before.

Hooksmith · September 9, 2021, 11:39am

@Falco I just noticed that you merged our topic here, but this actually seems to be about a different endpoint. This bug report is about private-message-topic-tracking-state endpoint, but we’re talking about */private-messages-all/*. This might be causing some of the confusion here, my apologies for that. (I initially linked to this one which might have started the mix-up)

The private-message-topic-tracking-state is fast on our forum so this is not the issue for us.

blattersturm · September 9, 2021, 1:50pm

For us this one is taking ~200-300ms of database time. A bit longer than one may expect but still within normal ranges, yeah.

We’re on Postgres 13, however.

tgxworld · September 10, 2021, 6:09am

@Hooksmith I have a fix in the pipeline

https://github.com/discourse/discourse/pull/14304

tgxworld · September 15, 2021, 6:54am

@Hooksmith @forkythetoy Will you be able to update to PG 13? That is the minimum version that Discourse requires at the moment. It is also harder for me to compare query plans when queries are not executed using the same PG version.

I had to revert this because the performance of the new query differs too much between users.

tgxworld · September 16, 2021, 8:08am

@blattersturm Are you still seeing topic tracking state perf being slow?

blattersturm · September 16, 2021, 8:41am

Unsure, haven’t done an upgrade in a few days now. Any commits to take in to see if anything improved?

tgxworld · September 16, 2021, 9:49am

Nope nothing changed but it you can provide me with an EXPLAIN ANALYZE on the query it’ll be helpful.

tgxworld · October 4, 2021, 1:39am

Note that we revert the all inboxes for now because of the performance concerns.

https://github.com/discourse/discourse/commit/9d5da2b383765becb824a8f3ff3665abc8e527fa

tgxworld · October 18, 2021, 8:00am

This topic was automatically closed after 14 days. New replies are no longer allowed.

Topic		Replies	Views
App/models/topic_tracking_state - `report` - long response times? Dev slow-sql	6	1752	February 26, 2016
Performance improvements on long topics? Support	30	5403	August 20, 2018
Slow SQL query causes homepage to load in 2-4 sec Support	16	1596	July 1, 2018
Private Message Page Performance Dev slow-sql	5	1803	September 13, 2017
Slow Profile Loads with 100GB+ database Feature	43	2860	January 1, 2021

Slow loading on private-message-topic-tracking-state.json

Related topics