Slow loading on private-message-topic-tracking-state.json

Oh I see, that will struggle with this query. After updating to latest, can you please extract the query from the mini_profiler and run a explain analyze on it and share with us?

On Meta (RDS db.r6g.large 30GB) it looks like this:

https://explain.depesz.com/s/El78

1 Like

I’ll try running explain analyze again, but we haven’t been able to get that query to successfully complete.

Update: this commit didn’t change anything. Database still roaring at >95% constantly after update and killing all old running queries. At beta4 it used to run at a 20-30% stable cadence for this forum. We already ran an auto-vacuum and schema reindex as well.

How do we disable this feature? It doesn’t seem right to get a larger database instance to resolve this, there seems to be something wrong with how this query is computed based on the fact it ran great before this update.

Query managed to finish after ~21 minutes

https://explain.depesz.com/s/gs84

Also to note we are on PG 10.17 as it seems like our outputs differ quite a bit.

1 Like

We monkey-patched the issue by re-routing all */private-messages-all/* routes to just */private-messages/* temporarily. The result is that “all inboxes” has the same content as “Personal”, but at least we don’t need to deal with 100% CPU usage constantly this way.

Monkeypatch code:

# name: discourse-private-messages-perf-hotfix
# version: 0.0.1
# authors: 

# Prepend to override existing routes
Discourse::Application.routes.prepend do
  scope path: nil, constraints: { format: /(json|html|\*\/\*)/ } do
    scope "/topics", username: RouteFormat.username do
      # Reroute all */private-messages-all/* routes to go to */private-messages/* instead (personal messages)
      # Former is expensive, latter is cheap, potentially saves significant database CPU usage
      get "private-messages-all/:username" => "list#private_messages", as: "topics_private_messages_override", defaults: { format: :json }
      get "private-messages-all-sent/:username" => "list#private_messages_sent", as: "topics_private_messages_sent_override", defaults: { format: :json }
      get "private-messages-all-new/:username" => "list#private_messages_new", as: "topics_private_messages_new_override", defaults: { format: :json }
      get "private-messages-all-unread/:username" => "list#private_messages_unread", as: "topics_private_messages_unread_override", defaults: { format: :json }
      get "private-messages-all-archive/:username" => "list#private_messages_archive", as: "topics_private_messages_archive_override", defaults: { format: :json }
    end
  end
end

CPU usage of our forum database after deploying the monkeypatch above:

@tgxworld You’ll want to take another look at whatever the */private-messages-all/* routes are doing, there’s clearly something wrong with how it’s implemented, it’s not efficient enough for large forum databases. Either the implementation is not using the right database index or the query generation is resulting in extremely expensive queries (particular on PSQL 10, not sure about 12/13?).

The current implementation basically crippled our forum, increasing CPU from ~15% to constant 100% and causing slow performance on all other forum features. I don’t see any reason why Personal / group inbox queries take <50ms but the “all inboxes” ones take 20+ minutes to complete.

You can use the analyze dump @forkythetoy posted just above to see the 20m+ run before.

1 Like

@Falco I just noticed that you merged our topic here, but this actually seems to be about a different endpoint. This bug report is about private-message-topic-tracking-state endpoint, but we’re talking about */private-messages-all/*. This might be causing some of the confusion here, my apologies for that. (I initially linked to this one which might have started the mix-up)

The private-message-topic-tracking-state is fast on our forum so this is not the issue for us.

For us this one is taking ~200-300ms of database time. A bit longer than one may expect but still within normal ranges, yeah.

We’re on Postgres 13, however.

1 Like

@Hooksmith I have a fix in the pipeline

5 Likes

@Hooksmith @forkythetoy Will you be able to update to PG 13? That is the minimum version that Discourse requires at the moment. It is also harder for me to compare query plans when queries are not executed using the same PG version.

I had to revert this because the performance of the new query differs too much between users.

1 Like

@blattersturm Are you still seeing topic tracking state perf being slow?

Unsure, haven’t done an upgrade in a few days now. Any commits to take in to see if anything improved?

Nope nothing changed but it you can provide me with an EXPLAIN ANALYZE on the query it’ll be helpful.

1 Like

Note that we revert the all inboxes for now because of the performance concerns.

3 Likes

This topic was automatically closed after 14 days. New replies are no longer allowed.