Improving Instance Performance (Megatopics, Database Size and Extreme Load)

Agreed, what is the value of content that nobody reads? And who is actually going to sit down and read 10,000 posts from start to finish? :crazy_face:

It’s OK for some topics to be ephemeral chats which disappear entirely over time, what we’ve previously called “fast lane” versus “slow lane” traffic.

5 Likes

Small Update:

Since we closed everything above the mark (and re-established the limits), as reported and agreed, performance seems to be better, a lot better. We still get the occasional "Due to extreme load, this is temporarily being shown to everyone as a logged out user would see it" message, but considerably less often.

With that said:

  • Thanks for the hard work looking into how to reduce that monster table, really appreciate it.
  • Are there any other “Performance Tips” or even “Setups” that we may be missing? Even if they are “advanced”, any help is appreciated.

Again, a huge thanks to everyone for helping and giving feedback.

3 Likes

Postgres 12 will help as it reduces table and index size by 20% in @falco’s testing.
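
If anyone wants a baseline to compare against after the upgrade, something along these lines will record the current heap and index footprint of post_timings. This is only a rough sketch: psycopg2 and direct database access are assumptions, not part of a stock install, and the connection details are placeholders.

```python
# Rough sketch: record the Postgres version and the heap/index footprint of
# post_timings so the expected ~20% reduction can be checked after upgrading.
# Connection parameters are placeholders for your own setup.
import psycopg2

conn = psycopg2.connect(dbname="discourse", user="discourse",
                        password="...", host="localhost")
with conn.cursor() as cur:
    cur.execute("SHOW server_version;")
    print("Postgres version:", cur.fetchone()[0])
    cur.execute("""
        SELECT pg_size_pretty(pg_relation_size('post_timings')),
               pg_size_pretty(pg_indexes_size('post_timings')),
               pg_size_pretty(pg_total_relation_size('post_timings'));
    """)
    heap, indexes, total = cur.fetchone()
    print(f"post_timings heap={heap} indexes={indexes} total={total}")
conn.close()
```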

3 Likes

Is there a target date for that yet?

1 Like

I started doing benchmarks today.

Gotta let it run for a while here on Meta so we can watch for performance regressions before rolling it out everywhere.

5 Likes

I’m sorry for bringing this one up again, but this is a different issue from the PostgreSQL migration (which I’m not able to do yet due to size).

Since the last rebuild my database size won’t stop increasing:

[screenshot: database size graph]

The last rebuild was on May 17th, and the database has kept growing since, reaching 57.3GB, with most of that in the post_timings table.

My main issue is that I’m trying to do the PostgreSQL update (which will reduce index size by 20% but won’t solve this in the long run). From the staff comments here, that size is not the norm, so it will keep increasing and become nightmarishly expensive. The more time passes, the more it grows, and it becomes harder and harder to maintain. So my main question stands: is there a way to tackle this post_timings thing? Something to delete?

Can I compact tables or something?
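
For reference, this is roughly how I have been checking which tables dominate the space. A minimal sketch only: the SQL is the interesting part, the Python wrapper, psycopg2, and the connection details are just illustrative assumptions.

```python
# Sketch: list the ten largest tables with their heap and index sizes,
# to confirm post_timings is what is actually eating the disk.
# Connection parameters are placeholders for your own setup.
import psycopg2

conn = psycopg2.connect(dbname="discourse", user="discourse",
                        password="...", host="localhost")
with conn.cursor() as cur:
    cur.execute("""
        SELECT relname,
               pg_size_pretty(pg_total_relation_size(relid)) AS total,
               pg_size_pretty(pg_relation_size(relid))       AS heap,
               pg_size_pretty(pg_indexes_size(relid))        AS indexes
        FROM pg_statio_user_tables
        ORDER BY pg_total_relation_size(relid) DESC
        LIMIT 10;
    """)
    for name, total, heap, indexes in cur.fetchall():
        print(f"{name:<30} total={total:>10} heap={heap:>10} indexes={indexes:>10}")
conn.close()
```

From what I have read, plain VACUUM only marks dead space as reusable inside the table, while VACUUM FULL rewrites the table and returns space to the OS at the cost of an exclusive lock while it runs, so I am not sure it is viable on a live forum.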

Thanks to everyone for their help.

1 Like

There is no way around it at the moment. If your forum is really big, it will have a big timings table.

Meta is a really old instance, but it is mid-sized and has a small 4GB post_timings table. On the other hand, we host one instance that is less than two years old whose post_timings table should be over 100GB by now.

Hosting big forums will cost you more, and there is no way around that today.

Maybe move your database to a standalone $20 droplet (80GB disk) and put the web app on another $10 droplet? $30 in monthly costs for what appears to be quite a big community sounds reasonable.

4 Likes

Thank you very much for your help, @Falco.

Welp, yeah, as I said, I was just asking because there is no magic when it comes to disk space. I’m looking into splitting things up, but then the app server would be too slow performance-wise (long story; I’m taking notes on the tests and will post them here later in case anyone else finds them useful).

I did the test I asked you about regarding restoring from a backup, and I think it could be a good situation to leverage: what I can immediately see is about 30% less disk usage (I’m still running some tests to check that nothing is missing). There is a small issue with this approach, though, so even if the immediate benefits are great it will create some problems, mainly because I can’t tell whether the stored images are just being served from cache or are not working at all (and yes, the backup includes them).

I’m taking notes so that I can update the OP; my idea is to add a short series of notes for people who worry about performance, covering all the things I’ve been modifying so far.

Is applying the “auto-delete replies” timer a good solution to this, technically speaking?

1 Like

Actually, this is a pretty good idea and it solves the usability problem (because, as we said, no one is going to read 10K messages). So the big question is whether that would be taxing on the server and database.

3 Likes

I’m not sure whether a 9000-reply topic with ~8600 of them deleted is good for performance, but somehow I doubt it. What say you @eviltrout?

1 Like

I thought that “hidden” posts were completely removed from the database after some time, but now I see that is probably not the case.
So, performance-wise, my idea can’t solve the problem.

Is there any way to “purge” this data?

Deleted posts are soft deleted. However, they are often indexed so you might notice some improvements when a bunch are removed. I wouldn’t recommend it though. If there’s any way to move to a new topic once one gets large that might serve you well.
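
If you are curious how much soft-deleted content is still sitting in the posts table, a rough sketch like the one below will count it. It assumes the standard Discourse posts table with its deleted_at column, plus psycopg2 and direct database access; the connection details are placeholders.

```python
# Sketch: count soft-deleted posts per topic. A "deleted" post keeps its row;
# deleted_at is simply set, so the data still occupies disk and indexes.
import psycopg2

conn = psycopg2.connect(dbname="discourse", user="discourse",
                        password="...", host="localhost")
with conn.cursor() as cur:
    cur.execute("""
        SELECT topic_id, COUNT(*) AS deleted_posts
        FROM posts
        WHERE deleted_at IS NOT NULL
        GROUP BY topic_id
        ORDER BY deleted_posts DESC
        LIMIT 20;
    """)
    for topic_id, deleted_posts in cur.fetchall():
        print(f"topic {topic_id}: {deleted_posts} soft-deleted posts")
conn.close()
```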

3 Likes

What do you mean by this? Having the database and the web app on separate servers should give you more performance: while there will be some latency between servers, your Unicorn and Postmaster won’t have to compete for CPU and RAM.

Make sure the servers are in the same DO region :stuck_out_tongue:
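
If you do split them, you can sanity-check the extra network hop by timing a trivial query from the web droplet against the database droplet. A rough sketch, where the host and credentials are placeholders:

```python
# Sketch: measure round-trip latency from the web droplet to the database
# droplet by timing a trivial query a hundred times.
import statistics
import time

import psycopg2

conn = psycopg2.connect(dbname="discourse", user="discourse",
                        password="...", host="10.0.0.2")  # placeholder: DB droplet's private IP
samples_ms = []
with conn.cursor() as cur:
    for _ in range(100):
        start = time.perf_counter()
        cur.execute("SELECT 1;")
        cur.fetchone()
        samples_ms.append((time.perf_counter() - start) * 1000)
conn.close()

print(f"median: {statistics.median(samples_ms):.2f} ms, "
      f"p95: {sorted(samples_ms)[94]:.2f} ms")
```

Over the private network inside a single region the round trip is typically a fraction of a millisecond, which is negligible next to the gain of Unicorn and Postgres no longer fighting over the same CPU and RAM.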

3 Likes

Sorry, you are right: that would free up both and give better performance than running everything on a single server (I was comparing against the resources I’m using on a single machine and the tweaks I’ve been applying so far).

That is actually a very good idea. Let me try to solve the “unable to rebuild the data container” issue I have, and that will be my next jump in this journey.

1 Like