Troubleshooting severe performance issues with latest Discourse?

Hi All,

So we upgraded our CentOS 7 server from 2.2.2 to 2.7.0.beta4, and since then we have been facing latency in page loading, especially on pages that involve database or image content, to the point where the site has become unusable.

Any guidance in this regard would be much appreciated.

A bunch of things happened in the past few years. There was a big change that requires reprocessing all of the images. I suspect that your server is slammed doing that work. You can have a look at /sidekiq to see the queue.

How big is your database? How many images? What does sidekiq show? You’re using SSD, right?
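
In case it helps, here is roughly how I'd check those from the host, assuming the standard all-in-one container named app (adjust the container name if yours differs):

# database size and number of uploaded images
docker exec -u postgres app psql discourse -c "SELECT pg_size_pretty(pg_database_size(current_database()));"
docker exec -u postgres app psql discourse -c "SELECT count(*) FROM uploads;"

# rough disk check on the host: ROTA = 0 usually means SSD
lsblk -d -o NAME,ROTA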

6 Likes

It's a VM-based server, so I'm not sure whether it's on SSD or not.
I don't see Sidekiq accessible; since this deployment wasn't done by me, I'm not sure how to access it.

Do you know how it was installed? It sounds like it’s not a standard install (else, /sidekiq would be available to you if you were an admin).
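
If you want to check, a standard install lives under /var/discourse and runs a single container called app; something like this (just a sketch) will tell you quickly:

# a standard install has a launcher script and a containers/app.yml
ls /var/discourse/launcher /var/discourse/containers/app.yml

# and a single running container named "app"
docker ps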

2 Likes

Your best path forward is to investigate why performance is down. There are a lot of background jobs added over the years (image optimization, rebaking, etc.) that are probably now running and using your server resources. Once those complete, performance should improve.

Accessing /sidekiq (using an admin account!) to discover what jobs are running is a great first step.
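
If the /sidekiq page itself is too sluggish to use, you can also ask Sidekiq for its counters from inside the container; a rough sketch, assuming the standard install layout:

cd /var/discourse
./launcher enter app
# prints the number of enqueued and retrying jobs
rails runner "require 'sidekiq/api'; s = Sidekiq::Stats.new; puts s.enqueued; puts s.retry_size"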

2 Likes

OK, so I was able to access Sidekiq. Can you guys help me understand this and suggest any optimizations? I'm in quite a fix here due to these performance issues.

The behaviour I see is that the server keeps showing an empty queue. Even when I open a post to try to see it get listed, the Sidekiq portal also jams up while the post is loading and only refreshes once the post has fully loaded.

And once it has refreshed, it again shows an empty queue. Any help/suggestions would be highly appreciated.

If the queue is empty, then your problem isn't a pile of background jobs running, so it's something else.

Do you have any plugins? Do you have theme components that are making lots of api calls?

I asked some other questions above.

How big is your database?
How many images?
Do you have theme components that are making lots of api calls?

Can you let me know how I can figure out the above info from a docker-based setup? I know that the last backup is about 135 MB.

As for plugins, yes we have these plugins listed:

  - git clone https://github.com/discourse/docker_manager.git
  - git clone https://github.com/jonmbake/discourse-ldap-auth.git
  - git clone https://github.com/discourse/discourse-math
  - git clone https://github.com/discourse/discourse-chat-integration.git
  - git clone https://github.com/discourse/discourse-voting.git
  - git clone https://github.com/unfoldingWord-dev/discourse-mermaid.git
  - git clone https://github.com/discourse/discourse-solved.git
  - git clone https://github.com/discourse/discourse-assign.git
  - git clone https://github.com/discourse/discourse-knowledge-explorer.git
  - git clone https://github.com/discourse/discourse-cakeday.git
I recommend removing the mermaid plugin.
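
In case you haven't done it before: on a standard install, removing a plugin is just deleting its git clone line from the hooks section of containers/app.yml and rebuilding (rough sketch; the rebuild takes the site offline for a few minutes):

cd /var/discourse
# edit containers/app.yml and delete the discourse-mermaid git clone line, then:
./launcher rebuild app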

How many posts and users do you have? Traffic?

How much ram?

It looks like you’d be fine with a 2 GB DigitalOcean droplet; you might spin one up and see how that works.

Maybe there is some other issue with your server? Is it up to date? Has it been rebooted lately?
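
For the "something else wrong with the server" angle, a few generic host-level checks (nothing Discourse-specific here):

uptime                     # load average and time since last reboot
free -h                    # memory and swap actually in use
df -h                      # a nearly full disk will hurt postgres
docker stats --no-stream   # per-container CPU and memory usage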

Ok, I’ll remove that.

We have approx 4k posts and around 350 users.

The number of users logged in at the same time is not very high, maybe 5-10 at most on average.

This server was set up recently and is shared; it has 8 GB of RAM and 10 GB of swap space. It has only been up for 13 days at present, but the performance issues persist regardless of reboots and uptime.

3 Likes

There is definitely something wrong with your install; you should be getting much better performance with this hardware.

Try having postgres do an explicit vacuum. If you’re running the all-in-one container install:

# docker exec -it -u postgres app psql discourse
psql (13.1 (Debian 13.1-1.pgdg100+1))
Type "help" for help.

discourse=# VACUUM ANALYZE;
VACUUM
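
If you'd rather not open an interactive psql session, the same thing should work as a one-liner from the host (assuming the container is named app):

docker exec -u postgres app psql discourse -c "VACUUM ANALYZE;"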

How many unicorn workers do you have set in your app.yml?

You can ask Discourse to set additional performance headers in responses with the following in your env section:

DISCOURSE_ENABLE_PERFORMANCE_HTTP_HEADERS: true
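
For context, that goes in the env section of containers/app.yml, next to the unicorn setting; roughly like this (values are illustrative, and env changes need a ./launcher rebuild app to take effect):

env:
  ## roughly two unicorn workers per CPU core; discourse-setup picks a default
  UNICORN_WORKERS: 8
  ## adds extra timing/debug headers to responses
  DISCOURSE_ENABLE_PERFORMANCE_HTTP_HEADERS: true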

While you’re at it, you can enable miniprofiler by following this post.

5 Likes

That should be quite enough.

I can’t remember whether it was already suggested that you re-run discourse-setup to adjust Discourse’s memory settings, or whether the current defaults are reasonable given whatever else is using the server.

If you didn’t re-index the database after the PG13 upgrade, then you might have a look at the PostgreSQL 13 update topic for some information about that.

2 Likes

Oh yes, lack of table statistics (VACUUM ANALYZE) is the most likely culprit here.

2 Likes

VACUUM FULL VERBOSE;

REINDEX DATABASE discourse;

VACUUM VERBOSE ANALYZE;

So I ran these commands and set the header in env too, but I don’t see much of a difference in page loading time.

I’m running 8 unicorns.


:frowning:

You ran those commands in postgres, right?

1 Like

Yes, I ran docker exec -it -u postgres app psql discourse before I ran the above commands.

1 Like

Well, that’s all very strange. No one else has had such problems. You seem to have enough hardware. My only guess is some issue with a reverse proxy (I guess you’ve got other stuff on the machine?).
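
One way to narrow that down: time a request through the full stack versus one that hits the container directly on the host, and see whether the proxy adds much. A rough sketch (your-forum.example.com is a placeholder, and the localhost port depends on how the container is exposed):

# time-to-first-byte through the reverse proxy
curl -s -o /dev/null -w "ttfb: %{time_starttransfer}s  total: %{time_total}s\n" https://your-forum.example.com/
# same request against whatever the app container answers on locally
curl -s -o /dev/null -w "ttfb: %{time_starttransfer}s  total: %{time_total}s\n" http://localhost/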

Yes, another docker-based service, but nothing really performance-intensive at all; that would have shown up in the performance metrics of the machine.

2 Likes