Troubleshooting severe performance issues with latest Discourse?

Sirshad · March 12, 2021, 10:40am

Hi All,

We upgraded Discourse on our CentOS 7 server from 2.2.2 to 2.7.0.beta4, and since then pages have been loading very slowly, especially pages with database or image content. It has reached the point where the site is unusable.

Any guidance in this regard would be much appreciated.

pfaffman · March 12, 2021, 11:22am

A bunch of things happened in the past few years. There was a big change that requires processing all of the images. I suspect that your server is slammed doing that work. You can have a look at /sidekiq to see the queue.

How big is your database? How many images? What does sidekiq show? You’re using SSD, right?

Sirshad · March 30, 2021, 3:03pm

It's a VM-based server, so I'm not sure whether it's on SSD or not.
I can't access sidekiq. This deployment wasn't done by me, so I'm not sure how to get to it.

pfaffman · March 30, 2021, 5:16pm

Do you know how it was installed? It sounds like it’s not a standard install (else, /sidekiq would be available to you if you were an admin).

supermathie · March 30, 2021, 6:13pm

Your best path forward is to investigate why performance is down. There are a lot of background jobs added over the years (image optimization, rebaking, etc.) that are probably now running and using your server resources. Once those complete, performance should improve.

Accessing /sidekiq (using an admin account!) to discover what jobs are running is a great first step.

Sirshad · March 31, 2021, 10:49am

OK, so I was able to access Sidekiq. Can you guys help me understand this and suggest any optimizations? I am in quite a fix here due to these performance issues.

Sirshad · March 31, 2021, 11:28am

The behaviour I see is that the server keeps showing this empty queue. Even when I open a post to see whether a job gets listed, the Sidekiq page also jams up while the post is loading and only refreshes once the post has fully loaded.

Also once again when it’s loaded it shows an empty queue. Any help/suggestions would be highly appreciated.

pfaffman · March 31, 2021, 12:11pm

If the queue is empty then you don’t have the problem that many background jobs are running. So it’s something else.

Do you have any plugins? Do you have theme components that are making lots of api calls?

I asked some other questions above.

Sirshad · March 31, 2021, 12:52pm

How big is your database?
How many images?
Do you have theme components that are making lots of api calls?

Can you let me know how I can find out the above info on a Docker-based setup? I know that the last backup is 135 MB.

As for plugins, yes we have these plugins listed:

      - git clone https://github.com/discourse/docker_manager.git
      - git clone https://github.com/jonmbake/discourse-ldap-auth.git
      - git clone https://github.com/discourse/discourse-math
      - git clone https://github.com/discourse/discourse-chat-integration.git
      - git clone https://github.com/discourse/discourse-voting.git
      - git clone https://github.com/unfoldingWord-dev/discourse-mermaid.git
      - git clone https://github.com/discourse/discourse-solved.git
      - git clone https://github.com/discourse/discourse-assign.git
      - git clone https://github.com/discourse/discourse-knowledge-explorer.git
      - git clone https://github.com/discourse/discourse-cakeday.git
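
One way to answer the database size and image count questions on a Docker-based install is to ask Postgres directly. The following is only a rough sketch: it assumes the container is named app and the database discourse (as in a standard all-in-one install), and that the uploads, posts, and users tables hold what their names suggest. The helper names are made up for illustration.

```python
import subprocess

# Assumption: these table names match the Discourse schema on your version.
QUERIES = {
    "database size": "SELECT pg_size_pretty(pg_database_size(current_database()));",
    "uploads": "SELECT count(*) FROM uploads;",
    "posts": "SELECT count(*) FROM posts;",
    "users": "SELECT count(*) FROM users;",
}


def psql_command(sql, container="app", database="discourse"):
    """Build the docker exec command that runs one query non-interactively."""
    # -A: unaligned output, -t: tuples only, so stdout is just the value.
    return ["docker", "exec", "-u", "postgres", container,
            "psql", "-At", "-c", sql, database]


def collect_stats(run=subprocess.run):
    """Run every query and return {label: value}. Needs access to Docker."""
    stats = {}
    for label, sql in QUERIES.items():
        result = run(psql_command(sql), capture_output=True, text=True, check=True)
        stats[label] = result.stdout.strip()
    return stats
```

Calling collect_stats() on the Docker host (as a user allowed to run docker) should return a small dictionary with the four values, which covers most of the sizing questions asked above.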

pfaffman · March 31, 2021, 1:58pm

I recommend removing the mermaid plugin.

How many posts and users do you have? Traffic?

How much RAM?

It looks like you’d be fine with a 2 GB DigitalOcean droplet; you might spin one up and see how that works.

Maybe there is some other issue with your server? Is it up to date? Has it been rebooted lately?

Sirshad · March 31, 2021, 2:37pm

Ok, I’ll remove that.

We have approx 4k posts and around 350 users.

The number of users logged in at the same time is not very high, maybe 5-10 at most.

This server was set up recently and is shared, with 8 GB of RAM and 10 GB of swap space. It has been up for just 13 days at present, but the performance issues occur regardless of reboots and uptime.

supermathie · March 31, 2021, 3:14pm

There is definitely something wrong with your install; you should be getting much better performance with this hardware.

Try having postgres do an explicit vacuum. If you’re running the all-in-one container install:

# docker exec -it -u postgres app psql discourse
psql (13.1 (Debian 13.1-1.pgdg100+1))
Type "help" for help.

discourse=# VACUUM ANALYZE;
VACUUM

How many unicorn workers do you have set in your app.yml?

You can ask Discourse to set additional performance headers in responses with the following in your env section:

DISCOURSE_ENABLE_PERFORMANCE_HTTP_HEADERS: true
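
Once the headers are enabled, a small script can compare the wall-clock time of a request with whatever the server reports about itself. This is a minimal sketch, not an official tool: the rule for spotting timing headers (X-Runtime, or any X-* header whose name ends in -Time) is an assumption, so check which headers your server actually returns.

```python
import time
import urllib.request


def timing_headers(headers):
    """Pick out response headers that look like server-side timings.

    `headers` is any iterable of (name, value) pairs. The matching rule
    is an assumption, not a documented list of Discourse headers.
    """
    timings = {}
    for name, value in headers:
        key = name.lower()
        if key == "x-runtime" or (key.startswith("x-") and key.endswith("-time")):
            try:
                timings[key] = float(value)
            except ValueError:
                continue  # value is not a number, so skip it
    return timings


def measure(url):
    """Fetch one page; return (wall-clock seconds, server-side timings)."""
    start = time.perf_counter()
    with urllib.request.urlopen(url) as response:
        response.read()
        headers = list(response.headers.items())
    return time.perf_counter() - start, timing_headers(headers)
```

If measure() on a slow topic URL shows a wall-clock time far above the server-side numbers, the delay is probably outside the application itself (proxy, network, or the browser), which would narrow things down.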

While you’re at it, you can enable miniprofiler by following this post.

pfaffman · March 31, 2021, 8:16pm

That should be quite enough.

I can’t remember whether it was suggested that you re-run discourse-setup to adjust Discourse’s memory usage, or whether those defaults are reasonable given whatever else is using the server.

If you didn’t re-index the database after the PG13 upgrade, then you might have a look at PostgreSQL 13 update for some information about that.

riking · March 31, 2021, 8:39pm

Oh yes, lack of table statistics (VACUUM ANALYZE) is the most likely culprit here.
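
To check whether statistics are actually missing, Postgres records the last analyze time per table in pg_stat_user_tables. Below is a rough sketch that parses psql output and lists tables with no recorded ANALYZE. The helper name and the idea of piping psql output into it are my own for illustration; the container and database names assume the standard install.

```python
# Run this query with unaligned, tuples-only output, for example:
#   docker exec -u postgres app psql -At -c "<STATS_SQL>" discourse
# and pass the resulting text to never_analyzed().
STATS_SQL = (
    "SELECT relname, n_live_tup, COALESCE(last_analyze, last_autoanalyze) "
    "FROM pg_stat_user_tables ORDER BY n_live_tup DESC;"
)


def never_analyzed(psql_output):
    """Return names of tables with no recorded ANALYZE.

    Expects `psql -At` output: one row per line, fields separated by '|',
    with NULL shown as an empty field.
    """
    missing = []
    for line in psql_output.splitlines():
        if not line.strip():
            continue
        relname, _live_rows, analyzed_at = line.split("|", 2)
        if not analyzed_at:
            missing.append(relname)
    return missing
```

An empty list after running VACUUM ANALYZE would suggest that missing statistics are no longer the problem and that the slowness comes from somewhere else.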

Sirshad · April 1, 2021, 12:03pm

VACUUM FULL VERBOSE;

REINDEX DATABASE discourse;

VACUUM VERBOSE ANALYZE;

So I ran these commands and set the header in env too, but I do not see much of a difference in page loading time.

I’m running 8 unicorns.

Sirshad · April 1, 2021, 12:09pm

pfaffman · April 1, 2021, 12:10pm

You ran those commands in postgres, right?

Sirshad · April 1, 2021, 12:17pm

Yes, I ran docker exec -it -u postgres app psql discourse before I ran the above commands.

pfaffman · April 1, 2021, 12:20pm

Well, that’s all very strange. No one else has had such problems. You seem to have enough hardware. My only guess is some issue with a reverse proxy (I guess you’ve got other stuff on the machine?).

Sirshad · April 1, 2021, 12:25pm

Yes, another Docker-based service.
But nothing really performance-intensive at all, since that would have shown up in the performance metrics of the machine.
