How to resolve HTTP 502 errors?

I set up Discourse on an instance with 8 GB of memory and a quad-core processor on Microsoft Azure, and I have been running into HTTP 502 errors lately. In particular, when about 200 to 300 registered users visit at the same time, the server responds with HTTP 502. The server logs show that memory and CPU usage are high (above 90%) during peak times. I guess that is because each connection takes up too many resources, but I don’t know how to optimize this.

So I wonder what I can do to resolve this, other than upgrading the hardware. Thank you for any advice! :slight_smile:

How was the server built? Did you use an image, or follow the cloud install guide?

How big is your database? You should have enough RAM to hold the entire database in cache.

There are settings in the yml file for tuning memory usage. If you run discourse-setup, it attempts to set them, but you might need to tweak them.

Were you running Discourse somewhere else with similar traffic without having these problems?

If it is a very large database, that advice changes to “most” of the database in cache. At least half.

I built the server by following the guide.

I have not tried building another server, but when I reduced the server’s RAM to 4 GB, things got worse.

What happens if you have 16 GB of RAM?

My purse would explode, I guess. :smile:
Which yml files do I need to modify to improve this situation?

It’s impossible to tell with no information about the size of your database or how much traffic you have.

If you run ./discourse-setup, it will attempt to set the various memory tweaks sanely. Or you can read what’s in app.yml and do it by hand.

There are other things you can do, like enabling a CDN and/or moving uploads to S3.

@pfaffman Thanks for your advice.
I have set ‘db_shared_buffers’ to 4096MB in app.yml and rebuilt. The problem seems to occur less frequently after that change, but it is still not solved.

postgres=# select pg_database_size('discourse');
 pg_database_size 
------------------
      10433590759
(1 row)

If my calculations are correct, the size of the database is about 9.72 GB.
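The conversion can be checked with a few lines of Python. This is only a sketch of the arithmetic: the byte count is taken from the query output above, the 4096 MB figure is the `db_shared_buffers` value from app.yml, and the helper name `bytes_to_gib` is made up for illustration. PostgreSQL also relies on the operating system's page cache, so the ratio is a rough indicator at best.

```python
def bytes_to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB (1 GiB = 1024**3 bytes)."""
    return n_bytes / 1024**3


db_size_bytes = 10433590759   # result of pg_database_size('discourse')
shared_buffers_mib = 4096     # db_shared_buffers value set in app.yml

db_size_gib = bytes_to_gib(db_size_bytes)
shared_buffers_gib = shared_buffers_mib / 1024

print(f"database size:  {db_size_gib:.2f} GiB")
print(f"shared buffers: {shared_buffers_gib:.2f} GiB")
# Share of the database that would fit in shared buffers alone.
print(f"coverage:       {shared_buffers_gib / db_size_gib:.0%}")
```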

The last time the HTTP 502 error occurred, the server received more than 270,000 requests from 789 IPs in 2 hours.
I got these two numbers using the commands below:

root@jf2-app:/var/log/nginx# awk '$1 >="[02/Mar/2019:01:30:00" && $1 <= "[02/Mar/2019:03:30:00"' access.log.1 | awk {'print $4'} | sort | uniq -c | wc -l              
789
root@jf2-app:/var/log/nginx# awk '$1 >="[02/Mar/2019:01:30:00" && $1 <= "[02/Mar/2019:03:30:00"' access.log.1 | wc -l
271516

At this scale I recommend dedicated PG resources. Is there any reason you are not running your database on a dedicated instance or managed PG?

I am pretty sure the q23 instance has a DB this big and they are not on dedicated PG.

I was thinking about the load here rather than the DB size. 135k reqs an hour is probably a lot, especially if it peaks at times at, say, 1000/sec due to the way traffic works.
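To put rough numbers on this: the two-hour total quoted earlier gives the average rate, while a per-second count over the log would show the real peak. A minimal sketch, assuming each log line starts with a timestamp token such as `[02/Mar/2019:01:30:00` (as the awk filter above implies); `peak_per_second` and the sample lines are hypothetical.

```python
from collections import Counter

total_requests = 271516          # from `wc -l` over the 2-hour window
window_seconds = 2 * 60 * 60

avg_per_hour = total_requests / 2
avg_per_second = total_requests / window_seconds
print(f"{avg_per_hour:.0f} req/hour, {avg_per_second:.1f} req/s on average")


def peak_per_second(lines):
    """Return (timestamp, count) for the busiest second in the log lines.

    Assumes the first whitespace-separated field is the timestamp,
    e.g. "[02/Mar/2019:01:30:00". Raises IndexError on empty input.
    """
    counts = Counter(line.split(maxsplit=1)[0] for line in lines if line.strip())
    return counts.most_common(1)[0]


# Hypothetical sample; in practice, pass an open log file such as access.log.1.
sample = [
    '[02/Mar/2019:01:30:00 +0000] "example.com" 10.0.0.1 "GET / HTTP/1.1"',
    '[02/Mar/2019:01:30:00 +0000] "example.com" 10.0.0.2 "GET /latest HTTP/1.1"',
    '[02/Mar/2019:01:30:01 +0000] "example.com" 10.0.0.1 "GET /top HTTP/1.1"',
]
print(peak_per_second(sample))
```

If the busiest second is far above the average, the 502s are more likely tied to short bursts than to sustained load.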

Though step 0 here, @xuchao_007: get a CDN in place so you can get a good breakdown of dynamic vs static requests. It is unclear how much of this traffic is serving images vs the server doing work.
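Until a CDN is in place, a rough version of that breakdown could be estimated from the existing access log. The sketch below rests on two assumptions that should be checked against the real log: that each line contains the request as `"METHOD /path HTTP/x.y"`, and that paths under `/assets/`, `/uploads/` and `/images/` are static files. The function names and sample lines are hypothetical.

```python
import re
from collections import Counter

# Assumption: the request appears in the log line as "METHOD /path HTTP/x.y".
REQUEST_RE = re.compile(
    r'"(?:GET|POST|PUT|DELETE|HEAD|PATCH|OPTIONS) (\S+) HTTP/[\d.]+"'
)

# Assumption: paths under these prefixes are static files a CDN could serve.
STATIC_PREFIXES = ("/assets/", "/uploads/", "/images/")


def classify(line):
    """Return "static", "dynamic", or None if no request is found in the line."""
    match = REQUEST_RE.search(line)
    if match is None:
        return None
    path = match.group(1)
    return "static" if path.startswith(STATIC_PREFIXES) else "dynamic"


def breakdown(lines):
    """Count static vs dynamic requests across log lines."""
    return Counter(kind for kind in map(classify, lines) if kind is not None)


# Hypothetical sample; in practice, pass an open log file such as access.log.1.
sample = [
    '[02/Mar/2019:01:30:00 +0000] "example.com" 10.0.0.1 "GET /assets/app.js HTTP/1.1" 200',
    '[02/Mar/2019:01:30:00 +0000] "example.com" 10.0.0.2 "GET /uploads/default/a.png HTTP/1.1" 200',
    '[02/Mar/2019:01:30:01 +0000] "example.com" 10.0.0.1 "GET /latest.json HTTP/1.1" 200',
]
counts = breakdown(sample)
total = sum(counts.values())
for kind, n in counts.items():
    print(f"{kind}: {n} ({n / total:.0%})")
```

A large static share would suggest that a CDN alone removes much of the load; a mostly dynamic share points back at the application and database.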
