Optimizing the number of Unicorns and buffer size

I know that the Discourse setup provides sane defaults for most occasions nowadays, but I still want to ask for some guidance on optimal settings. It looks like we will get a hosting partner that provides us with the server, and what is on the table is this VPS:

  • 6x vCore CPUs (Xeons @ 2.40GHz, I believe)
  • 8GB of RAM

My primary goal is to tweak the settings for maximum capability of handling traffic spikes, i.e. concurrent users. How would you experts set the number of unicorns and the size of the database buffer?

4 Likes

The usual, which is

  • DB gets 1/4 of memory
  • Up to two unicorns per “real” CPU core (I’d limit to 8 for now though)
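For reference, in a standard discourse_docker install both knobs live in `containers/app.yml`. A sketch with illustrative values for the 6-core / 8GB host above (the exact surrounding keys depend on your template):

```yaml
# containers/app.yml -- illustrative values, not drop-in config
params:
  db_shared_buffers: "2GB"   # 1/4 of the 8GB RAM
env:
  UNICORN_WORKERS: 8         # up to 2 per real core; start conservative
```

After editing, `./launcher rebuild app` applies the change.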
4 Likes

Found the limits of our setup yesterday, with about 400+ concurrent sessions and many of them being logged in, actively polling and chatting. The day before we delivered 138k page views without a hitch.

The bottleneck was the number of unicorns (8). We only reached about 35% CPU load and there was 2GB of free RAM with these settings. I changed to 10 unicorns, after which we had no more hiccups and served a peak of 440 sessions, and got CPU loads closer to 50%.

But then it was getting late and things cooled down and I wasn’t able to do any more live performance testing with max load.

Anyway, with UpCloud's 6-core / 8GB RAM plan (or similar) I would say that 10 unicorns is better than 8. I'll be looking at the amount of free RAM and let's see if even 11-12 would be feasible. Ping @mpalmer?

5 Likes

With these changes the amount of free memory has reduced from 2GB to 1GB, so roughly -0.5 gigs per unicorn added.

Is there a recommendation for the amount of RAM to keep free, thus available for Linux to use as disk cache? I have 1/4 of the RAM (=2GB) allocated to the database buffer.

Increasing the number of unicorn workers to suit your CPU and RAM capacity is perfectly reasonable. The “two unicorns per core” guideline is a starting figure. CPUs differ (wildly) in their performance, and VPSes make that even more complicated (because you can never tell who else is on the box and what they’re doing with the CPU), so you start conservative, and if you find that you’re running out of unicorns before you’re running out of CPU and RAM, then you just keep increasing the unicorns.

As far as disk cache goes, you need as much as you need. As you increase RAM consumption, keep an eye on your IOPS graphs, and particularly the percentage utilisation of the disks (what sysstat refers to as %util). There are two points you want to be aware of: when your read IOPS start to stay persistently higher (that means that the working set of disk pages no longer fits in memory – you may or may not have hit that already), and when peaks in disk utilisation start to get close to 100%. The former tells you when your RAM consumption is starting to impact performance, and the latter tells you when you're starting to saturate the disks. You want to consume RAM up to the first point, and you can drive it as far as you're comfortable towards the second point, depending on your tolerance for performance degradation (keep an eye on your service times!).
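If you'd rather script this than eyeball `iostat -x` output, the raw counters behind %util live in `/proc/diskstats`. A minimal Python sketch (my own illustration, not anything built into Discourse) that samples them twice and derives read/write IOPS and approximate utilisation per device:

```python
import time

def disk_stats():
    """Parse /proc/diskstats into {device: (reads, writes, io_ms)}.

    Field layout per kernel docs: [3]=reads completed,
    [7]=writes completed, [12]=ms spent doing I/O.
    """
    stats = {}
    with open("/proc/diskstats") as f:
        for line in f:
            fields = line.split()
            name = fields[2]
            stats[name] = (int(fields[3]), int(fields[7]), int(fields[12]))
    return stats

def sample_iops(interval=1.0):
    """Per-device read/write IOPS and %util over `interval` seconds."""
    before = disk_stats()
    time.sleep(interval)
    after = disk_stats()
    result = {}
    for dev, (r1, w1, t1) in after.items():
        r0, w0, t0 = before.get(dev, (r1, w1, t1))
        result[dev] = {
            "read_iops": (r1 - r0) / interval,
            "write_iops": (w1 - w0) / interval,
            # time spent doing I/O divided by wall time ~= iostat's %util
            "util_pct": min(100.0, (t1 - t0) / (interval * 1000) * 100),
        }
    return result
```

Watching `read_iops` climb persistently is the "cache no longer fits" signal described above; `util_pct` peaking near 100 is the saturation signal.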

Another RAM-related metric to keep an eye on is swap rate. That’s not how much swap space is being used (as long as your swap partition isn’t full, it doesn’t matter), but the number of pages per second being written to and read from swap. Swap writes are fine, but if the system is constantly swapping pages in and out, even only a few per second, you probably want to back off on your RAM usage a bit.
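The pages-per-second figure (as opposed to swap space used) is what `vmstat`'s `si`/`so` columns report; the counters come from `pswpin`/`pswpout` in `/proc/vmstat`. A small sketch in the same vein as above, again just an illustration:

```python
import time

def swap_rate(interval=1.0):
    """Pages swapped in/out per second, from /proc/vmstat counters."""
    def read():
        vals = {}
        with open("/proc/vmstat") as f:
            for line in f:
                key, _, val = line.partition(" ")
                vals[key] = int(val)
        # keys are absent on kernels built without swap support
        return vals.get("pswpin", 0), vals.get("pswpout", 0)

    in0, out0 = read()
    time.sleep(interval)
    in1, out1 = read()
    return {"swapin_per_s": (in1 - in0) / interval,
            "swapout_per_s": (out1 - out0) / interval}
```

Sustained non-zero values here, even just a few pages per second, are the "back off on RAM usage" signal.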

Just to keep you on your toes, swap activity counts towards disk IOPS, too, so your disk utilisation will likely go loco bananas when you start to run out of RAM, due to both extra disk reads (because the cache isn’t big enough) and increased swap activity, because the working set of memory pages doesn’t fit any more. That’s a recipe for performance disaster right there.

15 Likes

Thanks Matt. Crystal clear and informative, as usual.

Here is some more nerd porn. Our API requests for this year. This is what it looks like when there is a hockey league transfer deadline on the 15th of Feb at 23:59 local time.

3 Likes

What tools are you using for monitoring the API requests shown in your image?

If I’m not mistaken, that’s Discourse’s built in report visualizer. Go to yoursite.com/admin, then click any of the statistic titles.

8 Likes

Bump.

So are you saying 8 unicorns is the upper limit on any server? I have a server with 8x CPU and was thinking of changing the config to use 16?

Maybe, it depends how much traffic you have. 16 unicorns sitting around doing nothing and waiting for incoming requests would not get you much.

1 Like

800k-1.2m pageviews/month. Discourse runs buttery smooth most of the time but then if we ever get 300 concurrent users it grinds to a halt.

Already have db_shared_buffers configured to 25% of the host's 32GB RAM, looking for more possible fixes.

1 Like

Sounds like it could help. Please report back.

Done. We’ll see, not expecting another traffic spike for a while.

I am currently using 10 unicorns (IIRC) on my 6-core instance at UpCloud. This gets me past 400 concurrents. We’ll see in February what it takes to choke this.

2 Likes

Neat. I guess I’m just wondering if there’s a tipping point where having a too high number of unicorns is actually detrimental.

1 Like

You know, there is only one way to find out … measure.

3 Likes

Bump.

Forum appears to do much better under load but live reloading of posts and updating of notifications is sluggish.

Do you use an external reverse proxy or cloudflare?

Cloudflare yes

We had several reports of people using Cloudflare’s proxy feature and having problems with our live updates. Some people claim disabling brotli support helps, but we recommend using Cloudflare in DNS-only mode (gray cloud).

5 Likes