How much is Discourse affected by a faster CPU?


(Jon Rurka) #7

My results using the same tests on a $8.99/month VPS from quickpacket:

CPU benchmark:

single thread: 28.8s
multi thread:  18.9s

Disk benchmark:

10.6 k requests completed in 9.82 s, 41.3 MiB read, 1.08 k iops, 4.21 MiB/s
generated 10.6 k requests in 10.0 s, 41.3 MiB, 1.06 k iops, 4.13 MiB/s

(Jay Pfaffman) #8

Have you got details on how someone else might get this deal? I remember something about this a while back. The $15/month requires that you first drop a few hundred bucks on the box, right?

Still, I’m interested. Where to get box? How to arrange colocation?


(Matt Palmer) #9

Yes, swap activity is the best baseline measure, as in “if you see consistent swap activity, you definitely don’t have enough memory”. You can get a bit more performance by ensuring you also have enough memory to store the entire working set of disk pages in memory (I’ve talked about this before, specifically the paragraph that starts, “As far as disk cache goes”), but definitely if you’re swapping, your performance is going to be viciously destroyed.
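One way to check for "consistent swap activity" on Linux is to sample the cumulative `pswpin` and `pswpout` counters in `/proc/vmstat` twice and look at the difference. The sketch below assumes a Linux host; the function names are made up for illustration and are not part of any Discourse tooling.

```python
def parse_swap_counters(vmstat_text):
    """Extract cumulative swap-in/swap-out page counts from /proc/vmstat text."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    return counters


def swap_delta(before, after):
    """Pages swapped in and out between two snapshots of the counters."""
    return {key: after[key] - before[key] for key in ("pswpin", "pswpout")}


def read_swap_counters(path="/proc/vmstat"):
    """Read the live counters (Linux only; the path is the standard procfs one)."""
    with open(path) as handle:
        return parse_swap_counters(handle.read())
```

Take one snapshot with `read_swap_counters()`, wait a while under normal load, take another, and pass both to `swap_delta`. Deltas that stay above zero across repeated samples suggest the box is short on memory.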

BTW, for anyone comparing the costs and thinking, “OMFG that’s not worth it”, consider that the colocated server is significantly more powerful than the droplet (8x RAM, etc). The closest equivalent droplet is $160/month, so if you were replacing that droplet size with a colo box you’d make your money back in about five months… sure, time is money, etc etc, but stable hardware doesn’t take that much time to keep an eye on.
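The break-even arithmetic can be sketched as follows. The $160/month droplet price and the five-month payback are from the post, and the $15/month colocation fee is the figure quoted elsewhere in the thread. The up-front hardware cost is not stated here, so the code only derives the cost that those numbers imply; treat it as an inference, not a quoted price.

```python
def monthly_saving(cloud_monthly, colo_monthly):
    """Money saved each month by running the colo box instead of the droplet."""
    return cloud_monthly - colo_monthly


def implied_upfront_cost(cloud_monthly, colo_monthly, payback_months):
    """Up-front spend that would be recovered in exactly `payback_months`."""
    return monthly_saving(cloud_monthly, colo_monthly) * payback_months


def payback_months(upfront_cost, cloud_monthly, colo_monthly):
    """Months until the up-front spend is recovered by the monthly saving."""
    return upfront_cost / monthly_saving(cloud_monthly, colo_monthly)


# $160/month droplet vs. $15/month colo, paid back in about five months
implied = implied_upfront_cost(160, 15, 5)  # 725
```

Plugging in your own hardware price with `payback_months` gives the break-even point for a different box or a different droplet size.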


(Sergio Castiñeyras) #10

you want the fastest possible single threaded performance

To compensate for the crappiness of RoR on multithreading


(Jeff Atwood) #11

The multithreading is fine, that covers concurrent requests. But no individual request will go through any faster.
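A toy model of that distinction, assuming every request takes the same fixed time and each worker handles one request at a time. The 200 ms figure is invented for illustration; it is not a Discourse measurement.

```python
import math


def batch_time_ms(num_requests, workers, ms_per_request):
    """Wall-clock time to finish a batch of identical requests."""
    rounds = math.ceil(num_requests / workers)
    return rounds * ms_per_request


# Eight concurrent requests at a hypothetical 200 ms each:
one_worker = batch_time_ms(8, 1, 200)    # 1600
four_workers = batch_time_ms(8, 4, 200)  # 400

# A lone request is no faster with more workers; only a faster core helps.
single_request = batch_time_ms(1, 4, 200)  # 200
```

More workers shorten the batch, but the single request still costs the full per-request time, which is why single-threaded speed is what a user waiting on one page actually feels.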


(Jeff Atwood) #12

Some command line rebuild numbers:

cd /var/discourse
git pull
./launcher rebuild app

I did this in two consoles, triggered by pressing enter at the exact same time on the last command. I stopped the clock when the command line prompt returned from the rebuild.

Digital Ocean droplet

(different droplet, but I verified same E5-2630L CPU as in first post, 2GB $20/month droplet)

37:45 → 46:51 = 9 minutes, 6 seconds (546 seconds)

Ali Express box

37:45 → 41:38 = 3 minutes, 53 seconds (233 seconds)


So a rebuild is 2.3x faster.
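For anyone who wants to check the arithmetic, here is a small sketch that turns the mm:ss clock readings above into elapsed seconds and a speedup ratio. It assumes both readings fall within the same hour, as they do here.

```python
def to_seconds(mm_ss):
    """Convert a 'mm:ss' clock reading into seconds past the hour."""
    minutes, seconds = mm_ss.split(":")
    return int(minutes) * 60 + int(seconds)


def elapsed(start, end):
    """Seconds between two 'mm:ss' readings taken within the same hour."""
    return to_seconds(end) - to_seconds(start)


droplet = elapsed("37:45", "46:51")  # 546 (Digital Ocean droplet)
mini_pc = elapsed("37:45", "41:38")  # 233 (Ali Express box)
speedup = droplet / mini_pc          # about 2.3
```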

I know @sam has been working on rebuild speed improvements for the last 2 days so I thought he might be interested as well.


(Jay Pfaffman) #13

That sounds pretty darn good.

FWIW,

time ./launcher rebuild app

will save you from using your watch. :wink:
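If you want the timing from a script rather than the shell (say, to log rebuild times across several hosts), a minimal sketch with the standard library looks like this. The `time_command` helper is a made-up name, and the `/var/discourse` path is simply the one used earlier in the thread.

```python
import subprocess
import time


def time_command(argv, cwd=None):
    """Run a command and return (elapsed wall-clock seconds, exit status)."""
    start = time.perf_counter()
    completed = subprocess.run(argv, cwd=cwd)
    elapsed = time.perf_counter() - start
    return elapsed, completed.returncode


# Example call (not executed here, since a rebuild takes minutes):
# elapsed, status = time_command(["./launcher", "rebuild", "app"], cwd="/var/discourse")
```

This measures wall-clock time only, which matches the `real` line that `time` prints.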


#14

Do you mind me asking where you colocate a server for $15/month? Last time I colocated in 2008 or so I was paying $350/month for a 1U server.


(Jeff Atwood) #15

Not really a server per se, a mini pc more analogous to Mac mini hosting services. But the perf as you can see is excellent.


(ljpp) #16

Nice comparison, thanks!

It is worth mentioning that Digital Ocean’s VPSs are on the slow side, when you compare head to head with alternatives that have a similar price tag. There are reputable hosting providers that offer roughly twice as fast single core performance.


(Jeff Atwood) #17

The “name brands” such as Linode and AWS and Azure and Digital Ocean are mostly pretty close in CPU perf. The weirder new providers can be faster – or a whole lot slower.


(Christoph) #18

Which one do you have in mind here? “A lot faster” I can see, also “more cores for less money”, but twice as fast?


(ksec) #19

I think that is one of the sad things: it means that with the current Ruby runtime, or Discourse, that is about as fast as we can get, since we have more or less reached peak single-thread performance.


(Jeff Atwood) #20

The 4.2 GHz Skylake and 4.5 GHz Kaby Lake are significantly faster than this, as they have a bit more cache and of course a clock rate higher than 3.5 GHz. The Ali Express box is 15 W TDP compared to the 90 W TDP of those.

Speed shift aka hardware CPU clock control is better on Kaby Lake as well so it goes faster… faster, when it needs to. It will hit higher turbo more often and quicker.


(Matt Palmer) #21

On the upside, with all the speed improvements which are apparently coming down the pipe in Ruby 3, we’ll automatically get some tidy speed ups. Single-threaded performance almost always has a strong impact on web application performance, whatever the language or framework, because it’s a problem that strongly resists parallelisation. I can’t think of any languages or frameworks that do much, if any, of the page generation in parallel.


(ljpp) #22

Here are a couple of threads with VPSBench results and threads about specific hosting companies.

UpCloud.com is the fastest I have personally tried and they are nearly twice as fast (single core perf, rebuild times). LeaseWeb also performs nicely in terms of CPU speed.


(Mitchell Krog) #23

So I guess these are pretty good then :grinning:


(Jeff Atwood) #24

A good starting point is the /about page; you can also refer to the memory stats I quoted in my earlier reply.


(Jeff Atwood) #25

I tried on a High CPU Digital Ocean droplet:

sysbench --test=cpu --cpu-max-prime=20000 run
sysbench --test=cpu --cpu-max-prime=40000 --num-threads=8 run

                Ali Express mini-pc   High CPU droplet
single thread   21.3s                 24.4s
multi thread    15.7s                 31.6s

(in case it wasn’t obvious, lower numbers are better = faster here)
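To put a number on the gap, the sketch below divides the droplet time by the mini-pc time for each run. It assumes the two rows of results are in the same order as the two sysbench commands (single thread first, then 8 threads); the dictionary keys are labels chosen for illustration.

```python
# Times in seconds, copied from the results above (lower is faster).
results = {
    "single thread": {"mini_pc": 21.3, "droplet": 24.4},
    "multi thread": {"mini_pc": 15.7, "droplet": 31.6},
}


def slowdown(row):
    """How many times longer the droplet took than the mini-pc."""
    return row["droplet"] / row["mini_pc"]


ratios = {name: round(slowdown(row), 2) for name, row in results.items()}
# {'single thread': 1.15, 'multi thread': 2.01}
```

So the droplet takes about 15% longer on the single-threaded run and about twice as long on the multi-threaded one.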

That’s… quite a bit worse than I expected. You are getting a small slice of a 16 core CPU (and it’s only Broadwell, not even Skylake!) which by definition means very low clock speeds, pushing it down to the i7-7500U levels of ~2.6 GHz base and 3.6 GHz turbo.

Factoring in Kaby Lake’s generational improvements… it loses by a fair bit. Per the page, they also offer Skylake Xeons at 2.7 GHz and I think at best that would put it on par (maybe)?


(ksec) #26

That is not entirely true. I have no experience with Azure, but I have heard lots of good things about it in terms of perf. It is the best of the three: Google, AWS and Azure. However, it is still much more expensive than other cloud VPS providers like DO and Linode, mostly on bandwidth. Amazon Lightsail is extremely slow, so don’t even bother with it.

Linode has always offered much better CPU perf, SSD speed, and network bandwidth. DO has managed to catch up in network and SSD, but as far as I can tell Linode still wins on CPU performance on most of the price plans.

It still doesn’t compare well to a dedicated box though.