How much is Discourse affected by a faster CPU?

mitchellk · March 20, 2017, 4:11pm

So I guess these are pretty good then

codinghorror · April 22, 2017, 10:39am

A good starting point is the /about page; you can also refer to the memory stats I quoted in my earlier reply.

codinghorror · August 24, 2017, 9:27pm

I tried on a High CPU Digital Ocean droplet:

sysbench --test=cpu --cpu-max-prime=20000 run
sysbench --test=cpu --cpu-max-prime=40000 --num-threads=8 run

	Ali Express mini-pc	High CPU droplet
1 thread, max-prime 20000	21.3s	24.4s
8 threads, max-prime 40000	15.7s	31.6s

(in case it wasn’t obvious, lower numbers are better = faster here)
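
To put those times in relative terms, here is a quick sketch that divides the droplet's time by the mini-PC's time for each run. The numbers are copied from the table above; the run labels and the `slowdown` helper are only illustrative names, and the mapping of table rows to the two sysbench commands assumes they are listed in the same order.

```python
# Wall-clock times in seconds, copied from the table above.
# Assumption: row 1 is the first sysbench command, row 2 the second.
times = {
    "1 thread, max-prime 20000": {"mini_pc": 21.3, "droplet": 24.4},
    "8 threads, max-prime 40000": {"mini_pc": 15.7, "droplet": 31.6},
}

def slowdown(mini_pc: float, droplet: float) -> float:
    """How many times longer the droplet took than the mini-PC."""
    return droplet / mini_pc

ratios = {run: slowdown(**t) for run, t in times.items()}
for run, ratio in ratios.items():
    print(f"{run}: droplet took {ratio:.2f}x as long")
```

That works out to roughly 1.15x on the single-threaded run and about 2x on the 8-thread run.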

That’s… quite a bit worse than I expected. You are getting a small slice of a 16 core CPU (and it’s only Broadwell, not even Skylake!) which by definition means very low clock speeds, pushing it down to i7-7500U levels of ~ 2.6 GHz base and 3.6 GHz turbo.

https://ark.intel.com/compare/95451,91768

Factoring in Kaby Lake’s generational improvements… it loses by a fair bit. Per the page, they also offer Skylake Xeons at 2.7 GHz and I think at best that would put it on par (maybe)?

ksec · May 30, 2018, 10:59am

That is not entirely true. I have no experience on Azure, but I have heard lots of good things about it in terms of perf. It is the best out of the three: Google, AWS and Azure. However, it is still much more expensive than other cloud VPS providers like DO and Linode, mostly on bandwidth. Amazon Lightsail is extremely slow so don’t even bother with it.

Linode has always offered much better CPU perf, SSD speed, and network bandwidth. DO has managed to catch up in network and SSD, but as far as I can tell Linode still wins on CPU performance on most of the price plans.

It still doesn’t compare well to a dedicated box though.

codinghorror · December 30, 2018, 7:16am

For 2019 I am upgrading my mini colocated host to

i7-8750H (Coffee Lake, 2.2 - 4.1 GHz, 6 core / 12 thread) - $460 on Ali Express
32GB RAM – DDR4 (16GB × 2, $220)
500 GB disk – NVMe PCI Samsung 970 Pro ($200)

CPU comparison here – it’s a big jump in TDP, 2 cores to 6 cores. Increase from 16GB to 32GB of faster DDR4 RAM. Plus a real M.2 NVMe drive, which is considerably faster than the old SATA interface style.

Full specs

Model Number: Partaker B18
Color: Black
CPU: Intel Core i7-8750H Processor (9M Cache, up to 4.10 GHz)
Platform: Coffee Lake, 8th Generation Intel Core i7 Processors
Threads: 6 Core 12 Threads
Graphic: Intel UHD Graphics 630
RAM: DDR4 2133/2400/2666 260pin，2 SO-DIMM Slots, support up to 32GB
Storage: M.2 NVMe 22x80 SSD + M.2 NGFF/NVMe 22x80 SSD + 2.5" HD

WiFi: 2.4G/5G Wifi B/G/N/AC+bluetooth
Operating System: Windows Or Linux
Network Card: 10/100/1000 BaseT LAN
I/O Port: 1xHDMI + 1 Mini DP + 1 Gigabit LAN port + 4 USB 3.0 + 1 Type-C USB port + 1 Audio Port
Power in: 5.5mm plug in
HDMI Output: HDMI Support 4K 24HZ
DP Output: DP Support 4K 60HZ
VGA Output: VGA Support 1080P
Power: In: DC100-240V AC/50-60Hz Out: DC 19V/3.42A, 90W
Operating Temperature: 0°C~80°C (32°F~140°F)
Storage Temperature: -20°C~80°C (-68°F~176°F)
Relative Humidity: 10%~90% (non-condensing)
Thermal Design: Low Noise Fan
Size: 197 x 197 x 40mm
Weight: 1.5kgs
Package Content: Partaker B18 Mini PC, Power adapter, Power Cable, VESA Bracket, Screws

@pfaffman once these are swapped out I’ll mail you my old boxes, which are still trucking along just fine.

deltamotion · January 4, 2019, 12:57am

I hope you left the stubby WiFi antennae on, they are so cute!

codinghorror · January 4, 2019, 12:57am

LOL no, I strip those out. These are going in a datacenter where WiFi would only be a negative.

codinghorror · January 15, 2019, 12:41pm

I got these boxes and they are nifty designs, a great step up in every way. Burning in overnight now.

This is a really nice, extremely compact layout! One thing that confused me is that the DIMMs are split: one goes on top half, the other goes on the bottom half.

Bottom

Top

You may notice in the second pic in each series, I removed stuff as well as installing memory and SSD. I pulled out the WiFi module and antennas, as well as the SATA bottom mount, since I don’t need it. It looks like there’s another full length NVMe port on the top side as well, which is amazing!

CPU benchmarks

This newer version of sysbench produces different numbers; it is not comparable to old versions!

sysbench cpu --cpu-max-prime=20000 run

	events
DO droplet	2988
2017 scooter	4800
2019 scooter	5671

sysbench cpu --cpu-max-prime=40000 --num-threads=8 run

	events
DO droplet	2200
2017 scooter	5588
2019 scooter	14604
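
For a rough sense of scale, the event counts above can be expressed relative to the droplet. This is only a sketch: the counts are copied from the two tables, the `single` and `multi` column names are made up for illustration, and the two columns use different `--cpu-max-prime` values, so ratios are only meaningful within a column.

```python
# Events completed per sysbench run, copied from the two tables above.
events = {
    "DO droplet":   {"single": 2988, "multi": 2200},
    "2017 scooter": {"single": 4800, "multi": 5588},
    "2019 scooter": {"single": 5671, "multi": 14604},
}

def relative_to(baseline: str, column: str) -> dict[str, float]:
    """Each machine's event count divided by the baseline machine's."""
    base = events[baseline][column]
    return {name: row[column] / base for name, row in events.items()}

single = relative_to("DO droplet", "single")
multi = relative_to("DO droplet", "multi")
for name in events:
    print(f"{name}: {single[name]:.2f}x single, {multi[name]:.2f}x multi")
```

The 2019 box comes out at about 1.9x the droplet on the single-threaded run and about 6.6x on the 8-thread run.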

Disk benchmarks

ioping -RD -w 10 .

	iops	MiB/sec
DO droplet	13.7k	53.4
2017 scooter	13.6k	53.2
2019 scooter	14.9k	58.0

dd bs=1M count=512 if=/dev/zero of=test conv=fdatasync
hdparm -Tt /dev/sda

	sequential	cached reads	buffered reads
DO droplet	701 MB/sec	8818 MB/sec	471 MB/sec
2017 scooter	444 MB/sec	12564 MB/sec	505 MB/sec
2019 scooter	1.2 GB/sec	17919 MB/sec	3115 MB/sec

Discourse rebuild times

time ./launcher rebuild app

	real	user	sys
DO droplet	6:59	1.4s	0.89s
2017 scooter	3:41	1.3s	0.85s
2019 scooter	3:24	1.7s	1.2s
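
A small sketch for comparing the `real` column, assuming the values are in minutes:seconds as shown. The `to_seconds` helper is just an illustrative name.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string such as '6:59' to whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

# 'real' column from the table above.
rebuild = {"DO droplet": "6:59", "2017 scooter": "3:41", "2019 scooter": "3:24"}
seconds = {name: to_seconds(t) for name, t in rebuild.items()}

baseline = seconds["DO droplet"]
for name, s in seconds.items():
    print(f"{name}: {s}s, {s / baseline:.0%} of the droplet's rebuild time")
```

The 2019 box rebuilds in 204 seconds against 221 for the 2017 box, a gap of under 10% despite the much larger jump in the CPU and disk numbers.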

Thanks for the tip on the timing command @pfaffman!

AstonJ · January 16, 2019, 4:29am

This looks like a fun topic!

Are these any good?

ioping -RD -w 10 .

474.9 MiB read, 12.6 k iops, 49.3 MiB/s

dd bs=1M count=512 if=/dev/zero of=test conv=fdatasync

537 MB copied, 9.0306 s, 59.5 MB/s

sysbench cpu --cpu-max-prime=20000 run

General statistics:
    total time:                          10.0015s
    total number of events:              5289

sysbench cpu --cpu-max-prime=40000 --num-threads=8 run

General statistics:
    total time:                          10.0044s
    total number of events:              13978
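
If you want to collect these numbers from several machines, the summary can be pulled out with a couple of regular expressions. This is a minimal sketch that assumes the output looks exactly like the "General statistics" block pasted above; other sysbench versions may format it differently, and `parse_general_stats` is a made-up helper name.

```python
import re

# Sample output copied from the post above.
output = """General statistics:
    total time:                          10.0044s
    total number of events:              13978
"""

def parse_general_stats(text: str) -> tuple[float, int]:
    """Return (total time in seconds, total events) from the summary text."""
    total_time = float(re.search(r"total time:\s+([\d.]+)s", text).group(1))
    total_events = int(re.search(r"total number of events:\s+(\d+)", text).group(1))
    return total_time, total_events

total_time, total_events = parse_general_stats(output)
print(f"{total_events / total_time:.0f} events/sec")
```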

time ./launcher rebuild app

real 3m37.140s

user 0m2.268s

sys 0m0.740s

codinghorror · January 16, 2019, 4:49am

No need to post a bunch of details, anything in the ballpark of the mini-PC numbers shown here is very good. As you can see from my 2019 update, you get to a point where even sizable increases in disk and cpu perf produce diminishing returns.

codinghorror · January 18, 2019, 4:33am

Checking CPU scaling of the new box with i7z and stress. Reminder this is the i7-8750H with 6 cores, 12 threads, 45W TDP.

Here’s one task hogging CPU with stress --cpu 1; we definitely see 4100 MHz as promised:

Starting with ~10w power consumption with i7z running but nothing else. Let’s exercise us some cores!

	typical clock	watts
`stress --cpu 1`	4.1 Ghz	30w
`stress --cpu 2`	4.1 Ghz	42w
`stress --cpu 3`	4.0 Ghz	53w
`stress --cpu 4`	3.9 Ghz	65w*
`stress --cpu 5`	3.7 Ghz	65w*
`stress --cpu 6`	3.5 Ghz	65w*
`stress --cpu 12`	3.3 Ghz	65w*

* It is likely more; you can really see the CPU speed limiters kick in because the watt meter peaks at 75w and then quickly dials back down to exactly 65w.
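
A rough way to read the table: how many extra watts each additional busy worker costs before the limiter kicks in, and how much clock speed survives at full load. This sketch uses only the numbers above; the ~10w figure is the approximate baseline mentioned in the post, and the `capped` flag simply mirrors the rows marked with an asterisk.

```python
# (stress workers, typical clock in GHz, watts, capped?) from the table above.
# Rows marked with * in the table sit at the 65w limit, so they are flagged.
readings = [
    (1, 4.1, 30, False),
    (2, 4.1, 42, False),
    (3, 4.0, 53, False),
    (4, 3.9, 65, True),
    (5, 3.7, 65, True),
    (6, 3.5, 65, True),
    (12, 3.3, 65, True),
]
IDLE_WATTS = 10  # approximate baseline with only i7z running, per the post

uncapped = [r for r in readings if not r[3]]
# Extra watts drawn by each additional busy worker, before the cap applies.
steps = [b[2] - a[2] for a, b in zip(uncapped, uncapped[1:])]
first_worker = uncapped[0][2] - IDLE_WATTS
# Clock with 12 workers as a fraction of the single-worker clock.
clock_retained = readings[-1][1] / readings[0][1]

print(first_worker, steps, f"{clock_retained:.0%}")
```

So the first busy core costs about 20w over baseline, each further one adds 11 to 12w until the cap, and with all 12 threads loaded the clock is still about 80% of the single-core turbo.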

Running current-ish versions of mprime jacks this up to 75w though, and overall clock scales down to 3.1 Ghz … those AVX2 extensions are

Power consumption seems to be about 8w totally idle at the Ubuntu login prompt. Not bad at all!

codinghorror · January 30, 2019, 9:29am

Finally got this deployed so discourse.codinghorror.com is live and running on the above hardware.

Falco · January 30, 2019, 11:52am

How are the usual timings on the mini profiler for the latest and a topic page on the new cpu?

codinghorror · January 30, 2019, 12:05pm

Not exactly apples to apples since I did not measure for the old box… but comparing the old numbers from 2017…

topic back button, 113ms avg (prev 206ms)
topic refresh, 179ms avg (prev 127ms)
latest refresh, 131ms avg (prev 140ms)

This is also unfortunately comparing 2017 Discourse with 2019 Discourse… it does look like we regressed a bit on topic refresh perf since then maybe?
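
As a sketch, the three timings above expressed as percentage changes from the 2017 numbers. The `percent_change` helper is an illustrative name, and the comparison carries the same caveats as above: different hardware and different Discourse versions.

```python
# (previous ms from 2017, current ms from 2019), as listed above.
timings = {
    "topic back button": (206, 113),
    "topic refresh": (127, 179),
    "latest refresh": (140, 131),
}

def percent_change(prev: float, now: float) -> float:
    """Signed change relative to the previous value; negative means faster."""
    return (now - prev) / prev * 100

changes = {page: percent_change(*t) for page, t in timings.items()}
for page, pct in changes.items():
    print(f"{page}: {pct:+.0f}%")
```

That is about 45% faster on the back button, about 6% faster on latest refresh, and about 41% slower on topic refresh.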

codinghorror · May 8, 2019, 10:37pm

@pmusaraj improved our build yesterday to skip unused locale compression; here’s what I get with

time ./launcher rebuild app

now on the very same machine:

	real	user	sys
2019 scooter	2:40	0.2s	0.1s

Amazing! That is roughly a 25% improvement!

(I rebuilt twice, because the timing of the first run is affected by the image download time which isn’t technically rebuild time.)
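
Depending on whether you count time saved or extra rebuilds per hour, the improvement comes out a little below or a little above 25%. A quick sketch, assuming the baseline is the 3:24 rebuild measured on the same box in January:

```python
before = 3 * 60 + 24  # 3:24, the earlier 2019 scooter rebuild, in seconds
after = 2 * 60 + 40   # 2:40, the rebuild after the locale change

time_saved = (before - after) / before  # fraction of the old wall-clock time removed
speedup = before / after - 1            # extra rebuilds that fit in the same time

print(f"time saved: {time_saved:.1%}, speedup: {speedup:.1%}")
```

By this arithmetic the rebuild takes about 21.6% less time, which is the same thing as running about 27.5% faster.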

codinghorror · September 3, 2019, 6:30am

@falco just made a new base image so I initiated a rebuild and tested build time again. Before:

	real	user	sys
2019 scooter	2:40	0.2s	0.1s

After

	real	user	sys
2019 scooter	2:34	0.2s	0.1s

codinghorror · December 17, 2019, 12:16am

I think we have a massive regression in rebuild times currently @falco? I’m seeing 5+ minute rebuilds now where it used to be under 3, on these same machines?

Falco · December 17, 2019, 1:17am

Oh, since dependabot was enabled we are doing quite a bit of gem upgrades. I’d say a new image should be created in a few days, after we finish the bulk of those updates. Just a nokogiri gem update can take so long.

Falco · December 19, 2019, 11:28pm

Just rebuilt an instance using a new base image and it went from 5′6″ to 3′11″.

New image will be tested on Meta over the weekend and released next week for everyone.
