How much is Discourse affected by a faster CPU?

(Mitchell Krog) #23

So I guess these are pretty good then :grinning:

(Jeff Atwood) #24

A good starting point is the /about page; you can also refer to the memory stats I quoted in my earlier reply.

(Jeff Atwood) #25

I tried on a High CPU Digital Ocean droplet:

sysbench --test=cpu --cpu-max-prime=20000 run
sysbench --test=cpu --cpu-max-prime=40000 --num-threads=8 run

                                   Ali Express mini-pc   High CPU droplet
--cpu-max-prime=20000              21.3s                 24.4s
--cpu-max-prime=40000, 8 threads   15.7s                 31.6s

(in case it wasn’t obvious, lower numbers are better = faster here)

That’s… quite a bit worse than I expected. You are getting a small slice of a 16-core CPU (and it’s only Broadwell, not even Skylake!), which by definition means very low clock speeds, pushing it down to i7-7500U levels of ~2.6 GHz base and 3.6 GHz turbo.

Factoring in Kaby Lake’s generational improvements… it loses by a fair bit. Per the page, they also offer Skylake Xeons at 2.7 GHz, and I think at best that would put it on par (maybe)?
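To put the table above in relative terms, here is a small Python sketch that works out how much longer the droplet took on each run. The timings are the ones quoted in this post; the dictionary layout and names are only for illustration, and it assumes the two table rows follow the order of the two sysbench commands.

```python
# Timings in seconds as quoted in the post; lower is faster.
# Assumption: row 1 is the single-threaded run, row 2 the 8-thread run.
timings = {
    "prime=20000, 1 thread": {"mini_pc": 21.3, "droplet": 24.4},
    "prime=40000, 8 threads": {"mini_pc": 15.7, "droplet": 31.6},
}


def slowdown(mini_pc: float, droplet: float) -> float:
    """Return how many times longer the droplet took than the mini-PC."""
    return droplet / mini_pc


ratios = {name: slowdown(**t) for name, t in timings.items()}

for name, ratio in ratios.items():
    print(f"{name}: droplet took {ratio:.2f}x as long")
```

On these numbers the droplet is roughly 15% slower single-threaded and takes about twice as long with 8 threads.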

(ksec) #26

That is not entirely true. I have no experience with Azure, but I have heard lots of good things about it in terms of perf. It is the best out of the three: Google, AWS and Azure. However, it is still much more expensive than other cloud VPS providers like DO and Linode, mostly on bandwidth. Amazon Lightsail is extremely slow, so don’t even bother with it.

Linode has always offered much better CPU perf, SSD speed, and network bandwidth. DO has managed to catch up in network and SSD, but as far as I can tell Linode still wins on CPU performance on most of the price plans.

It still doesn’t compare well to a dedicated box though.

(Jeff Atwood) #27

For 2019 I am upgrading my mini colocated host to

  • i7-8750H (Coffee Lake, 2.2 - 4.1 GHz, 6 core / 12 thread) - $460 on Ali Express
  • 32GB RAM – DDR4 (16GB × 2, $220)
  • 500 GB disk – NVMe PCI Samsung 970 Pro ($200)

CPU comparison here – it’s a big jump in TDP, 2 cores to :warning: 6 cores. Increase from 16GB to 32GB of faster DDR4 RAM. Plus a real M.2 NVMe drive, which is considerably faster than the old SATA interface style.

Full specs

Model Number: Partaker B18
Color: Black
CPU: Intel Core i7-8750H Processor (9M Cache, up to 4.10 GHz)
Platform: Coffee Lake, 8th Generation Intel Core i7 Processors
Threads: 6 Core 12 Threads
Graphic: Intel UHD Graphics 630
RAM: DDR4 2133/2400/2666 260pin,2 SO-DIMM Slots, support up to 32GB
Storage: M.2 Nvme 22x80 SSD + M.2 NGFF/Nvme 22x80 SSD + 2.5" HD

WiFi: 2.4G/5G Wifi B/G/N/AC+bluetooth
Operating System: Windows or Linux
Network Card: 10/100/1000 BaseT LAN
I/O Port: 1xHDMI + 1 Mini DP + 1 Gigabit LAN port + 4 USB 3.0 + 1 Type-C USB port + 1 Audio Port
Power in: 5.5mm plug in
HDMI Output: HDMI Support 4K 24HZ
DP Output: DP Support 4K 60HZ
VGA Output: VGA Support 1080P
Power: In: DC100-240V AC/50-60Hz Out: DC 19V/3.42A, 90W
Operating Temperature: 0°C~80°C (32°F~140°F)
Storage Temperature: -20°C~80°C (-68°F~176°F)
Relative Humidity: 10%~90% (non-condensing)
Thermal Design: Low Noise Fan
Size: 197 x 197 x 40mm
Weight: 1.5kgs
Package Content: Partaker B18 Mini PC, Power adapter, Power Cable, VESA Bracket, Screws

@pfaffman once these are swapped out I’ll mail you my old boxes, which are still trucking along just fine.

(Don Turrentine) #28

I hope you left the stubby WiFi antennae on, they are so cute! :heart_eyes:


(Jeff Atwood) #29

LOL no, I strip those out. These are going in a datacenter where WiFi would only be a negative.

(Jeff Atwood) #30

I got these boxes and they are nifty designs, a great step up in every way. Burning in overnight now.

This is a really nice, extremely compact layout! One thing that confused me is that the DIMMs are split: one goes on top half, the other goes on the bottom half.



You may notice in the second pic in each series, I removed stuff as well as installing memory and SSD. I pulled out the WiFi module and antennas, as well as the SATA bottom mount, since I don’t need it. It looks like there’s another full length NVMe port on the top side as well, which is amazing!

CPU benchmarks

:warning: this newer version of sysbench produces different numbers; it is not comparable to old versions!

sysbench cpu --cpu-max-prime=20000 run

DO droplet 2988
2017 scooter 4800
2019 scooter 5671

sysbench cpu --cpu-max-prime=40000 --num-threads=8 run

DO droplet 2200
2017 scooter 5588
2019 scooter 14604

Disk benchmarks

ioping -RD -w 10 .

               iops    MiB/sec
DO droplet     13.7k   53.4
2017 scooter   13.6k   53.2
2019 scooter   14.9k   58.0

dd bs=1M count=512 if=/dev/zero of=test conv=fdatasync
hdparm -Tt /dev/sda

               sequential    cached reads    buffered reads
DO droplet     701 MB/sec    8818 MB/sec     471 MB/sec
2017 scooter   444 MB/sec    12564 MB/sec    505 MB/sec
2019 scooter   1.2 GB/sec    17919 MB/sec    3115 MB/sec

Discourse rebuild times

time ./launcher rebuild app

               real    user    sys
DO droplet     6:59    1.4s    0.89s
2017 scooter   3:41    1.3s    0.85s
2019 scooter   3:24    1.7s    1.2s

Thanks for the tip on the timing command @pfaffman!

(AstonJ) #31

This looks like a fun topic!

Are these any good?

ioping -RD -w 10 .

474.9 MiB read, 12.6 k iops, 49.3 MiB/s

dd bs=1M count=512 if=/dev/zero of=test conv=fdatasync

537 MB copied, 9.0306 s, 59.5 MB/s

sysbench cpu --cpu-max-prime=20000 run

General statistics:
    total time:                          10.0015s
    total number of events:              5289

sysbench cpu --cpu-max-prime=40000 --num-threads=8 run

General statistics:
    total time:                          10.0044s
    total number of events:              13978

time ./launcher rebuild app

real 3m37.140s
user 0m2.268s
sys 0m0.740s
(Jeff Atwood) #32

No need to post a bunch of details; anything in the ballpark of the mini-PC numbers shown here is very good. As you can see from my 2019 update, you get to a point where even sizable increases in disk and CPU perf produce diminishing returns.
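The diminishing returns are easy to see when the CPU gain and the rebuild gain are put side by side. A rough Python sketch, using the 8-thread sysbench figures and the rebuild times from the tables earlier in this topic. It assumes the sysbench figures are event counts (higher is better), and the variable names are only illustrative.

```python
# Figures copied from the benchmark tables earlier in the topic.
# Assumption: the sysbench numbers are event counts, so higher is better.
sysbench_events_8_threads = {"2017 scooter": 5588, "2019 scooter": 14604}
rebuild_times = {"2017 scooter": "3:41", "2019 scooter": "3:24"}


def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string such as '3:41' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# How many times more work the newer box got through in the CPU benchmark.
cpu_gain = (
    sysbench_events_8_threads["2019 scooter"]
    / sysbench_events_8_threads["2017 scooter"]
)

# How many times faster the rebuild finished on the newer box.
rebuild_gain = to_seconds(rebuild_times["2017 scooter"]) / to_seconds(
    rebuild_times["2019 scooter"]
)

print(f"multi-threaded CPU benchmark: {cpu_gain:.2f}x")
print(f"Discourse rebuild:            {rebuild_gain:.2f}x")
```

Roughly 2.6x the multi-threaded benchmark score buys only about an 8% faster rebuild on these numbers.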

(Jeff Atwood) #33

Checking CPU scaling of the new box with i7z and stress. Reminder this is the i7-8750H with 6 cores, 12 threads, 45W TDP.

Here’s one task hogging CPU with stress --cpu 1; we definitely see 4100 MHz as promised:

Starting with ~10w power consumption with i7z running but nothing else. Let’s exercise us some cores!

                  typical clock   watts
stress --cpu 1    4.1 GHz         30w
stress --cpu 2    4.1 GHz         42w
stress --cpu 3    4.0 GHz         53w
stress --cpu 4    3.9 GHz         65w*
stress --cpu 5    3.7 GHz         65w*
stress --cpu 6    3.5 GHz         65w*
stress --cpu 12   3.3 GHz         65w*

* It is likely more; you can really see the CPU speed limiters kick in, because the watt meter peaks at 75w and then quickly dials back down to exactly 65w.

Running current-ish versions of mprime jacks this up to 75w though, and overall clock scales down to 3.1 GHz … those AVX2 extensions are :fire:

Power consumption seems to be about 8w totally idle at the Ubuntu login prompt. Not bad at all!
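Reading the stress table another way, here is a small Python sketch (the names are illustrative, the numbers are the readings above) that shows how many extra watts each additional stress worker costs before the limiter holds things at 65w.

```python
# (stress workers, typical clock in GHz, watts on the meter) from the table above.
readings = [
    (1, 4.1, 30),
    (2, 4.1, 42),
    (3, 4.0, 53),
    (4, 3.9, 65),
    (5, 3.7, 65),
    (6, 3.5, 65),
    (12, 3.3, 65),
]
POWER_CAP_WATTS = 65  # sustained limit observed on the watt meter

# Extra watts drawn when moving from one row of the table to the next.
extra_watts = [
    (nxt[0], nxt[2] - cur[2]) for cur, nxt in zip(readings, readings[1:])
]

# Worker counts at which the box is already sitting at the power cap.
capped = [workers for workers, _clock, watts in readings if watts >= POWER_CAP_WATTS]

for workers, delta in extra_watts:
    print(f"--cpu {workers}: +{delta}w over the previous row")
print("at the cap for --cpu", capped)
```

Each extra loaded core costs about 11 to 12w until the fourth, after which the wattage stays flat and the clock speed drops instead.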

(Jeff Atwood) #34

Finally got this deployed so is live and running on the above hardware:

(Rafael dos Santos Silva) #35

How are the usual timings on the mini profiler for the latest and a topic page on the new cpu?

(Jeff Atwood) #36

Not exactly apples to apples since I did not measure for the old box… but comparing the old numbers from 2017…

topic back button, 113ms avg (prev 206ms)
topic refresh, 179ms avg (prev 127ms)
latest refresh, 131ms avg (prev 140ms)

This is also unfortunately comparing 2017 Discourse with 2019 Discourse… it does look like we regressed a bit on topic refresh perf since then maybe?
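For a sense of scale, the three timings above can be turned into percentage changes. A minimal Python sketch using only the numbers quoted in this post; the names are illustrative.

```python
# (previous ms, current ms) as quoted above.
timings_ms = {
    "topic back button": (206, 113),
    "topic refresh": (127, 179),
    "latest refresh": (140, 131),
}


def percent_change(prev: float, now: float) -> float:
    """Negative means faster than before, positive means slower."""
    return (now - prev) / prev * 100


changes = {page: percent_change(prev, now) for page, (prev, now) in timings_ms.items()}

for page, change in changes.items():
    print(f"{page}: {change:+.1f}%")
```

That works out to roughly 45% faster on the back button, 6% faster on latest, and about 41% slower on topic refresh.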

(Jeff Atwood) #37

@pmusaraj improved our build yesterday to skip unused locale compression; here’s what I get with

time ./launcher rebuild app

now on the very same machine:

               real    user    sys
2019 scooter   2:40    0.2s    0.1s

Amazing! :raised_hands::raised_hands::raised_hands::raised_hands::raised_hands::raised_hands::raised_hands::raised_hands::raised_hands::raised_hands: that is a 25% improvement!

(I rebuilt twice, because the timing of the first run is affected by the image download time which isn’t technically rebuild time.)
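The exact percentage depends on how the improvement is measured. A quick Python sketch using the two rebuild times from this topic (3:24 before, 2:40 after); which of the two measures was intended is not stated, so both are shown, and the names are illustrative.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string such as '3:24' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


before = to_seconds("3:24")  # rebuild time before the locale change
after = to_seconds("2:40")   # rebuild time after it, same machine

# Share of the original wall-clock time that was saved.
time_saved_pct = (before - after) / before * 100

# How much faster the rebuild runs, relative to the new time.
speedup_pct = (before - after) / after * 100

print(f"time saved: {time_saved_pct:.1f}%")
print(f"speedup:    {speedup_pct:.1f}%")
```

That is about 22% less wall-clock time, or about 27% faster, so 25% is a fair round figure either way.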
