You mention IPMI. How does that compare to a technology like Intel’s vPro (specifically AMT)? I’ve got experience with vPro and it was a fascinating technology, but I’ve never heard of IPMI.
We’ve been researching things like that to provide a lights-out server setup where the nodes are shut down when not in use, and we were going to use a Raspberry Pi to provide the VPN access. That TomatoUSB looks amazingly useful.
vPro includes a hardware KVM in a similar fashion, and even supports remote viewing of the BIOS (there’s a YouTube video demonstrating it).
Jeff, other than the cost of the actual servers, what are the costs involved with setting up the actual rack? You have a router, then your servers and db servers, but what do you need to connect them all together: switches, power supplies, etc.?
Gigabit Ethernet switches, Cat 6 cables, and a 1U power strip are all relatively inexpensive. I do recommend racking two switches with one as a hot spare, because if your switch dies, you are in big trouble!
Say a switch goes down: realistically, how long would it take you guys to drive down and fix it?
Also, from your experience running SO and SE, how often did you need physical access to the servers to fix a failed drive, etc.?
I’m just trying to compare this against something like EC2 or a managed dedicated box. Obviously you get all the benefits of simply buying a powerhouse server for $1.5-2.5K instead of paying $50/month for 1GB of RAM, but there is a real downside: when something does go wrong, you have to drive down and diagnose the issue yourself.
The datacenter, he.net, does offer remote hands for $100/hour. So if it were very urgent I would call them, and they would disconnect the network cables and reconnect them to the hot spare secondary switch in the same order. Pretty easy, since our live and hot spare switches are the exact same model, stacked right on top of each other.
If I had to drive down, it’s about an hour to get there (Berkeley to San Jose).
The main things that fail are hard drives and power supplies. Failure of new, burned-in server hardware is not that common… I never saw any failures at all on the ~10 servers we built in the 3 years after we deployed server hardware for Stack Exchange.
However, in my experience, while you are getting the servers initially set up and configured, you will need physical access a LOT in the beginning. Not because things are failing, but because you always forget something in the configuration. After racking the servers, plan on visiting the datacenter about once a week for the first few weeks. Once that is over, you’ll barely ever go back.
(And IPMI, a.k.a. remote KVM-over-IP, works amazingly well: you can reboot and edit the BIOS over the internet. As long as the server has power, it can be managed via IPMI, which is basically a dedicated little ARM computer with its own network interface inside the server.)
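For example, with the standard ipmitool CLI you can manage a box from anywhere; a rough sketch (the BMC address and credentials below are placeholders, not our actual setup):

    ipmitool -I lanplus -H 10.0.0.50 -U admin -P secret chassis power status   # is it on?
    ipmitool -I lanplus -H 10.0.0.50 -U admin -P secret chassis power cycle    # hard reboot
    ipmitool -I lanplus -H 10.0.0.50 -U admin -P secret sol activate           # serial-over-LAN console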
Do you know of any good write-ups/tutorials where people outline exactly how they set up their co-location rack? A detailed account of exactly what they bought, plus tips and tricks, etc.?
@codinghorror I’m curious about your decision to configure your HAProxy servers, Tie Fighter 10 and 11, in a single chassis sharing one power supply. I understand having two HAProxy instances would allow for high-availability, but what about a scenario in which the power supply fails in that chassis? That seems to imply both servers will go down, and in your own words, “nothing will be accessible.” In choosing the Iris 1125, is downtime caused by PSU failure something you decided was acceptable? Or am I missing something from your configuration that makes this a non-issue?
We saw 20% to 40% performance loss running Discourse benchmarks under Xen and KVM on multiple servers. We tried and tried, and could not do better than that.
So, maybe this is obvious, but did you make sure that the guest CPU configuration (in KVM) is the same as the host CPU configuration? This isn’t the default because it reduces portability (that is, it breaks live migration between different CPU types), but leaving it generic can indeed cut performance by the percentages you’re talking about.
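In libvirt terms that means passing the host CPU straight through; a sketch, assuming a libvirt-managed guest (the domain name “discourse-app” is just an example):

    # edit the domain XML and set the CPU mode:
    virsh edit discourse-app
    #   <cpu mode='host-passthrough'/>
    # or, with plain QEMU/KVM on the command line:
    qemu-system-x86_64 -enable-kvm -cpu host ...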
We have a spare PSU on-site in the cage (we actually have a few spare PSUs and SSDs in the cage, as mentioned in the article). So the time it would take me to drive down there and install it is acceptable versus the likelihood of PSU failure.
How are those Samsung SSDs doing? Have any burnt out yet? I used Jeff’s server blueprints for building a couple of db servers (SQL Server), but with the 840 Pro disks. The performance was pretty damn good; I’m just wondering how long they’ll hold out.
This depends entirely on the I/O rate on the disks, which depends entirely on what you’re doing on that server. For “typical” server use, barring any random unlucky failures, I think it’s safe to expect ~3 years before I’d even remotely be worried.
However, it is a very good idea to get SSDs much larger than what you need, so the drive has lots of spare space to remap worn-out cells. I would never, ever run a server with a 128GB drive that is always near capacity, for example. (Drives do reserve some space internally that you can’t use, but the more reserved space a drive has, the more “enterprisey” it is, because it is more tolerant of the most common SSD failure mode: worn-out cells.)
Probably pretty good. Unfortunately they don’t support the standard SMART wear indicator, but I can get a fairly generic “sense” of how they’re doing, as they expose a different attribute:
server ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
live db 177 Wear_Leveling_Count 0x0013 092 092 000 Pre-fail Always - 286
live db 177 Wear_Leveling_Count 0x0013 085 085 000 Pre-fail Always - 535
back db 177 Wear_Leveling_Count 0x0013 093 093 000 Pre-fail Always - 247
back db 177 Wear_Leveling_Count 0x0013 084 084 000 Pre-fail Always - 550
webonly 177 Wear_Leveling_Count 0x0013 099 099 000 Pre-fail Always - 12
webonly 177 Wear_Leveling_Count 0x0013 099 099 000 Pre-fail Always - 14
webonly 177 Wear_Leveling_Count 0x0013 099 099 000 Pre-fail Always - 11
webonly 177 Wear_Leveling_Count 0x0013 099 099 000 Pre-fail Always - 15
So the live and replica database servers have more wear on them. No surprise there. I should graph this.
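Something like this quick-and-dirty loop would get the raw values into a graphable log (the device names here are just placeholders for our drives):

    for d in /dev/sda /dev/sdb; do
      printf '%s %s %s\n' "$(date +%s)" "$d" \
        "$(smartctl -A "$d" | awk '$2 == "Wear_Leveling_Count" {print $NF}')"
    done >> wear-leveling.log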
There’s another excellent reason for this, and that’s performance. For another customer, I was evaluating 128GB and 256GB “value” drives (i.e. not over-provisioned like the enterprise drives) as replacements for 50GB SSDs that had reached end of life.
The over-provisioned 50GB SSDs gave you VERY consistent performance on a workload; you knew you were getting the IOPS and latency you needed.
The “value” drives, on the other hand, let you use all of that space, but you have to manually enforce over-provisioning if you want to avoid high write-completion latency and maintain high IOPS.
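If it helps anyone: the simplest way I know to manually over-provision a value drive is to secure-erase it and then leave a chunk of it unpartitioned, so the controller can use that space as spare area. A rough sketch (the device and the 20% figure are placeholders, not a recommendation for any particular drive):

    parted /dev/sdb mklabel gpt
    parted /dev/sdb mkpart primary 1MiB 80%    # leave ~20% of the drive unpartitioned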
Why aren’t you using NIC teaming on all the servers to take advantage of the fact that you have two switches? You could just set it up as active/passive if you insist on only one switch being active at a time.
A single switch failure would then mean zero seconds of downtime, instead of driving down there or renting remote hands at the colo.
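On Linux this is just a standard active-backup bond; a minimal sketch with iproute2 (the interface names and address are placeholders):

    ip link add bond0 type bond mode active-backup miimon 100
    ip link set eth0 down && ip link set eth0 master bond0
    ip link set eth1 down && ip link set eth1 master bond0
    ip addr add 10.0.0.10/24 dev bond0
    ip link set bond0 up
    # then plug eth0 into one switch and eth1 into the other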