As the person in charge of all the infrastructure here at Civilized Discourse Construction Kit, Inc., for running things “at massive scale”, and as someone who’s been doing ops for a long, long time, at a lot of places, let me add my two cents.
First up, I’d like to disagree with Jeff’s third point just a tiny bit: personally, and speaking entirely and only for myself, I think it would be inconsequential to CDCK’s business if there were a one-click “massive scale” Discourse installer. My belief is that the people who are willing to pay us, CDCK, for hosting, to support the development of the forum software and to get direct and high-priority access to the minds who know Discourse best, are almost entirely disjoint from the people who are absolutely committed to doing it themselves.
However, it simply isn’t possible to provide a one-click, massive-scale install option that will satisfy more than a tiny percentage of the user population. IT IS IMPOSSIBLE. That may seem like a bold claim, and even a self-serving one, but it’s the inexorable outcome of how the ops world currently works: everyone has their own preferences.
Take even the choice of “orchestration” layer. You suggest using Docker Compose. OK, but what about all those people out there who think Compose is a steaming pile, and wouldn’t touch it with a barge pole? They’ve got their own preferences as to orchestration, and they’d be as dissatisfied with a one-click massive-scale installer as you are with the current state of affairs.
Then there are the other components in the core hosting infrastructure, like Redis and PostgreSQL. There is no shortage of Discourse users who run it on AWS, whose preference is to use ElastiCache to provide Redis, and RDS for PostgreSQL. So our one-click, massive-scale installer would need to be able to account for that, in order to support people running on AWS. But we can’t just detect “oh, you’re running in AWS” and assume we should use ElastiCache and RDS, because some people still prefer to run their own Redis and PostgreSQL. The same applies to “private cloud” deployments – some people have existing Redis/PostgreSQL tooling they’d prefer to use, while others would want us to set the data storage up for them.
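To make the problem concrete, here’s a sketch (in Python, purely hypothetical – this is not real Discourse installer code, and the environment labels are invented) of what even the Redis/PostgreSQL decision alone would look like. Note that one of the branches can’t be decided automatically at all:

```python
# Hypothetical sketch of backend selection for a "one-click" installer.
# The environment names ("aws", "private-cloud", "bare-metal") and the
# managed_ok flag are assumptions for illustration, not real installer inputs.

def choose_backends(environment: str, managed_ok: bool) -> dict:
    """Pick where Redis and PostgreSQL should come from."""
    if environment == "aws" and managed_ok:
        # The "obvious" AWS answer: managed services.
        return {"redis": "elasticache", "postgres": "rds"}
    if environment == "aws":
        # ...except some AWS users still prefer running their own data stores,
        # so detecting "you're on AWS" is not enough.
        return {"redis": "self-managed", "postgres": "self-managed"}
    if environment == "private-cloud":
        # Existing in-house tooling vs. installer-provisioned storage:
        # nothing detectable distinguishes these, so the installer has to ask.
        return {"redis": "ask-the-operator", "postgres": "ask-the-operator"}
    # Bare metal: provision everything ourselves.
    return {"redis": "self-managed", "postgres": "self-managed"}
```

Every branch that returns “ask-the-operator” is a question, and every question is one more click on the way to “one-click”.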
Of course, massive scale is not valuable without monitoring, and there’s a myriad of options there, all of which we’d need to support, or again we’d be alienating a huge percentage of the potential userbase. Whatever we choose, there’d be a pitchfork-wielding band at the gates of the CDCK compound demanding we support their preferred monitoring system.
In theory, we could defer supporting a myriad of different systems to the open source community, but as you yourself note:
So we can’t really rely on the community to contribute large-scale engineering efforts like that… and we’re back to pitchforks at the gate.
But let’s say, for the sake of argument, that somehow we manage to code up support for everyone’s preferences into one neat little (ha!) package. Would it be a “one-click” massive-scale installer any more? Hell no! It’d be a maze of questions, somewhat like what building your own Linux kernel looks like. The damned thing would be nigh-impossible for anyone who isn’t deeply involved in the development of the system to navigate without blowing their highly-available foot off.
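The arithmetic behind that maze is simple multiplication. The option counts below are invented for illustration, but even modest, plausible numbers per layer explode fast:

```python
# Back-of-the-envelope sketch: the specific lists are made up, but the
# point is the multiplication. A handful of preferences per layer means
# hundreds of distinct configurations a "one-click" installer must support.

orchestrators = ["compose", "kubernetes", "nomad", "hand-rolled"]      # 4
redis_options = ["self-managed", "elasticache", "existing-tooling"]    # 3
postgres_options = ["self-managed", "rds", "existing-tooling"]         # 3
monitoring = ["prometheus", "nagios", "datadog", "zabbix", "other"]    # 5

combinations = (len(orchestrators) * len(redis_options)
                * len(postgres_options) * len(monitoring))
print(combinations)  # 4 * 3 * 3 * 5 = 180
```

And that’s before you get to load balancers, TLS termination, backups, or log shipping, each of which multiplies the total again.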
All this highlights the stark reality of ops: running at scale is not easy. You need smart people who know all the aspects of the system they’re working in. There’s no way around that, and there’s no “one-click” installer that is ever going to solve that problem. Sure, aspects of the problem will be solved over time – Docker is about as close to a pervasive solution to the “shoehorn a single software program into its own environment” problem as we’ve got, to the chagrin of some – and other parts will no doubt be standardised, but no matter what, running at the front of the pack will always involve a lot of skull-sweat and custom work.