Can Discourse ship frequent Docker images that do not need to be bootstrapped?


(Redundancy) #18

I figure I should try to explain my perspective, because this topic is basically at the root of why I’m very concerned about deploying Discourse in production and committing sysadmins to keeping it highly available.

I think what worries me most is that there are assumptions, expectations, tooling, best practices and principles that generally apply to how people work with Docker images, and you manage to break almost all of them. The system you have currently is surprising, which is rarely a good thing, and it means I can’t assume that anything you’re doing follows convention.

  • For a highly available system I need to set up Redis, Postgres and the web app in a way where they can be managed, monitored and preferably clustered.
  • For security, I really want to know how to make sure that the system is security patched regularly, locked down and only has the minimum required stuff installed. I need access to application logs, and I need to know what’s in the image so I can subscribe to the relevant security advisories.
  • For quality, I want to know that dependencies like Redis which have slightly different behaviour when running multi-node are developed & tested in that configuration.

We have sysadmins, and to some degree the wrapping you have gets in the way of them doing their jobs effectively, because they have to learn something new that has nothing to do with those systems.

From my perspective, nobody should care if you have multiple Docker images, one per component of your service (this is not unusual or surprising), but we should care if you package everything into a single container, because that breaks things and behaves differently from the setup you recommend: the configuration of each service component can affect the others. The fact that you say a multi-container setup is what people should be using points to this being how it should be developed too, and if you should be using an HA Redis cluster in production, it’s great to be able to use the same setup in development.

Your “philosophy agnostic” design actually breaks the elements of the Docker philosophy that allow you to cluster applications in Docker Swarm, Tectonic, Kubernetes, Amazon ECS, CoreOS… I just can’t buy the idea that you’re being philosophy agnostic when you’re breaking compatibility and standards.

I believe that your Ruby dependency leaks through to the host machine via “launcher”. On the lightweight container OSes this means that you can’t use Discourse easily (CoreOS doesn’t really support installing extra software, because it has no package manager).

When you say that launcher performs tests, I would hope that you could do that by layering them on top of Compose (compose the environment, plus a test container), or by providing another image that exists only to run those tests. Launcher could morph (in part) into a container that takes template files and produces config files (such as nginx config or shell script variables). You can make containers that exist only to run acceptance / integration tests, and a container can be as small as 10MB using golang.
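To illustrate the first idea, a test run layered on top of Compose might look something like this (v1 compose syntax; the service name and test image are hypothetical):

```yaml
# Sketch: a one-shot "tests" container that runs acceptance tests against the
# already-composed environment and then exits.
tests:
  image: example/discourse-acceptance-tests   # hypothetical test-only image
  links:
    - web
  environment:
    TARGET_URL: http://web:3000
```

You would bring the environment up with `docker-compose up -d` and then `docker-compose run tests`.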

Assuming that I understand what you pointed out about using a Dockerfile, you noted that it creates a lot of layers. That is true if you use a lot of RUN lines, but if you bundle the smaller steps into a single shell script and execute that, you get a single layer. This method lacks the per-line hashing that determines when you’ve changed something, but I don’t think you’re really benefitting from that anyway. You could version your script by adding its content hash to the filename, so that each change to it forces an automatic re-run of the transformation steps. See: How to Optimize Your Dockerfile | Tutum Blog
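A minimal sketch of what I mean (the script name and its hash suffix are illustrative):

```dockerfile
# One RUN layer instead of dozens: all the small setup steps live inside the
# script. Embedding the script's content hash in its filename means any change
# to it busts the build cache and forces these steps to re-run.
COPY setup-3f9ab2.sh /tmp/setup.sh
RUN sh /tmp/setup.sh && rm /tmp/setup.sh
```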

As I understand Docker Compose, service startup ordering is based on links and volumes, but it can also be done by asking Compose to start services in the required order explicitly. I believe that what you need there comes down to a small batch script that orders the compose calls.
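Roughly like this (v1 compose syntax, image names illustrative): the links imply the startup order, and a one-line wrapper can make it explicit:

```yaml
# Sketch: "web" links to postgres and redis, so Compose starts the data
# stores before the app container.
web:
  image: example/discourse-web    # hypothetical single-process image
  links:
    - postgres
    - redis
postgres:
  image: postgres:9.5
redis:
  image: redis:3.0
```

If explicit ordering is ever needed: `docker-compose up -d postgres redis && docker-compose up -d web`.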

It’s not surprising (to me at least) that the upgrade process for Postgres does not run as part of the standard images. For one, I would be amazed if you couldn’t use the standard docker mechanisms to override the default CMD (Docker run reference | Docker Documentation), and secondly the postgres images provide documentation on how to extend them:

How to extend this image
If you would like to do additional initialization in an image derived from this one, add one or more *.sql or *.sh scripts under /docker-entrypoint-initdb.d (creating the directory if necessary). After the entrypoint calls initdb to create the default postgres user and database, it will run any *.sql files and source any *.sh scripts found in that directory to do further initialization before starting the service.
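Following that documentation, the extension is a two-line sketch (the script name is illustrative):

```dockerfile
# Derive from the official image; anything dropped into
# /docker-entrypoint-initdb.d runs after initdb creates the default
# user and database.
FROM postgres:9.5
COPY create-discourse-db.sh /docker-entrypoint-initdb.d/
```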

In a clustered high availability setup I would expect a DBA to be managing the database, or in the case of RDS, this might be handled by AWS themselves. A lot of work is typically expended on 0-downtime database migration, and the ability to run multiple versions of applications side by side and rollback, particularly around schema changes and data migration. To me, baking this process in to the bootstrap feels like it makes a lot of assumptions about the environment.

By not using a single-process entrypoint as the way you run your container, you miss out on other things, like the container stopping when the program crashes, and the Docker logging driver. That makes your application difficult to integrate with centralized logging if we want to, and more complex to monitor.
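With a single-process container, centralized logging becomes a runtime flag rather than something baked into the image; for example (the syslog endpoint is a placeholder):

```sh
# Sketch: ship the container's stdout/stderr to a central syslog endpoint via
# Docker's logging driver, instead of writing log files inside the container.
docker run -d \
  --log-driver=syslog \
  --log-opt syslog-address=udp://logs.example.com:514 \
  example/discourse-web
```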

While this is really quite personal as a view, I tend to think of any batch script that’s longer than 10 or so lines (and has logic in it) to be a sign of something bad - at that point you should use a scripting language. A 650 line bash script is practically a harbinger of the apocalypse.

For configuration management, it feels entirely wrong to be discussing Chef and Puppet (that’s just in this thread though), since these are generally speaking tools for reaching in and changing VMs. There is a reason that CoreOS and others are using etcd / consul, and it’s largely that for high availability clusters, services may have to subscribe to and change config on the fly. Both also support distributed locks.
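For a flavour of what that looks like (etcd v2 CLI; the keys are illustrative):

```sh
# Sketch: publish a backend's address into etcd, and have the frontend watch
# the prefix so it can regenerate its config and reload on the fly.
etcdctl set /discourse/upstreams/web-1 10.0.0.11:3000
etcdctl watch --recursive /discourse/upstreams
```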

A number of your templates are actually interactions with the nginx configuration. The normal way of running ruby in a container (iirc) is not to use nginx (which you would on a VM to get a decent level of parallelism) but to just use more containers. Pulling Nginx out seems like a good idea, especially since there’s a template for SSL configuration in there - if it’s your front facing layer it’s even more important for sysadmins to be able to configure it and ensure it’s patched separately and preferably network segmented away from direct DB access. We may be load balancing servers anyway using ELBs, HAProxy or something else, and while it’s nice to have a starter nginx config generated, I want my sysadmins to be able to manage it without learning your templating system or re-bootstrapping the application.

Decomposing the service components may cause a lot of the templating to shrink, because you’re not needing to mess around with a significant amount of configuration on a single machine, and a lot of these things can be handled by much smaller modifications and extensions of standard images (thus removing a lot of the installation, and clearing up which things are requirements for what). The overview of your application becomes much simpler, as the service architecture becomes clearer.

Note that none of this should preclude providing a simple “out of the box” configuration for developers, small installations and trials, but it becomes easier for sysadmins to manage as needed. (See: How to use Docker Compose to run complex multi container apps on your Raspberry Pi · Docker Pirates ARMed with explosive stuff). Fixing your bootstrap to just produce an image totally answers “how do I make sure the same thing is running on multiple machines?”, because that’s exactly what having a Docker image is supposed to give you, relying on version tags or a local private registry.

I hope that I’ve done a semi-reasonable job of explaining why I find some of the choices concerning, and why they potentially make it more difficult than it needs to be for our teams to take responsibility for and own a deployment of Discourse. I really want to like it, because it means we don’t have to build or fork a forum solution to make it fit, but I also have to consider the cost of operation and SLAs we can give.


(Jeff Atwood) #19

Great feedback!

To be brutally honest, we don’t optimize for this in our open source facing installs because:

  1. The vast, overwhelming majority of open source Discourse installs are small or medium size and easily covered by the all-in-one container solution.

  2. Even a larger install is comfortably covered by a fairly simple isolated web / data container setup, so you can update / restart without interrupting service.

  3. Our hosting business involves running Discourse at massive scale; while we are obviously in no way opposed to others running Discourse hosting businesses themselves, or otherwise hosting Discourse at massive scale, it does not exactly align with our business interests to make it one-button easy for someone to scale Discourse to a billion active users on a single site.

So there’s very little incentive for us to improve case #3, massive scale, because a) few need it and b) it’s a major source of income and survival for our business in the form of our enterprise hosting plan, where we are literally selling deep knowledge of Discourse as we wrote it in the first place.

That said, if you’d like to contribute effort towards making the open source story for case #3 better, we’re completely open to that.


(Redundancy) #20

An appropriate support contract, priced competitively against running multiple enterprise-level sites, could be an option that creates a business incentive for you (beyond that, I think it might make your own business operations more efficient). There are reasons, like branding, that companies might want to do that but be put off by the potential cost of multiple hosted sites, and it’s a tried and true on-prem business model.

In terms of contributing, to be brutally honest, it would have to be a choice based on business priority and the cost of developer time ;). Getting familiar enough with the codebase to be able to make good choices and know their consequences (thereby having a good chance of the work being pulled back to mainline), rather than giving general advice, would require an investment.


(Jeff Atwood) #21

Sure, I’ve always estimated that engineers cost $10,000 per person per month – but as you can read here some people say it should be higher:

A rule of thumb is that an engineer (the most common early employee for Silicon Valley startups) costs all-in about $15k per month

  • If it takes two engineers four months to build out a big Discourse installation, that means it costs the company between $80,000 and $120,000 just to get Discourse deployed and configured and stable at high volume.

  • If we assume that, after the initial setup period, ongoing maintenance is perhaps 3 days of engineering time per month for those same two engineers, that’s six engineer-days a month, which at equivalent hourly rates is between $3,000 and $4,500 per month ongoing. Maybe a bit less, but certainly that’s a reasonable estimate.

This makes our enterprise hosting plan – where we guarantee high availability, ultra-fast speeds (with CDN), and super redundancy for extreme scale – seem like quite a bargain at $1.5k/month.

(We also have an emerging enterprise VIP tier for clients that need more ongoing custom development work.)


(Pierreozoux) #22

OK, let’s come back down to earth :slightly_smiling:
Based on this, I did this docker-compose.

This should be easier to scale than the original one, and it follows Docker principles (although it could be better engineered).

It is just a proof of concept for now, but as an IndieHoster, I’m selling the hosting service, and I’m committed to maintaining this.

Let me know if you need any help to get started with that!

All the best!

Pierre


(Matt Palmer) #23

As the person who is in charge of all the infrastructure here at Civilized Discourse Construction Kit, Inc, for running things “at massive scale”, and someone who’s been doing ops for a long, long time, at a lot of places, let me add my two cents.

First up, I’d like to disagree with Jeff’s third point just a tiny bit: personally, and speaking entirely and only for myself, I think it would be inconsequential to CDCK’s business if there were a one-click “massive scale” Discourse installer. My belief is that the people who are willing to pay us, CDCK, for hosting, to support the development of the forum software and to get direct and high-priority access to the minds who know Discourse best, are almost entirely disjoint from the people who are absolutely committed to doing it themselves.

However, it simply isn’t possible to provide a one-click, massive-scale install option that will satisfy more than a tiny percentage of the user population. IT IS IMPOSSIBLE. That may seem like a bold claim, and even a self-serving one, but it’s the inexorable outcome of how the ops world currently works: everyone has their own preferences.

Take even the choice of “orchestration” layer. You suggest using Docker Compose. OK, but what about all those people out there who think Compose is a steaming pile, and wouldn’t touch it with a barge pole? They’ve got their own preferences as to orchestration, and they’d be as dissatisfied with a one-click massive-scale installer as you are with the current state of affairs.

Then there are the other components in the core hosting infrastructure, like Redis and PostgreSQL. There is no shortage of Discourse users who run it on AWS, whose preference is to use Elasticache to provide Redis, and RDS for PostgreSQL. So our one-click, massive-scale installer would need to be able to account for that, in order to support people running on AWS. But we can’t just detect “oh, you’re running in AWS” and assume we should use Elasticache and RDS, because some people still prefer to run their own Redis and PostgreSQL. The same applies to “private cloud” deployments – some people have existing Redis/PostgreSQL tooling they’d prefer to use, while others would want us to set the data storage up for them.

Of course, massive scale is not valuable without monitoring, and there’s a myriad of options there, all of which we’d need to support, or again we’d be alienating a huge percentage of the potential userbase. Whatever we choose, there’d be a pitchfork-wielding band at the gates of the CDCK compound demanding we support their preferred monitoring system.

In theory, we could defer supporting a myriad of different systems to the open source community, but as you yourself note:

So we can’t really rely on the community to contribute large-scale engineering efforts like that… and we’re back to pitchforks at the gate.

But let’s say, for the sake of argument, that somehow we manage to code up support for everyone’s preferences into one neat little (ha!) package. Would it be a “one-click” massive-scale installer any more? Hell no! It’d be a maze of questions, somewhat like configuring your own Linux kernel build. The damned thing would be nigh-impossible for anyone not deeply involved in the development of the system to navigate without blowing their highly-available foot off.

All this highlights the stark reality of ops: running at scale is not easy. You need smart people who know all the aspects of the system they’re working in. There’s no way around that, and there’s no “one-click” installer that is ever going to solve that problem. Sure, aspects of the problem will be solved over time – Docker is about as close as we have to a pervasive solution to the “shoehorn a single software program into its own environment” problem, to the chagrin of some – and other parts will no doubt be standardised, but no matter what, running at the front of the pack will always involve a lot of skull-sweat and custom work.


(Redundancy) #24

I don’t feel like you read and understood my post. I’m really not asking for a one-click high availability solution.


(Sam Saffron) #25

I am very confused: what exactly are you asking for? Can you provide five one-sentence bullet points?


(Sam Saffron) #26

There is a lot of knowledge in our base image: we install particular versions of ImageMagick, pngcrush etc., and a particular version of Ruby with jemalloc.

When it comes to running Discourse we are particular about using unicorn and forking out sidekiq to conserve memory.

Nothing is stopping you building images based on our base images and then launching them with Compose, if that is how you roll.

But there is a lot of knowledge baked into our process that you are throwing away if you simply put your hands up, say “you are doing it wrong”, and start from scratch.

On the topic of Compose:

  • How are you going to run asset precompilation and db migration?
  • How are you going to allow people to run plugins?
  • How are you going to ensure people don’t have 700 GB of log files that are never rotated?
  • How are you going to provide your users with a 1 click upgrade from a web page?

(Sam Saffron) #27

Nothing is stopping you running Redis, Postgres and web using our base images in a way that can be orchestrated: just bootstrap an image and push it to your Docker repo.

Nothing is stopping you bootstrapping your images, then running all the security patches, and then pushing to your repo.
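That workflow might look roughly like this (the registry URL and tag are placeholders, and the local image name that launcher produces can vary with your container name):

```sh
# Sketch: build the image with launcher, apply security updates in a derived
# image, then push the result to a private registry under a version tag.
./launcher bootstrap app
docker build -t registry.example.com/discourse/app:2016-02-01 - <<'EOF'
FROM local_discourse/app
RUN apt-get update && apt-get -y upgrade && rm -rf /var/lib/apt/lists/*
EOF
docker push registry.example.com/discourse/app:2016-02-01
```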

I am completely unclear on what you are saying here. We accept configuration that tells us where Redis is via an ENV var. You can do what you will.
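For example, in the env block of a container definition (the hostnames are placeholders; the variable names follow the DISCOURSE_* convention the templates use):

```yaml
# Sketch: point Discourse at externally managed Redis and Postgres
# (e.g. Elasticache / RDS) purely via environment variables.
env:
  DISCOURSE_REDIS_HOST: redis.example.internal
  DISCOURSE_DB_HOST: db.example.internal
```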

Nothing leaks through: to run launcher all you need is bash. Once stuff is bootstrapped, you have images that you can use however you like, anywhere.


(Sam Saffron) #28

How exactly is try_files going to work? Now we would need a data volume for our precompiled assets that somehow gets shared between two images.

And a much more practical question: why does somebody who got a $10 install on Digital Ocean care about any of this?

All they want is a simple mechanism for installing Discourse that is secure, robust, will not chew up all their disk space with logs, and offers a 1-click upgrade. This hangup on “thou shalt only use tool XYZ” is not really helping them.


(Jeff Atwood) #29

Might be more practical to focus on these. I know #1 is coming, because Meta is running on PG 9.5 right now, as well as at least one other new customer…

I think refinements to the current process are more interesting and useful than a “start everything over from scratch” mentality.


(Matt Palmer) #30

Or, since this is entirely open source software we’re dealing with here, someone can demonstrate how wrong we are, by repackaging Discourse in a more minimalist container. I’d love to see different ways of packaging complex webapps like Discourse, but so far, despite several people talking about it, I haven’t noticed a single alternate approach get a lot of traction (the fact that multiple people have published separate docker-compose recipes seems to suggest that there isn’t a single “best practice”, universally-applicable approach to using that tool).

As Sam suggested, then, it would be nice if you could provide a summary. I did read your post, and as far as I could tell, it was asking for exactly that sort of a thing, with HA Redis, PostgreSQL, and all the rest “baked in”:

Also, one thing I forgot to mention previously, in specific response to:

Nothing needs to be fixed in the bootstrap process, because that is exactly what ./launcher bootstrap will do for you – in fact, that’s exactly the command we use to build the images that run our internal hosting environments.


(Oleg Bovykin) #31

Hi! Thanks for your great work!
As I see it now, the problem with the current infrastructure is that it breaks Docker rules, like immutable containers and one process per container.

  1. Immutable containers. As I read in this topic, it’s almost impossible to make immutable containers (for the app servers), because it would break a lot of things; let’s skip it, we can’t do anything about that right now.

  2. One process per container. I think it’s very easy to remove the templates from launcher and add a single template where each part of Discourse runs in a separate container (nginx, pg, redis, unicorn, sidekiq and so on), with only one process per container. As I see it, there are no problems with separating things, and you get isolated services to manage. In all respects it’s better than an all-in-one container (or almost-all-in-one). Also, once things are separated, you have the potential to scale: run multiple unicorn containers easily. With the latest version of Docker it’s possible to run a multi-server Docker Swarm and scale well. After this step, simply add service discovery (Consul, Registrator and consul-template) and you have a nice system. What do you think? A sketch of what I mean is below.
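A rough sketch of what such a template could produce (compose v2 syntax; the image names are hypothetical, and asset/upload sharing is glossed over with a single named volume):

```yaml
version: "2"
services:
  nginx:
    image: nginx:1.9
    ports: ["80:80"]
    links: [web]
    volumes: [shared:/shared]
  web:
    image: example/discourse-unicorn   # hypothetical single-process image
    links: [postgres, redis]
    volumes: [shared:/shared]
  sidekiq:
    image: example/discourse-sidekiq   # hypothetical single-process image
    links: [postgres, redis]
  postgres:
    image: postgres:9.5
    volumes: [pgdata:/var/lib/postgresql/data]
  redis:
    image: redis:3.0
volumes:
  shared:
  pgdata:
```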


(Matt Palmer) #32

Can you define that term precisely? The containers built by launcher are certainly “immutable” in the sense that the container image itself doesn’t change during execution (all stateful information is held in external volumes) and more instances of a container can be easily spawned on separate machines. Other than that, I don’t know what you could be referring to.

I think someone should definitely build that. I don’t think anyone at CDCK will be taking the lead on it, though, because the current arrangement seems to work well for the target audience of the all-in-one container: small, standalone sites without any existing infrastructure to integrate with. There are higher priorities for the dev team than the work required to support the much smaller percentage of Discourse users that:

  1. Need a larger-scale setup;
  2. Have a common, well-defined, existing infrastructure to plug into; and
  3. Don’t have the skills and knowledge to do the needful themselves.

As Jeff has already said, community contributions in this area would be welcomed, but so far, despite a number of people saying they’ll do something, not a lot of results have been forthcoming which have met with wide acclaim from others.


(Sam Saffron) #33

PG and Redis run fine in single containers now; we have samples for that.

I am not particularly happy about the prospect of splitting nginx and unicorn, because they are tightly coupled: nginx serves static files directly, so you would need to add one more shared volume between nginx and web, which adds complexity for dubious gain.

The unicorn master forks off sidekiq (and the web workers), which means sidekiq gets to share memory with the parent process. If you split the two services, you waste memory.

Memory is the #1 issue we have on our Digital Ocean installs; we cannot afford to regress here.


(Oleg Bovykin) #34

You are absolutely right. Files in containers don’t change, except on shared volumes. The best way is to serve ready-made containers with precompiled assets and so on. But that’s not possible; please read the full phrase:

[quote=“arrowcircle, post:31, topic:33205”]
As I read in this topic, it’s almost impossible to make immutable containers (for the app servers), because it would break a lot of things; let’s skip it, we can’t do anything about that right now.
[/quote]

I think people don’t want to do anything after very aggressive discussions. It’s very easy to break the motivation of open source contributors. Maybe that’s why only a few pull requests come from outside the dev team.

Nginx already serves files from a shared volume; there is no problem with separating nginx into its own container and attaching the same shared volume. Right now some of the services each run in a separate container, and some are joined. Why not finish the separation process and make one multi-container template?

Wow, nice solution. Is there a situation where you need to scale sidekiq without scaling the unicorns? In that case, how much memory do you spend on sidekiq that isn’t needed? An ugly but simple solution would be an env switch that stops sidekiq from running with unicorn. Unfortunately, I think no one will ever need this.

Do you have a strategy to deal with it?


(Sam Saffron) #35

It sure does: it serves the uploads directory, but it does not serve the public directory where all the minified CSS and JS lives. As a rule we do not keep data that changes every build in volumes, because then we need to worry about garbage collection.


(Sam Saffron) #36

The number of unicorn workers and sidekiq workers can be controlled via env vars.
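For example, in the container’s env block (the values are illustrative):

```yaml
# Sketch: scale the process counts via environment variables.
env:
  UNICORN_WORKERS: 8     # unicorn web workers
  UNICORN_SIDEKIQS: 2    # sidekiq processes forked from the unicorn master
```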


(Oleg Bovykin) #37

As I see it, all the easy steps towards separation are done? Further steps will need a lot of work to create a new build system.