Why does Discourse have to use Docker?

docker

(Maestro Magnifico) #1

I wanted to ask this right at the beginning, but decided to figure out how things work first. So if I understand correctly, you run Ubuntu on the server, Docker runs on that Ubuntu, another Ubuntu runs inside Docker, and Discourse runs on that second Ubuntu. That’s a lot of Ubuntus for me. Why not just install Discourse on the first Ubuntu? Maybe Docker is supposed to make installing Discourse easy, but I don’t see it. I installed Discourse on Debian today with a multisite config to test it out. Both sites show “not found” for some reason, so I can’t use it, but the Discourse service runs fine with 2 processes. And here’s the memory usage to compare:

[Screenshot: memory usage on Ubuntu]

[Screenshot: memory usage on Debian]

So there’s an 838 MB difference. I’m not writing this to be a douche or anything; I want to point out that there are other services out there, like packager.io, that can package software for easy installation without creating “inception” on your server. So why Docker? If I’m missing something and there’s more benefit to Docker than I see, please share.


(Sam Saffron) #2

Memory can be controlled fine in your Docker install; you can elect to run fewer web workers if you wish. No idea what pkgr is doing, but it is super old and unmaintained.

The reason for Docker, in a nutshell, is so we can have extreme levels of control.

We know the exact version of all our image manipulation libraries, nginx, postgres, redis and so on. This means that we can easily upgrade when needed and the same troubleshooting steps apply to everyone.

Docker has been a blessing to our project, and proof of its success is the huge number of Discourse installs that are running the latest version.

We were in the “roll your own deploy” movie for our first year; support was hell, to put it mildly.


(Maestro Magnifico) #3

It’s not the Discourse workers that bother me, it’s the extra Ubuntu running on a server that already has Ubuntu. I think the memory difference is mainly because of that.

As for packager.io: it just installs the software together with all the libraries it needs, step by step.

Got it. Maybe in the future you could add a “developer” install of some sort and not provide any support for it? You could just provide a list of library versions instead. I think that’s all skilled people (people not like me) really need to set up a non-Docker installation and save an extra 800 MB of RAM.


(Sam Saffron) #4

No, that has no impact, only disk space impact. Memory usage is all us and our recommended setup.

In fact, our Docker install could quite easily have a penny-pinching mode where memory usage is minimised and you pay the price with a slightly more jerky experience.
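
One way to check where the memory actually goes is to total resident memory per process name on the host. Processes inside a container are ordinary host processes, so they appear in the host's `ps` output. A minimal sketch, assuming a Linux `ps` that accepts `-eo rss=,comm=`; the function name `sum_rss_by_command` is made up for illustration:

```shell
# Read "RSS_in_KiB command" lines on stdin, sum the RSS per command name and
# print "command total_MiB" (truncated to whole MiB), largest first.
# Note: RSS counts shared pages once per process, so forked workers that
# share memory are overestimated; treat the totals as an upper bound.
sum_rss_by_command() {
  awk '{ rss[$2] += $1 } END { for (c in rss) printf "%s %d\n", c, rss[c] / 1024 }' |
    sort -k2,2nr
}
```

Typical use on the host would be `ps -eo rss=,comm= | sum_rss_by_command | head`, run once with Discourse started and once with it stopped.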


(Michael Downey) #5

Thank you very much for changing, I for one appreciate the difference. :stuck_out_tongue:


(Maestro Magnifico) #6

[quote=“sam, post:4, topic:26041”]
No, that has no impact, only disk space impact.
[/quote]

Wait, what? How does that work? Docker uses disk space instead of RAM? The container’s contents have to run on something.


(Jacob Chapel) #7

Docker isn’t running multiple layers of Ubuntu. It is a container with Ubuntu’s userland software installed, but everything still runs on the host’s kernel. RAM-wise, it uses about the same amount as if you had installed everything directly on the host (there may be slight variations, but nothing substantial). The main cost of Docker is disk space, because it uses a copy-on-write file system: every change to the file system writes the data somewhere new on the disk instead of overwriting the existing content. This is mitigated by flattening and other means.

Needless to say, using Docker is a boon overall for Discourse and the tech community as a whole. You should really read up on it more; it is quite cool and much better than VMs for everything.
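
The claim that a container runs on the host's kernel can be checked directly. A small sketch, assuming a Linux host with the `docker` CLI installed and able to pull the public `ubuntu` image; it skips the container step when Docker is unavailable:

```shell
# Kernel release as seen by the host.
host_kernel=$(uname -r)
echo "host kernel:      $host_kernel"

# Kernel release as seen from inside a throwaway Ubuntu container.
if command -v docker >/dev/null 2>&1; then
  container_kernel=$(docker run --rm ubuntu uname -r 2>/dev/null) ||
    container_kernel="(docker run failed)"
  echo "container kernel: $container_kernel"
else
  echo "docker not found; skipping the container check"
fi
```

On a Linux host both lines should report the same release, since no second kernel is booted. On setups where Docker itself runs inside a helper VM, the container reports that VM's kernel instead.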


(Jeff Atwood) #8

Unlike traditional virtual machines, LXC-style containers (like Docker) share the OS kernel with the host, so they use a fraction of the resources.

http://sp.parallels.com/fileadmin/media/hcap/pcs/documents/ParCloudStorage_Mini_WP_EN_042014.pdf

On the other hand, container virtualization (shown in figure 2), is virtualization at the operating system level, instead of the hardware level. So each of the guest operating systems shares the same kernel, and sometimes parts of the operating system, with the host. This enhanced sharing gives containers a great advantage in that they are leaner and smaller than hypervisor guests, simply because they’re sharing much more of the pieces with the host. It also gives them the huge advantage that the guest kernel is much more efficient about sharing resources between containers, because it sees the containers as simply resources to be managed.

An example: Container 1 and Container 2 open the same file, the host kernel opens the file and puts pages from it into the kernel page cache. These pages are then handed out to Container 1 and Container 2 as they are needed, and if both want to read the same position, they both get the same page. In the case of VM1 and VM2 doing the same thing, the host opens the file (creating pages in the host page cache) but then each of the kernels in VM1 and VM2 does the same thing, meaning if VM1 and VM2 read the same file, there are now three separate pages (one in the page caches of the host, VM1 and VM2 kernels) simply because they cannot share the page in the same way a container can. This advanced sharing of containers means that the density (number of containers or Virtual Machines you can run on the system) is up to three times higher in the container case as with the Hypervisor case.

The downside is that the isolation is not as complete, but performance- and resource-wise LXC containers are up to 3x more efficient than virtual machines.
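
The page-cache example in the quote can be turned into a small calculation. This is only a sketch: it assumes the quoted two-VM case generalises to one cached copy on the host plus one per guest kernel, and the function name `cached_copies` is hypothetical:

```shell
# Number of in-memory copies of one file page when several guests read the
# same position of the same file.
#   container: every container is served from the single host page cache
#   vm:        the host caches the page once, and each guest kernel caches it again
cached_copies() {
  kind=$1
  guests=$2
  case "$kind" in
    container) echo 1 ;;
    vm) echo $((guests + 1)) ;;
    *) echo "unknown kind: $kind" >&2; return 1 ;;
  esac
}

cached_copies container 2   # prints 1
cached_copies vm 2          # prints 3, matching the VM1/VM2 example
```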


(Jacob Chapel) #9

Thanks for a more complete answer. :laughing:


(Jens Maier) #10

To be completely fair, there are a few MB of overhead, since the container runs its own copy of whatever process control mechanism Ubuntu uses these days, and I’m sure the kernel will use a couple of KB to manage its namespaces and cgroups… :wink:


(Tpokorra) #11

I beg to differ :smile:

I understand that you don’t want and don’t need to support it.

I just wanted to say that I have been running my two Discourse instances on the Ubuntu packages from “Packages for pkgr/discourse”, provided by Cyril Rohr, just fine for several months now. Upgrades have been provided within a day, and I am now on the latest stable, Discourse 1.2.0.

Since I run each of my websites inside an LXC container, and I am more used to LXC than Docker (networking, thinking of it as a virtual machine rather than a virtualized application), I am glad I can install Discourse from a deb package on Ubuntu just like any other package.

The Discourse packages are available for Debian 7 and Ubuntu 12.04/14.04, and the installation instructions are helpful and easy to follow, e.g. “Packages for pkgr/discourse”.


(Sam Saffron) #12

Glad to hear it is being maintained. However, what I am unclear about with pkgr is what it’s doing: is it using Unicorn like we do? Is it forking out sidekiq workers like we do? Does it do any Ruby GC tuning like we do? Is it using jemalloc?

Interestingly, our Docker-based setup fits quite nicely in LXC: unlike many Docker things out there, we boot init, so it would port quite cleanly.


(Maestro Magnifico) #13

I used these packages too at the start, but for some reason multisite configuration doesn’t work with them; both sites show “not found”. I even tried replacing all the configs with those from an official Docker install with a working multisite setup, with the same result. So I had to switch to the Docker setup.


(Florian Schmaus) #14

While I don’t think “using Docker” is an issue per se, I’d like to point out that I have been running Discourse since the early days on standard Debian stable, on a cheap vServer with 2 GB of RAM on which I’m unable to run Docker. Since those early days I’ve been updating my Discourse installation with a simple script (basically the one from [INSTALL-ubuntu.md][1]):

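update_discourse:

```bash
#!/usr/bin/env bash
# Abort on the first failing command, including a failed pg_dump inside the pipe.
set -eo pipefail

cd /var/www/discourse
# Stop the running Discourse processes, then shut down the bluepill daemon.
bluepill stop --no-privileged
bluepill quit --no-privileged

# Back up the database and the application directory, tagged with a UTC timestamp.
DATESTAMP=$(TZ=UTC date +%F-%T)
pg_dump --no-owner --clean discourse_prod | gzip -c > "$HOME/discourse-db-$DATESTAMP.sql.gz"
tar cfz "$HOME/discourse-dir-$DATESTAMP.tar.gz" -C /var/www discourse

# Fetch the newest code and switch to the latest release.
git fetch
git fetch --tags
git checkout latest-release

# Install gems, migrate the database and rebuild the assets.
bundle install --without test --deployment
RUBY_GC_MALLOC_LIMIT=90000000 RAILS_ENV=production bundle exec rake db:migrate
RUBY_GC_MALLOC_LIMIT=90000000 RAILS_ENV=production bundle exec rake assets:precompile

# Start Discourse again under bluepill with two web processes.
RUBY_GC_MALLOC_LIMIT=90000000 RAILS_ROOT=/var/www/discourse RAILS_ENV=production NUM_WEBS=2 /home/discourse/.rvm/bin/bootup_bluepill --no-privileged -c ~/.bluepill load /var/www/discourse/config/discourse.pill
```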

I hope that Discourse encouraging the use of Docker and no [longer supporting the method described in INSTALL-ubuntu.md][2] does not affect the ability to run Discourse on a plain Debian installation.

@sam @codinghorror Please keep in mind that there are people who can't (or won't) use Docker and who are still using the approach described in INSTALL-ubuntu.md. While it's perfectly fine not to support it any more, please try to avoid breaking it in a way that means Discourse can eventually only be used via Docker.

I really like the status quo: once my Discourse installation tells me that a new version is available, all I have to do is run `update_discourse` as the discourse user. :)


  [1]: https://github.com/discourse/discourse/blob/master/docs/INSTALL-ubuntu.md#updating-discourse
  [2]: https://github.com/discourse/discourse/blob/master/docs/INSTALL-ubuntu.md#warning-this-guide-is-deprecated

(Jens Maier) #15

As long as Discourse remains a Rails application, I can’t see this happening. Besides, Docker containers are not sealed boxes; you could always clone discourse_docker, look at the Dockerfiles and translate the commands into your non-docker environment. :slight_smile:


(Florian Schmaus) #16

\o/

I guess my main concern is that at some point in the future, Debian stable will not be sufficient to run Discourse.


(Jens Maier) #17

That, on the other hand, might actually happen. Debian is… a bit conservative when it comes to integrating new major versions into older releases of their distro, and according to the package search, anything newer than Ruby 1.9.3 is only available on a “testing” or “unstable” system.

And Ruby 1.9.3 is practically ancient. :hushed:


(Jeff Atwood) #18

Debian is horrible. We don’t recommend it. You can’t even run Docker on it for the same reason.


(Jens Maier) #19

For the record, I’m running Gentoo… no troubles there. Installing a Discourse dev environment is almost a one-liner, except that when you hit enter, your machine will be busy compiling god-only-knows-what for half an hour. :grin:


(Florian Schmaus) #20

As far as I can tell, the Discourse installation procedure involves installing Ruby (via RVM) for the discourse user, so you don’t rely on the version packaged by the distribution. I guess that’s what makes it possible to run an up-to-date Discourse on Debian stable.
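
Whether a given Ruby is new enough can be checked without consulting the distribution’s package list. A minimal sketch, assuming a `sort` that supports `-V` (version sort, as in GNU coreutils); the function name `version_at_least` and the `2.0.0` minimum are placeholders, not Discourse’s documented requirement:

```shell
# Succeeds when the installed version is greater than or equal to the required one.
version_at_least() {
  installed=$1
  required=$2
  # Version-sort both values; if the required one sorts first (or they are
  # equal), the installed version is new enough.
  lowest=$(printf '%s\n%s\n' "$required" "$installed" | sort -V | head -n1)
  [ "$lowest" = "$required" ]
}

if version_at_least 1.9.3 2.0.0; then
  echo "1.9.3 is new enough"
else
  echo "1.9.3 is too old"
fi
```

To test the interpreter that is actually on the discourse user’s PATH, something like `version_at_least "$(ruby -e 'print RUBY_VERSION')" 2.0.0` should work.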