Can Discourse ship frequent Docker images that do not need to be bootstrapped?

Stephen · October 19, 2018, 8:11am

Rebuilds are far from a common or regular practice. Most upgrades can be performed from /admin/upgrade.

And for sites where any server error is a problem there are guides on how to present a holding page during the upgrade process.

ChrisBeach · October 19, 2018, 8:15am

I have to perform rebuilds when I add or remove plugins, and when they are mandated by the occasional upgrade (several times a year). But yes, it’s not a showstopper for a local community forum like mine, and /admin/upgrade works very well for regular updates.

I present a holding page during my rebuild upgrades for the benefit of users, but how would the Googlebot perceive this? To the bot, my homepage suddenly has no intra-site links, and all the pages the bot has previously indexed have their content replaced with the holding page.

Stephen · October 19, 2018, 8:20am

If you’re catching all requests with a 302 to show the holding page there should be zero impact. Google confirmed a couple of years back that redirects don’t impact SEO/pagerank:

Gary is the “House Elf and Chief of Sunshine and Happiness at Google” (Webmaster Trends Analyst)

If you’ve just rebranded the 502 error page then yes, it will have an impact.

pfaffman · October 19, 2018, 10:09am

What you don’t understand is that most of the people deploying discourse don’t know what docker is, much less care about best practices.

If you want a docker image, just use pups to generate it yourself. Is that what you guys do, @sam? Do you create a separate container image for each enterprise client with their plugins?

If you want to avoid downtime when you need to rebuild, make a 2 container configuration. You can search here or look in discourse-setup for a command line option that will build one for you. Then downtime is limited to the time it takes for the new container to boot up.
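A minimal sketch of that rebuild sequence, assuming the web container definition is named web_only and the commands are run from the discourse_docker checkout. The function name and the LAUNCHER override are hypothetical, added only so the steps can be exercised without Docker:

```shell
# Sketch: rebuild the web container in a two-container setup.
# The long bootstrap step runs while the old container is still serving;
# only the destroy/start pair at the end causes downtime.
rebuild_web_only() {
  launcher="${LAUNCHER:-./launcher}"   # hypothetical override hook for testing
  name="${1:-web_only}"                # assumed container definition name

  # Build the new image first; stop here if it fails so the old
  # container keeps running.
  "$launcher" bootstrap "$name" || return 1

  # Swap containers: remove the old one, start one from the new image.
  "$launcher" destroy "$name"
  "$launcher" start "$name"
}
```

Calling rebuild_web_only with no arguments would run bootstrap, destroy and start against web_only in that order. The data container (Postgres and Redis) is left untouched throughout.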

If you want zero downtime, configure haproxy with multiple web containers.

gkoerk · October 19, 2018, 2:46pm

There’s a very large number of people running their own servers who are looking for this exactly. I know personally dozens of hobbyists who are looking for a forum which is truly self-hosted, not on a droplet or other VPS. Many self-hosting enthusiasts run multiple apps (think Nextcloud, Plex, Wikis, CMS/Blogs, etc.), whether they actually host them or run them on a VPS. Here’s my situation: I’m running docker swarm for dozens of apps. It seems to me at least that the way the tool works now it can’t be incorporated into a docker swarm with Traefik or HAProxy reverse-proxying requests.

Maybe (hopefully!) I’m mistaken and you can explain or point me to instructions on how to take your final, single completed container, reference it in a new docker-compose and run it in a swarm?

gkoerk · October 19, 2018, 2:56pm

I understand that. But the developers of Discourse do, and they depend on it for their deployments. I apologize, I don’t know what pups is. Looks like it’s related to Ansible, which I don’t know.

Debt in the deployment management code & process. It is very complicated right now with lots of moving parts and things that are difficult to understand and support. One of docker’s primary draws is the encapsulation of prerequisite code and builds which complete prerequisite work with less risk of a user messing something up, especially if the whole thing is ultimately wrapped in a script to create/upgrade the installation. That gives those who don’t (and don’t need to) understand docker well a good solution, and gives engineers or “hobbyists” a solution as they could skip the wrapper and compose things the way they want.

One of the things I think overlooked here is that a major reason for self-hosting is control over data and backups, etc. Docker makes this especially convenient since you can bind volumes and back those up, and even run a container in the stack whose role is backing up the DB to a flat file in the backup location. When you can’t self-host it along with other docker apps, it effectively means you cannot self-host on your server, you need to purchase a VPS just for Discourse, rather than reusing the same hardware and ecosystem that works for the vast majority of apps which use docker in their deployment.

gkoerk · October 19, 2018, 3:28pm

By way of disclaimer, I am not employed or compensated by any hardware vendor, SaaS provider, or forum software developer.

Think about how many potentially unnecessary Digital Ocean, Linode, and VULTR subscriptions Discourse has been responsible for launching. Then consider that there are companies making a revenue stream hosting Discourse for others in part because it is too complex for them:

The Discourse forum software is fantastic, but quite hard to install and host it yourself. We think it’s too great a product to be limited to a technical audience only.

Then again, the way you’ve modeled your revenue stream, it makes zero sense to make things easier to install and run in simple-to-use containers which expose only the necessary ports and bind mounts, using environment variables for everything else so that deployment is as simple as docker stack deploy or docker-compose up. Like I said, maybe (and hopefully) I’m the one who is mistaken, and there’s a way to take the final docker container and deploy it in a swarm or other compose environment with other apps and a reverse proxy.

gkoerk · October 19, 2018, 3:37pm

This is exactly the point many folks have been making in this lengthy thread: Is there a solution for those who do know what nano is and can exit VI / VIM just fine? I trust you know your customer base better than I, but I have to imagine that such basic knowledge of Linux is the case for the overwhelming majority of those wanting to self-host open-source software on Linux.

pfaffman · October 19, 2018, 4:19pm

Yes. Create a web_only.yml (there is a sample in discourse_docker) and use that to build your docker image with your plugins. Then add it to your swarm. Everything can be configured via environment variables; you can see them in the last line when you do a ./launcher start web_only. It’s not that hard, but you’re not going to get free support here to help you figure it out (and it’s not just you but a whole bunch of people with a zillion different definitions of Best Practices who would need much, much more help than you would).
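A rough sketch of how the image built that way might be pushed somewhere a swarm can pull it from. It assumes launcher leaves the bootstrapped image tagged as local_discourse/web_only; the function name, the LAUNCHER and DOCKER overrides, and the registry address in the comment are all hypothetical:

```shell
# Sketch: bootstrap an image from containers/web_only.yml and push it
# to a registry so other nodes can pull it.
publish_web_image() {
  config="${1:-web_only}"   # assumed container definition name
  target="$2"               # e.g. registry.example.com/forum/discourse:1 (hypothetical)
  if [ -z "$target" ]; then
    echo "usage: publish_web_image CONFIG TARGET_IMAGE" >&2
    return 2
  fi

  # Build the image with plugins baked in.
  "${LAUNCHER:-./launcher}" bootstrap "$config" || return 1

  # Assumption: the bootstrapped image is named local_discourse/<config>.
  "${DOCKER:-docker}" tag "local_discourse/$config" "$target" || return 1
  "${DOCKER:-docker}" push "$target"
}
```

The pushed image still expects the environment variables that launcher would normally pass on docker run, so those would have to be copied into the stack or compose file by hand.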

I can probably help you figure out what you need to know in an hour or two of my time.

fantasticfears · January 12, 2019, 7:00pm

Recently I found the commands to fast forward a git clone depth 1:

git checkout --detach #avoid tangle with git tree state
git fetch --depth 1 -f origin [branch|commit]
git checkout -f [branch|commit]
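Those steps could be wrapped in a small function along these lines. This is only a sketch: the function name is made up, it takes a branch name rather than a commit, and it checks out FETCH_HEAD instead of the branch name, which is a substitution made here so that the checkout always lands on whatever the fetch just downloaded:

```shell
# Sketch: move a depth-1 clone to the newest commit of a remote branch
# without downloading any history.
shallow_update() {
  ref="$1"   # branch name on the remote called "origin"
  if [ -z "$ref" ]; then
    echo "usage: shallow_update BRANCH" >&2
    return 2
  fi
  git checkout -q --detach &&                 # step off the current branch first
  git fetch -q --depth 1 -f origin "$ref" &&  # fetch only the tip commit
  git checkout -q -f FETCH_HEAD               # switch to the commit just fetched
}
```

Run inside the shallow clone, for example as shallow_update tests-passed. The working tree ends up on a detached HEAD, and -f discards any local modifications.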

Thanks to archive.org team.

pfaffman · May 9, 2019, 7:51pm

So would running ./launcher bootstrap mybase that had the bundle exec rake db:migrate; bundle exec rake assets:precompile added to the init script do something like that? Just run it against a test database, or maybe strip those rake tasks out of web.template.yml?

EDIT: Found this comment:

  # this allows us to bootstrap on one machine and then run on another

which might answer my question.

codinghorror · August 17, 2019, 8:44am

A post was split to a new topic: Thank you for creating a polished, performance tweaked product that can be deployed simply

pfaffman · January 2, 2020, 6:50pm

Does that mean that you still bootstrap a new image for each deploy, created from the base image plus the changes made by precompiling assets, or do you just use the single image and have it precompile when it’s cranked up?

sam · January 3, 2020, 12:18am

We technically pre-compile twice. Once for base image + our changes. Second time when we deploy specific sites with specific plugins.

Our setup though is not something you would do unless you were hosting Discourse as a business.

AquaL1te · December 2, 2020, 2:39pm

I would really like to use Discourse. But the way these devs are using container technology really stops me from using it. I have never seen a project that is able to make a container installation non-portable like this. There are much better ways to fix this. To me it seems like this project organically worked towards containers but doesn’t really know how they should be used. And now the project is stuck with this overly complex and non-portable solution. And probably time restrictions won’t allow creating a proper portable container.

Please, provide a Dockerfile that builds towards a consistent state. Document the environment variables people should use to manipulate the container behavior. And simply use an entrypoint script to make sure everything is started at runtime. Combining containers can be done with Docker Compose or Kubernetes. But not like this, this is really mehhh.

I tried to run this on e.g. Fedora CoreOS, CentOS 8 or Fedora 32, but it’s not possible. The container standards certainly would allow it, but this is a finely tailored way to basically only support Ubuntu LTS. Meanwhile some people have already moved away from Docker (especially when running a service as root). Containers are standardized, so people may decide to use Docker or Podman. But this Frankenstein would make that very hard, which goes against the concept of containers and modern container security.

pfaffman · December 2, 2020, 11:12pm

Try looking around in 2014. When the project started, docker-compose wasn’t really a thing.

So true! Maybe you can fix it. All you need to do is change everything about how to build and launch a container and see that it doesn’t break thousands of functioning sites.

OR, make your own containers that would let you launch with whatever tool the people who know how containers should be used will like. Bitnami seems to have done it. You can search here and find lots of people who have had trouble and nowhere to go for help.

AquaL1te · December 3, 2020, 9:11am

I know it’s not an easy thing to do. But the purpose of containers is to have a consistent and portable state. So if containers are used as they should be, and it works for you, then there is a fairly high chance it will work for everyone that uses that container.

If the bootstrap alone were moved inside the container, rather than on the host, you would already come quite far in making it portable. I can have a look after I’ve finished other projects. I’m no container expert either, but I’ve built a few. The downside, however, is that there is no installation documentation available, right? It’s just: here you go, just run this script. I can try to replicate what the script does. But that doesn’t leave much room for improvement suggestions.

So if the community, especially the people who are closely involved and have inside information on how the installation should work, is willing to advise/help out, then I’m willing to start this initiative. Otherwise the quality won’t be what you want to see.

The goals would more or less be:

A Dockerfile which has an atomic build of the setup (no local bootstrap outside the container)
No need to run the container as root, best is to use fakeroot and add capabilities (these are command line arguments, people can still choose to start a container as root…)
Create an entrypoint script that can be influenced by environment variables, which have to be clearly documented
podman-generate-systemd or something similar can be used to create a systemd unit to (re)start a container, or start a container at boot (this is a Podman feature; maybe Docker has something similar, but the point is more about making this integrated)
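To make the entrypoint goal concrete, here is a minimal sketch of a script driven by environment variables. It is not taken from any existing Discourse image: the function names are made up, and the variable names and default values are illustrative and would have to match whatever the image actually documents:

```shell
# Hypothetical entrypoint sketch. Variable names and defaults are
# illustrative only.

# Fail unless the named environment variable is set and non-empty.
require_env() {
  eval "value=\${$1:-}"
  if [ -z "$value" ]; then
    echo "missing required environment variable: $1" >&2
    return 1
  fi
}

entrypoint() {
  # Required setting: refuse to start without it.
  require_env DISCOURSE_HOSTNAME || return 1

  # Optional settings: fall back to illustrative defaults when unset.
  : "${DISCOURSE_DB_HOST:=db}"
  : "${DISCOURSE_REDIS_HOST:=redis}"
  export DISCOURSE_HOSTNAME DISCOURSE_DB_HOST DISCOURSE_REDIS_HOST

  # Replace the shell with the container's main process so that it
  # receives signals directly.
  exec "$@"
}

# A real entrypoint script would end with:  entrypoint "$@"
```

Because everything is read from the environment and the final step is a plain exec, the same image could be started by Docker, Podman or a Kubernetes pod spec without a host-side bootstrap step.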

That would be for the basic install. For the scalable solution a docker-compose and Kubernetes solution is needed. Frankly, I don’t think it is the Discourse community’s responsibility to find a one-size-fits-all solution there, because these things can be very finely tailored, especially on Kubernetes. So I guess a basic compose solution would be sufficient to get people up to speed.

This will provide a portable and more secure solution, increasing adoption and quality overall. In the meantime I’ll see if Discourse is really something I need for my community. If I do, then I’ll use an Ubuntu LTS system for now. Once I have more time, I’ll invest time in such a setup.

akvadrako · December 3, 2020, 9:43am

Hello @AquaL1te,

I have done some of what you are suggesting. It can use podman, k8s or docker. Rootless is supported (not fakeroot). It uses a 4 container setup with nginx, sidekiq, redis and the ruby daemon.

In general, the developer install docs can be followed.

One thing to keep in mind is there are some tricky aspects around how discourse does assets.

AquaL1te · December 3, 2020, 11:48am

Ah, yes. I was confused by the terminology; I’ve been using Singularity a lot lately, which uses a --fakeroot flag.

I’ll have a look, it sounds very nice! Because I prefer to use Fedora CoreOS for this, to max out the security potential of a containerized setup. This is not easily possible at the moment in the current setup.

It still bugs me that the main solution is non-portable and there are no signs of a move towards a modern solution. I’ll have a more detailed look at your setup and maybe in the future I’ll contact you to co-maintain it. If needed of course. Thanks for your work and suggestions!

cron · April 27, 2021, 7:01am

I just finished reading this entire beast-of-a-thread and especially resonated with the needs of Gkoerk who commented above Oct. 19, 2018 seeking help for self-hosting discourse in a swarm – apparently for inclusion in his mother-of-all docker-swarm-cookbook. Wow. What a collection! Gkoerk reportedly passed away at the young age of 40. Damn. Seemed like a genuinely great guy and contributor.
