Move from standalone container to separate web and data containers

Hi Jay @pfaffman, thanks for this post and others on the “two container” topic, including Sam’s writings on this as well.

Question:

We have been trying to set up two containers as you mention, one container for data and one for web-only, and have been running into a number of snags getting this running on macOS.

But before we worry about debugging this “two container config” on the Mac or on Ubuntu, we would like to make sure we are doing this for the right reason.

The reason we want to do the “two container dance” is so the site will not go down when we rebuild the web app, for example when installing a plugin. Also, when we tweak a homegrown plugin, we have noticed that sometimes the only way to ensure our changes take effect is to rebuild (that is a story for another day). I’ve also been struggling to get a “fast and friendly” web dev setup going to my satisfaction as well, but that is another topic for another day.

So, my question is: does the “two container” setup significantly reduce downtime when the web-only part of the app is rebuilt?

That is the right way to think about this, isn’t it?

When we install a plugin or tweak one, we need to rebuild only the “web-only” yml and not the data yml, correct?

We come from a LAMP forum background, where changes to plugins can be, and mostly are, made at runtime on the live site (with no downtime unless we fat-finger something). We also hail from some VueJS web apps, where we build on the desktop, upload, and move the new app into place, so there is virtually no downtime when upgrading the VueJS part of the site. With Discourse, however, we get downtime, which we do not want (even a few seconds).

Does the “two container” solution significantly reduce downtime when we either (1) rebuild the app (for plugins, code tweaks, etc.) or (2) restore from a full backup?

I feel like I’m going to get “beat up” (again) for asking this question, because we are looking for a way to run Discourse in production and make changes with near-zero downtime, and we have not yet found a way to do things that are so easy to do with a LAMP or VueJS app (for example).

Hence the struggle / interest in the “two container” method, which we have yet to get up and running.

Thanks!

Yes. The existing web container continues to run while the new container is being built. The downtime, then, is just the time that it takes to spin up the new web server, which is typically under a minute, though by no means a zero-downtime proposition. If you want zero downtime, you need a reverse proxy in front that will allow the new container to crank up and start working before you shut down the old one. (And if the database migrations for the new container break things for the old one, then you get downtime there unless you go through some other machinations.)

No difference on restore from backup.
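
For concreteness, the flow looks something like this (a sketch, assuming the standard two-container layout with containers/data.yml and containers/web_only.yml):

cd /var/discourse
./launcher bootstrap web_only    # new image builds while the old web container keeps serving
./launcher destroy web_only      # downtime starts here...
./launcher start web_only        # ...and ends once the new container is up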

4 Likes

Thank you Jay @pfaffman,

You are really a top-shelf valuable resource here, without a doubt!

What do you think about this maybe crazy idea (based on my still limited understanding):

Set up nginx as a reverse proxy on the front end, per this tutorial:

Then have two directories / instances with discourse_docker (standalone) set up, for example:

  1. /var/discourse1
  2. /var/discourse2

In each of these instances, set up discourse_docker (standalone) to listen on a different socket by modifying this template:

 - "templates/web.socketed.template.yml"

So, in a nutshell, we have simply rebuilt production (at some quiet time) to run in a different container listening on a different socket (nginx.https.sock2), so there is no socket conflict; and we can build it in standalone mode as well (with the goal of eliminating the need for two containers, data and web-only).

For example (for discussion / illustration), in web.socketed.template.yml in discourse1:

  - replace:
      filename: "/etc/nginx/conf.d/discourse.conf"
      from: /listen 80;/
      to: |
        listen unix:/shared/nginx.http.sock;
        set_real_ip_from unix:;
  - replace:
      filename: "/etc/nginx/conf.d/discourse.conf"
      from: /listen 443 ssl http2;/
      to: |
        listen unix:/shared/nginx.https.sock ssl http2;
        set_real_ip_from unix:;

and in discourse2:

  - replace:
      filename: "/etc/nginx/conf.d/discourse.conf"
      from: /listen 80;/
      to: |
        listen unix:/shared/nginx.http.sock2;
        set_real_ip_from unix:;
  - replace:
      filename: "/etc/nginx/conf.d/discourse.conf"
      from: /listen 443 ssl http2;/
      to: |
        listen unix:/shared/nginx.https.sock2 ssl http2;
        set_real_ip_from unix:;

However, instead of having the Discourse template do the magic, we would simply switch sockets manually in /etc/nginx/conf.d/discourse.conf and restart nginx; that is, we would remove the replace: directives from the web.socketed.template.yml template.

In this proposed (maybe crazy) configuration, we can have two standalone containers listening on two different sockets (not in conflict), and simply configure nginx to connect to the socket we wish to connect to and restart nginx.
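
For example (a sketch only; the sed pattern and paths are illustrative and assume the front-end config currently points at the first socket):

sudo sed -i 's/nginx\.https\.sock/nginx.https.sock2/' /etc/nginx/conf.d/discourse.conf   # one-way edit; adjust to switch back
sudo nginx -t && sudo service nginx restart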

This seems clear, easy, and perhaps useful (during a slow period with zero new posts on the live instance) for those who might not want (or need) the complexity of two containers (data and web-only) per single Discourse instance (app).

Of course, the most robust configuration (from a data perspective) for busy sites would be the “two container” solution, because we would want the shared data container plus the web-only instances (now listening on two different sockets, sock and sock2).

In the “two container” solution with the nginx front end, the “standard configuration” is to have both web-only containers listen on the same socket, so they cannot run at the same time; but if (for example only) we had them listen on different sockets, they could both run at the same time and we could just use the nginx config file (and an nginx restart) to switch between the two.

Is this the right understanding?

Am I starting to (slowly but hopefully surely) understand this?

Thanks!

Follow-up note only: I have the “two container” config working on one of my desktop Macs.

The only caveat in our install was the need to manually create these directories (and set ownership and perms), as for some reason they do not get created by the scripts:

~discourse/discourse/shared/data
~discourse/discourse/shared/web-only

and of course, at first I tried with a blank password for the database, and that did not work (the instructions do say to set a password, but I was just experimenting).

Next, I will set up the nginx front end and try the move to that configuration with a unix socket for the web-only app.

That’s a lot to take in. But there’s no reason to have two discourse directories. Just create multiple yml files in the containers directory. Name them whatever you want.
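
For example (a sketch, with a hypothetical file name):

cd /var/discourse/containers
cp web_only.yml web_only_new.yml    # hypothetical name; edit the copy (e.g. its socket path) as needed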

2 Likes

Thanks for confirming. That is exactly how we are set up (single directory) after experimenting today.

All has gone well with the two-container config (2CC), but we are struggling with the nginx reverse proxy setup on macOS.

Cannot get a working connection to the unix domain socket in the /shared directory, even though the socket is accessible outside the container. Tried with nginx and also with python and socat (for testing). Always a 61 (connection refused) error, hmmmm.
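
For example, even a basic smoke test from the host (curl shown for illustration; the socket path is an assumption) gets the same refusal:

curl --unix-socket /var/discourse/shared/web-only/nginx.http.sock http://localhost/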

Been stuck on “connection refused” all day!!

Tomorrow is another day.

I had a small question.
If we had just a one-container setup (just app.yml) and we ran ./launcher bootstrap app, would our website / front end stop or not?

If yes, then why doesn’t ‘bootstrap web-only’ stop our website?

If no, then what is the benefit of the two-container setup with regard to the time saved while rebuilding our container? In other words, if we can keep our website running even while bootstrapping our sole container, why would we need two separate containers?

1 Like

Not if you create a new .yml file; call it, for example, new_image.

Bootstrapping does not start or stop any service when configured correctly. Bootstrapped images are not running. That is why they are called “bootstrapped”.

However, you need to create a new yml file because you need to create a new image with a new image name.

So, with a new image name, the new bootstrapped image is not yet running. It is only “bootstrapped”.

You can build the new image with a new name (not app) and the building part is done. Let’s call this “new_image”.

Then, if we wanted to replace a running image, let’s call this “old_image”, you can do this:

./launcher stop old_image; ./launcher start new_image

In your case:

./launcher bootstrap new_image          # data container and the running web "app" container stay up
./launcher stop app; ./launcher start new_image

Because the data container is already built, you save time: (1) you are not rebuilding the data image, and (2) there is no downtime while rebuilding the web image, because you can build it (bootstrap the image) while the other containers are running.

This is much faster; but it is not as fast as using a reverse proxy in front of the containers.

In your original question, you have an image running called app. If you try to bootstrap that image called app again, then you are rebuilding the same image (name). This is not going to save you any time, as your instincts told you.

Is it clear yet?

If not, please ask. Everyone can learn; and learning is exciting. This is (actually) easy to understand but it does take a bit of time if you are new to the concepts.

2 Likes

To tell the truth, I didn’t understand anything. I tried. But failed.

My question was simple.
If we have the two-container setup and we ‘bootstrap’ (not rebuild) our ‘web-only’ container, then our current web container keeps working (because our website shows as working / OK).

And if we have the single-container setup, like ‘app.yml’, and we bootstrap this ‘app’ container, would our website still keep working?

And, if the explanation is simple, why is it so?

1 Like

Maybe you should hire someone to help you?

1 Like

No problem.

It was just a small doubt, one part of which could perhaps be answered with a simple ‘yes/no’.

Nothing big is there.

1 Like

My suggestion is to just enjoy trying it on your own.

It actually takes less time to try it on your own in your own testbed than to ask a question and wait for an answer in a forum.

You will get the answer to your question if you try it; and so you should try these things in a test scenario so you will not break your production app :slight_smile:

Discourse is open source and free to download and configure, thanks to the generosity of the co-founders. This means you can and should take advantage of this and freely create, destroy, and recreate test apps (as many times as you like).

If you are not willing to put some effort into basic sysadmin tasks, @codinghorror had a great suggestion about hiring local talent here.

1 Like

Yes (unless the bootstrap migrates the database in such a way that the running container can no longer use it). You do have downtime while the old container is shut down and the new one is booting up.

No. You can’t have two database processes accessing the same files at the same time.
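
So with a single-container setup the lowest-downtime flow is roughly this (a sketch, assuming the stock app.yml):

./launcher bootstrap app                          # the old container keeps serving while the new image builds
./launcher destroy app && ./launcher start app    # the downtime is just this swap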

3 Likes

Oh!!
Thanks for clearing this up.

1 Like

So, by having the database not in the container you’re building, you can build a new web container that acts on the database while the other web container continues to work.

1 Like

Quick doubt regarding this approach: how would one proceed with rebuilds on this setup?

Assuming we go with the two-container setup from scratch that @pfaffman added, we have two “apps”: “data” and “web_only”.

All the rebuild ops and such are directed at the “web_only” one, but do we do anything with the “data” one, or does the “web_only” bootstrap take care of it? (Just asking because I’m doing some tests and the data container never goes down at any moment.)

We use “three containers” and an nginx reverse proxy:

  1. data
  2. socket1
  3. socket2

Let’s say we are currently running socket1 (what you call ‘web-only’) and we want to rebuild and test something. We will rebuild socket2 while socket1 runs. Because each of these containers uses its own unix domain socket, they can both run at the same time; the shared sockets are in different file locations.

Then, let’s say socket2 is built and ready to rock n’ roll.

I do this:

ln -sf /var/discourse/shared/socket2/nginx.http.sock /var/run/nginx.http.sock

Now we are running live on socket2.

If, for example, there was a problem, I can just do this:

ln -sf /var/discourse/shared/socket1/nginx.http.sock /var/run/nginx.http.sock

and we are back to where we were before, running on socket1.

For fun, I added some code to the bash login profile, so when I log in to the server it always tells me which socket (container) is running:

Last login: Fri May 15 09:39:39 2020 from 159.192.33.138
srw-rw---- 1 root docker  0 May  5 07:38 /var/run/docker.sock
lrwxrwxrwx 1 root root   44 May 15 09:16 /var/run/nginx.http.sock -> /var/discourse/shared/socket/nginx.http.sock
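
The profile addition itself can be as simple as this (a sketch):

# in ~/.bash_profile: show the docker socket and which container socket is currently live
ls -l /var/run/docker.sock /var/run/nginx.http.sock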

Hope this helps.

IMHO, this (in general) is the “way to go” (and my “default standard” in production, not staging or dev), but that’s only me, YMMV. It does require a little more work to set up, but I think it’s well worth it (in production).

Caveat:

It’s a good idea to keep copies of your original template yml files somewhere safe, because the templates can get overwritten and that might be an issue; so we tend to “save all ymls”, “pull first”, then “check ymls”, and then bootstrap (out of an abundance of caution).

2 Likes

You don’t need to rebuild the data container unless there is an upgrade to postgres (as just happened) or redis (which happened in the last 6 months or so).

Mostly you just substitute web_only for app in the instructions you see. I have notes at Managing a Two-Container Installation - Documentation - Literate Computing Support
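
In other words, a routine upgrade looks something like this (a sketch of the usual commands on a two-container install):

cd /var/discourse
./launcher rebuild web_only    # the data container keeps running throughout
./launcher rebuild data        # only needed when postgres or redis itself is upgraded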

5 Likes

So in this approach you have two web instances and use the idle one to test stuff against the same data? That’s interesting; I will definitely try it out once I settle this whole “trying to save space” strategy :stuck_out_tongue:

Thank you very much. I read your article end to end; just one small question: here you meant data, right (i.e., rebuild data instead of web_only), as it says in the article? Or maybe there is a different trick I’m missing (i.e., when there is a big upgrade you need to put them together and then separate them again).

Yes, we test, add, rebuild on one socket-based container while running on the other.

Yes, both use the same data container. The data container does not “care” about which web application it “talks” to.

It’s very simple really, once you get the basic idea of how to run the containers on shared unix domain sockets rather than externally exposed TCP/IP ports.

We don’t find it takes much space, but then again, we don’t run (in production) with limited space because disk space is not expensive.

1 Like

Thanks for the detail. Just tried having two “app containers” pointing to a Data One on a test environment and it works like a charm.

I can’t move it to PRD, though, since I can’t rebuild the data container on PRD, and I don’t want to change anything without having that sorted. It just stops operations with a discourse@discourse FATAL: terminating connection due to administrator command error.