How do I set up multiple web containers?

(Iolo) #1

How do multiple web_only containers work together? I’m setting up a cold standby/failover situation and I have the database set up (with replication) and we can easily spin up a new redis container but I can’t find a good guide on how to set up the web containers.

Can I just periodically rsync over a shared/web folder and then use ./launcher start web on the second VM to start it up when needed, or must I bootstrap it first and then launch?
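The periodic sync idea could look something like this (a minimal sketch; the standby host name and the exact uploads path are assumptions, adapt them to your setup):

```shell
# Hypothetical crontab entry on the primary: every 5 minutes, mirror the
# uploads directory to the standby VM.
# -a preserves permissions/timestamps, -z compresses, --delete mirrors removals.
*/5 * * * * rsync -az --delete /var/discourse/shared/standalone/uploads/ standby.example.com:/var/discourse/shared/standalone/uploads/
```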

(Jens Maier) #2

Don’t just spin up a new Redis; instead, run a Redis cluster alongside your replicated PostgreSQL. The data in Redis may not be critical, but losing it in a failover would impact Discourse’s operation (e.g. some posts might never get their images lightboxed because the scheduled job was lost).
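Keeping a warm Redis replica is a one-liner; a hedged sketch, assuming hypothetical host names (older Redis versions use `SLAVEOF` instead of `REPLICAOF`):

```shell
# Point the standby's Redis at the primary (host names/port are assumptions).
redis-cli -h standby.example.com replicaof primary.example.com 6379

# Verify the standby is connected and syncing:
redis-cli -h standby.example.com info replication
```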

Regarding the static files, do look into distributed filesystems. Don’t just rsync your uploaded files; let a DFS replicate /var/discourse/shared across the web worker VMs. :slight_smile:
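With GlusterFS, for example, a two-node replicated volume for the uploads directory might be set up like this (host names, volume name, and brick paths are all assumptions, and you’d normally only mount the uploads portion, per the caveat below about database files):

```shell
# Hypothetical two-node GlusterFS replica for Discourse uploads.
gluster peer probe web2.example.com
gluster volume create uploads replica 2 \
    web1.example.com:/data/glusterfs/uploads \
    web2.example.com:/data/glusterfs/uploads
gluster volume start uploads

# Mount the replicated volume where Discourse keeps its uploads:
mount -t glusterfs web1.example.com:/uploads /var/discourse/shared/standalone/uploads
```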

(Iolo) #3

For Redis’s scheduled job, is there a way to manually run/check it if the need arises, maybe via rake?
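Discourse runs its scheduled jobs through Sidekiq, so one way to check or re-run work after losing the Redis queue is from inside the app container; a hedged sketch (verify the task name against your Discourse version):

```shell
# Enter the running Discourse container:
cd /var/discourse
./launcher enter app

# Inside the container: rebake posts so missing work like image lightboxing
# is redone. (rake posts:rebake is a standard Discourse task; check your version.)
rake posts:rebake
```

Admins can also watch the job queues in the Sidekiq web UI at /sidekiq on the running site.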

If I replicate the whole shared folder (via GlusterFS or similar), does this eliminate the need for bootstrapping the second container before I first run it? It feels like bootstrapping a distributed filesystem from both ends could go wrong somehow.

(Jens Maier) #4

You’re right, you wouldn’t replicate the whole shared folder, only the portion where Discourse stores uploaded files. Redis and PostgreSQL have their own replication schemes.

(In fact, messing with PostgreSQL’s files in any way while the database is running is a great way to destroy your data.)

(Iolo) #5

This sounds like my plan in the event of a failover should be:

  • Ensure all 3 master systems (psql, redis, ROR) are offline and not replicating
  • Bring up the slave redis and switch postgres slave to be the master
  • Bootstrap the web container and start it
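The three steps above could be sketched as a runbook script; everything here (host names, data directory, promotion mechanism) is an assumption to be adapted:

```shell
# 1. Make sure the failed masters are down (avoid split brain).
ssh master.example.com 'cd /var/discourse && ./launcher stop app' || true

# 2. Promote the standbys: Redis stops replicating, PostgreSQL standby
#    is promoted to master (data directory path is an assumption).
redis-cli -h standby.example.com slaveof no one
ssh standby.example.com 'sudo -u postgres pg_ctl promote -D /var/lib/postgresql/9.3/main'

# 3. Bootstrap and start the web container on the standby.
ssh standby.example.com 'cd /var/discourse && ./launcher bootstrap app && ./launcher start app'
```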

The aim of the whole setup is 5 minutes max of data loss with 1 hour of turnaround time in the event of a failure. I feel like a 1 minute rsync of the uploads folder would be enough to meet these needs although gluster is also an option we’re looking at. We don’t need the high availability and load balancing as much as the data integrity.

(Jens Maier) #6

Ok, so if availability isn’t a concern, just separate your data from the default Discourse container setup.

Get two machines, install PostgreSQL and Redis on both, and configure each as a cluster. Get a third machine, install Docker and Discourse, let Discourse use the clustered PostgreSQL/Redis servers, and run glusterfs or your rsync cron job to replicate uploaded files; that should do it.
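Pointing the web-only container at the external services is done through environment settings in containers/app.yml; a minimal excerpt, with hypothetical host names and password:

```yaml
# Excerpt of containers/app.yml on the web machine (hosts are assumptions).
env:
  DISCOURSE_DB_HOST: pg.example.com
  DISCOURSE_DB_PASSWORD: "change-me"
  DISCOURSE_REDIS_HOST: redis.example.com
```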

If you want to save on hardware cost, you could move Discourse onto the same hardware that’s running the PostgreSQL replication master. Just keep PostgreSQL and Discourse well separated (i.e. don’t run PostgreSQL out of /var/discourse/shared/...).

(Kane York) #7

For really low-downtime deploys, you need a build server where you run launcher rebuild, then use docker save/load to ship the built image off to your replicated webs.
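A hedged sketch of that flow (web host name is an assumption; `local_discourse/app` is the image name the launcher builds by default, but check yours with `docker images`):

```shell
# On the build server: build the image once.
cd /var/discourse
./launcher rebuild app

# Ship the built image to each web host instead of rebuilding there:
docker save local_discourse/app | gzip | \
    ssh web1.example.com 'gunzip | docker load'

# On each web host, start the prebuilt container:
ssh web1.example.com 'cd /var/discourse && ./launcher start app'
```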

That’s how you run multiple webs :stuck_out_tongue: