How do you perform major maintenance on Discourse with minimal downtime?

emonunix · 5 September 2025 at 20:27

I would like to open a discussion about best practices for performing core maintenance tasks on a Discourse instance with minimal or no downtime.

Tasks such as changing critical resource settings (e.g. UNICORN_WORKERS, DISCOURSE_SIDEKIQ_WORKERS, DISCOURSE_DB_POOL) or applying major updates typically require a launcher rebuild app, which can take considerable time, sometimes 30 minutes or more.

My question is:
What are the recommended strategies for system administrators to carry out these essential updates and configuration changes with the least user-facing downtime?

Are there advanced techniques, such as blue/green deployments or other zero-downtime deployment strategies, that are supported or recommended for Discourse? Or is the standard rebuild process the only supported method, so that the focus should be on optimizing the rebuild time itself?

I am interested in the experiences of anyone who manages large or high-traffic instances and in what their maintenance workflow looks like.

Thanks for any insights!

pfaffman · 5 September 2025 at 20:39

If you have a two-container install, the new container builds while the old one runs. Downtime is just the amount of time it takes to launch the new container. The only issue is that you need enough RAM to build a container while the other one runs.
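As a rough sketch of that sequence (not an official procedure), the upgrade can be expressed as three launcher calls: bootstrap the new image while the old container keeps serving, then destroy the old container and start the new one. The container name `web_only` is an assumption based on the sample configuration file names, and the helper names `two_container_upgrade_commands` and `run_commands` are hypothetical.

```python
import subprocess
from typing import List


def two_container_upgrade_commands(web_container: str = "web_only") -> List[List[str]]:
    """Return the launcher commands, in order, for swapping in a freshly built web container.

    The container name is an assumption; use whatever your containers/*.yml file is called.
    """
    return [
        # Build the new image; the old container keeps serving traffic meanwhile.
        ["./launcher", "bootstrap", web_container],
        # Stop and remove the old container; downtime starts here.
        ["./launcher", "destroy", web_container],
        # Start a container from the new image; downtime ends once it has booted.
        ["./launcher", "start", web_container],
    ]


def run_commands(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it and stop on the first failure."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run by default: only prints the commands.
    run_commands(two_container_upgrade_commands())
```

Running it from the directory that contains the launcher script with `dry_run=False` would execute the steps; because `check=True` stops at the first failure, a failed bootstrap leaves the old container untouched.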

See "Move from standalone container to separate web and data containers", but I usually move to a new VM.

If you want zero downtime then you need a load balancer that keeps the old container running until the new one has fully started. Then you shut down the old container and do the post-update migrations.
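The "until the new one has fully started" part can be sketched as a polling loop that only reports success once the new container answers a health check. This is a minimal illustration, not a complete deployment script: the load balancer switch itself is not shown, and the helper names `http_ok` and `wait_until_healthy` and the URL in the usage note are hypothetical.

```python
import http.client
import time
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET request to url answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error status, or a malformed reply.
        return False


def wait_until_healthy(
    check: Callable[[], bool],
    attempts: int = 60,
    delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call check() up to `attempts` times, sleeping `delay` seconds after each failure.

    Returns True as soon as check() succeeds, False if it never does.
    """
    for _ in range(attempts):
        if check():
            return True
        sleep(delay)
    return False
```

A caller might use `wait_until_healthy(lambda: http_ok("http://127.0.0.1:8080/srv/status"))`, where the port and path are assumptions about the local setup, and only then point the load balancer at the new container and shut down the old one.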

Ethsim2 · 5 September 2025 at 21:05

Can you have two data containers on failover?

Do you usually have a separate VM for data?

merefield · 5 September 2025 at 21:53

Discourse is so stable this is pretty unnecessary for most installs (but I guess you might consider it for very high availability requirements or if you are hosting others?!)

I don’t think I’ve had a single outage in 7 years due to a production “glitch” …

The riskiest moment in a Discourse’s life is always the rebuild.

The two-container setup gives you the ability to bootstrap a new build before committing to it, though that won’t catch some runtime errors of course.

The issue is that if your migrations have run, you might need to commit to the new build and so you would usually try to track down and fix the source of those errors rather than roll back.

Generally people do not try to roll back …

pfaffman · 5 September 2025 at 22:47

I move to a new VM for a major reconfiguration.

It is possible to run a PostgreSQL mirror, but it is a lot of work.

itsbhanusharma · 5 September 2025 at 23:42

A read replica would be better, wouldn't it?

pfaffman · 6 September 2025 at 00:41

Yes! Replica! That's the word they use. And then, if the other one dies, you can switch over to the replica.
