I updated my sites from 3.1.0.beta1 to 3.1.0.beta2, and after bootstrapping the new version, but before destroying the old app containers and starting new ones, at least one of those sites started giving the generic error page to users.
I didn’t notice it on my test site or the other sites I run, but it’s possible that it happened and I didn’t see it.
In any case, at least once for me, the “zero downtime” update process did not succeed.
(I use app for the web app even for multi-container installs. I know it’s not normal practice; I hate typing web_only.)
Sometime after I started the bootstrap and before I destroyed the app container, the old version running against the new database showed only an error screen. I don’t remember the contents, and I didn’t create a longer outage by stopping to take a screenshot before doing the destroy/start, but it was only text on white and was not the system maintenance page. I have seen this only a few times before: when the bootstrap runs db:migrate as part of the asynchronous “zero-downtime” rebuild, the old software still running fails due to a schema inconsistency.
What I saw was whatever happens in the case of database inconsistency. That’s way better than blissfully soldiering on and breaking the database! When I posted, it was to warn that this was one of those rare cases where applying a point update (here, from 3.1.0.beta1 to 3.1.0.beta2) created a schema incompatibility between the 3.1.0.beta1 code and the database after the 3.1.0.beta2 db:migrate had run, as happens rarely but occasionally with the normal low-downtime updates in a multi-container deployment.
My experience is different from the Ruby error that has been reported with the GUI updater; it’s a completely unrelated problem. I recognize that my post was moved out of the announcement into a general “problems with” thread, but I want to be clear: I posted it in the announcement so that other self-hosters like me, on seeing the announcement, would know that this particular update could have this impact.
My message was not complaining about a bug, or even a problem. It was intended only as notice of a normal but infrequent case associated with this particular release and not called out in the release notes.
The complaints about the docker manager not recognizing that it can’t update from within the image are completely unrelated to my attempt to provide a helpful notification to other self-hosting admins.
It would make a lot of sense to separate these unrelated issues into independent threads for independent problems. EDIT by @supermathie: Done
I think that answers the question. The launcher script has no support for SKIP_POST_DEPLOYMENT_MIGRATIONS.
Again, I am not reporting a bug. I’m just trying to warn others who have the standard multi-container install, using the normal documented launcher process, that this update is different from their typical experience.
Really truly, honestly, I mean it, this is not a bug report!
If I want blue/green deployment with launcher I should provide a PR for launcher to implement it.
I did not come up with the “problem” in the topic title; that was done when my comment was moved out of the announcement thread. I have now modified the title to make it clear, I hope, that I’m not complaining about a problem.
I’ve been having lots of trouble getting worked up about it when I am providing four, some months four and a half, nines of availability on a service that I run for free in my spare time. It’s a testament to the quality of Discourse development that I can do that on a tests-passed policy, even counting things like the extra minute or so of downtime I saw this time, and the occasional host reboot for security updates.
The ansible script that dashboard.literatecomputing.com uses runs a rake task after the new container is launched to do the post migrations. It counts on SKIP_POST_DEPLOYMENT_MIGRATIONS being turned on in web_only.yml. I do this only on sites that I know will be managed by my scripts, since if you don’t understand how it works, it’s something of a time bomb.
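For anyone curious what that pattern looks like in practice, here is a minimal sketch. SKIP_POST_DEPLOYMENT_MIGRATIONS is a real Discourse setting; the container name (web_only) matches the standard multi-container docs, but the exact file paths and sequencing here are just an illustration of my setup, not an official procedure:

```shell
# In containers/web_only.yml, defer post-deployment migrations so the
# bootstrap only runs the "safe" migrations the old code can tolerate:
#
#   env:
#     SKIP_POST_DEPLOYMENT_MIGRATIONS: 1

# Build the new image while the old container keeps serving:
./launcher bootstrap web_only

# Swap containers (the brief downtime window):
./launcher destroy web_only
./launcher start web_only

# Once the new container is live, run the deferred post-deployment
# migrations from inside it:
./launcher enter web_only
SKIP_POST_DEPLOYMENT_MIGRATIONS=0 rake db:migrate
```

The point of the split is that the deferred migrations are the ones most likely to break the old code still running against the new schema; running them only after the swap is what makes it a “time bomb” if you forget the final step.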
Note that for many upgrades, bootstrapping the new container won’t break things for the running container, but for some it does. It’s not that uncommon for an upgrade to migrate the schema in a way the old container can’t use (unless SKIP_POST_DEPLOYMENT_MIGRATIONS is set).