Yesterday I was doing a routine upgrade and the ./launcher rebuild command failed to complete. I thought that if I started the old container with ./launcher start, the site would come up as before, but it never went online; Redis kept failing to start.
What’s the right way to roll back an upgrade? Or at least to run the old container before it is removed by the ./launcher rebuild command?
I don’t think you understand what the problem is: the rebuild failed. The app wasn’t running. This is not related to plugins; it failed to create the new Docker image.
This is the error I’m referring to: Manual upgrade fails to complete. It is not related to plugins or to any configuration I have with Discourse.
Well, it did in this case. The server hasn’t changed at all since our last update. And it failed in the process of creating a Docker image with Docker, which should isolate the environment so that these things don’t happen. If you look, it is a permissions issue; I fixed it with an extra step in app.yml, which shouldn’t be required.
What I’m asking is how to run the old Docker container from before the rebuild that failed. It was already created with a previous version of Discourse, so it should be possible to run it.
I had to track down the issue and fix it before the site could come back up (3 hours of downtime). If things like this happen, there should be a way to run the previous Docker image you had while you debug the upgrade, even if it’s not a simple command like ./launcher start app.
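One possible way to keep a fallback image around, sketched here as an assumption and not as a documented launcher feature: give the current image a second tag before rebuilding, so it can be restored later. The image name local_discourse/app is assumed from the default container name "app", and the tag pre-upgrade and both function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers. IMAGE and BACKUP_TAG are assumptions about the setup;
# check the real image name with `docker images` first.
IMAGE="${IMAGE:-local_discourse/app}"
BACKUP_TAG="${BACKUP_TAG:-pre-upgrade}"

# Add a second tag to the current image before running a rebuild,
# so the old image stays referenced whatever the rebuild does.
snapshot_image() {
    docker tag "$IMAGE" "$IMAGE:$BACKUP_TAG"
}

# Point the default tag back at the saved image after a bad rebuild.
restore_image() {
    docker tag "$IMAGE:$BACKUP_TAG" "$IMAGE"
}
```

Run snapshot_image before ./launcher rebuild app. If the rebuild goes wrong, restore_image followed by ./launcher destroy app and ./launcher start app should, under these assumptions, create a fresh container from the old image. It does nothing for a database that has already been migrated.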
Should do it, unless the database has migrated, in which case you can have problems.
If you want to revert to an old version and then restore your backup, you can delete the postgres_data directory, change tests-passed to the version you want (either the beta or an actual commit) and then restore the data.
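For the "change tests-passed to the version you want" step, here is a minimal sketch. It assumes the stock app.yml layout, where the params section holds a (possibly commented-out) version: line. pin_version is a hypothetical helper, the path containers/app.yml is assumed, and v1.5.0 is only a placeholder value.

```shell
#!/bin/sh
# Hypothetical helper: pin the Discourse revision in a container config.
# Assumes the file contains a line such as "  #version: tests-passed".
pin_version() {
    file="$1"   # e.g. containers/app.yml (assumed path)
    ver="$2"    # a tag, branch, or commit hash
    # Uncomment the version line if needed and replace its value;
    # the untouched original is kept next to it as "$file.bak".
    sed -i.bak -E "s|^([[:space:]]*)#?[[:space:]]*version:.*|\1version: $ver|" "$file"
}

# Example (placeholder version, not a recommendation):
# pin_version containers/app.yml v1.5.0
```

The change only takes effect on the next ./launcher rebuild app. Keeping the .bak copy makes it easy to go back to tests-passed afterwards.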
I’m confused about your permissions issue. I haven’t done an upgrade, but I did a couple installs today and they worked fine.
I shouldn’t be messing with branches or anything; this is the Docker-based installation. I know it’s weird that I had permissions issues, because I don’t control these things. The Discourse rebuild script and Docker are the ones that do all the work.
The database didn’t migrate, and even if it had, that wouldn’t necessarily be an issue. I’ve worked with Rails applications in the past and that is not a rule. At least they start and give you an error.
I tried ./launcher start app, but that didn’t work: Redis kept failing to start, and I had to complete the rebuild process for it to start correctly. I’m not sure whether the rebuild failure somehow affected its functionality. I know that at the end of the rebuild Discourse runs the container, passing the env variables to Docker inline; maybe ./launcher start app doesn’t do the same thing, or it was affected by the failed rebuild.
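For anyone hitting the same thing, a rough sketch of how the stopped container could be inspected from the host with plain Docker commands, to see why a service such as Redis refuses to start. The container name app is assumed from the default setup, and both function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers. CONTAINER is an assumption; list the real
# names with `docker ps -a`.
CONTAINER="${CONTAINER:-app}"

# Print the container's current status and its last exit code.
container_state() {
    docker inspect --format '{{.State.Status}} (exit code {{.State.ExitCode}})' "$CONTAINER"
}

# Print the last N lines of the container's output (default 100),
# including anything written to stderr.
recent_logs() {
    docker logs --tail "${1:-100}" "$CONTAINER" 2>&1
}
```

Calling container_state and then recent_logs 200 right after a failed ./launcher start app should show whether the container exited straight away and what it printed on the way out.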
Also, we are using AWS RDS for our Postgres instance, so if a rebuild fails we can roll back to exactly the previous state we need (we back up both in Discourse and in RDS before doing upgrades). The database lives outside the Discourse instance, as do uploads/assets, so we can delete the instance and rebuild it whenever we need to.