Hi, folks. This feels like a dumb question, but I’ll ask it anyway.
I was stuck in this familiar loop:
1. Upgrade failed.
2. Try ./launcher rebuild app, but not enough disk space. (Under 5 GB free.)
3. Run ./launcher cleanup, which seemed to liberate some disk space. (Over 5 GB free.)
4. GO TO 2.
Since I don’t quite yet know which operations are safe (as in won’t destroy my site data), I stumbled around nervously, hoping not to need to restore from a backup. I ended up doing this and it seemed to work for me.
1. ./launcher stop app
2. ./launcher cleanup, which cleaned up significantly more space than it did previously, which made me immediately nervous.
3. git pull
4. ./launcher rebuild app
This not only worked, but resulted in a running site with data intact and about 2 GB more free space than I had before.
Now I’m not sure whether what I did was brilliant, obvious, or risky and it just happened to work. I’d like to understand a little better why I got the results I got and whether that was a sensible way to upgrade.
Docker creates a bunch of files when it builds a container, including the container itself. When you build a new container, you still have the old containers (and also disk images). That’s what gets removed when you do a cleanup.
It’s a little bit safer to run the cleanup command while Discourse is running, because the cleanup command won’t delete a running container. If something goes wrong with the rebuild, you can still restart the old container if it exists.
So I’d remove step (1) above, and step (3) is unnecessary because ./launcher does a pull itself (back in the old days, it didn’t).
So yeah, if your disk space is so tight that you don’t have room for two images, then you need to destroy all of them before building a new one, which puts you in a bad way if for some reason you can’t build a new image. If that’s the case, you really need more disk space. The easiest thing to do is move backups to S3.
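For what it’s worth, here is a minimal sketch of the shorter sequence described above: cleanup while the site is still running, then rebuild, with no stop and no manual git pull. The function name upgrade_app is made up for illustration, and the /var/discourse path in the usage comment is an assumption about where the install lives; adjust both to taste.

```shell
# Hypothetical helper: clean up old containers/images, then rebuild.
# $1 = path to the launcher script (defaults to ./launcher)
upgrade_app() {
  launcher="${1:-./launcher}"
  # Run cleanup while the site is still up. As noted above, cleanup does not
  # delete a running container, so the old one stays around as a fallback.
  "$launcher" cleanup || return 1
  # No separate git pull here: launcher does its own pull.
  "$launcher" rebuild app
}

# Usage (assumed install path):
#   cd /var/discourse && upgrade_app
```

If the cleanup step fails or is declined, the function stops there and leaves the running site alone.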
I’m surprised by “disk space is so tight” in my situation. I run cleanup, I have 7.9 GB free, then I run ./launcher rebuild app and it refuses to finish. Is 7.9 GB free when I want to rebuild the app really so little?!
(UPDATE: Now I understand how this can happen. I leave this here for Innocent Bystanders to find in a web search. Please read this entire thread, folks!)
For this particular rebuild you needed to download a new base image.
It would be particularly helpful to include this information somehow in the upgrade process. It could be an advance warning (“We need to download a new base image, so you probably need 3 GB more space than normal to upgrade. If you run out of space, that’s probably why.”) or a more detailed error when the upgrade/rebuild process runs out of space. Otherwise, we end up in an endless loop of “There! There’s all the space you (seem to) need!” (Picard with arm outstretched.)
It would also be particularly helpful not to have to dig quite so much to find the option to use ./launcher rebuild app --skip-prereqs when you mean “Trust me, bro. I have enough disk space.” Yes, put warnings in bold, red, 72-point font.
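In the meantime, a rough do-it-yourself version of that advance warning might look like the sketch below. It is only an illustration: free_gb and check_space are made-up helper names, the 8 GB figure is just the 5 GB minimum mentioned earlier plus my guessed 3 GB for a new base image, and /var/lib/docker is an assumption about where Docker keeps its data.

```shell
# Whole GB free on the filesystem that holds the given path.
free_gb() {
  # -P gives one line per filesystem, -k reports 1024-byte blocks;
  # column 4 of the second line is the available space.
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1048576) }'
}

# Succeeds when free space meets the requirement; otherwise prints a hint.
# $1 = free GB, $2 = required GB
check_space() {
  if [ "$1" -ge "$2" ]; then
    return 0
  fi
  echo "Only $1 GB free, want $2 GB. A new base image may need the extra room." >&2
  return 1
}

# Usage (assumed numbers and path):
#   check_space "$(free_gb /var/lib/docker)" 8 && ./launcher rebuild app
```

This does not replace launcher’s own check; it just fails earlier and says why.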
This “new base image makes upgrading funky” situation happens infrequently enough for us to forget it, but it causes problems when it does. Where is the appropriate place for an article about this that centralizes what we know about how to work around the issues?