Ubuntu 20.04 kernel update with Docker causes crashes on EC2 and Lightsail

I ran into this issue last night when my Ubuntu 20.04 LTS server automatically upgraded itself. It installed a new kernel and I lost control of the system: it would crash a few minutes after booting. I tried again today with a fresh Discourse install, and as soon as I upgraded the system it started crashing again.

Just a note for folks: don’t update your Linux kernels just yet. This is a known bug; see this for more details.

The question is: is there a way to start the system without it starting Discourse/Docker? I’m running on AWS Lightsail. The only other option is to rebuild the whole system again, which is a PITA right now given the backup/restore issues I’m facing.

EDIT: This is what I found. It is hit or miss, depending on how fast the instance comes up and whether SSH gets in before the crash.

# Keep retrying until one SSH attempt gets through, then stop.
until ssh -o ConnectTimeout=5 <instance> "sudo systemctl disable docker.service containerd.service"; do
  sleep 1
done

I had this happen on two EC2 instances as well. They went down at 5AM EDT for a reboot and never came back up.

Per the link, this affects people running Canonical’s “cloud kernels” on Ubuntu machines. They removed a patch that affects OverlayFS.
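To check quickly whether a machine is on one of these cloud kernels, something like the sketch below may help. It only looks at the flavor suffix of the kernel release string; the function name is made up for illustration, and the suffix list (`-aws`, `-azure`, `-gcp`, `-oracle`) is my assumption about the common cloud flavors, not an authoritative list of affected kernels.

```shell
# Hypothetical helper: succeed if the kernel release string ends in a
# cloud flavor suffix. The suffix list is an assumption, not exhaustive.
is_cloud_kernel() {
  case "$1" in
    *-aws|*-azure|*-gcp|*-oracle) return 0 ;;
    *) return 1 ;;
  esac
}

# Typical use on the server itself:
#   if is_cloud_kernel "$(uname -r)"; then echo "cloud kernel"; fi
```

A match only means the machine runs a cloud-flavored kernel; whether that particular build carries the bug still has to be checked against the bug report.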

While Canonical rolls out a fix, people can try a different kernel version or use Debian or another distro as a workaround.

I was able to interrupt the crash cycle with a quick SSH about 15 seconds after boot to disable the docker/containerd services. I then downgraded the kernel to 5.4 and it seems to be working.

Yes, as I just posted in your other thread on restore troubles, that was essentially what I did as well when this bug crashed my server. Well, I booted the old kernel; didn’t have to disable docker or containers. And the current kernel is safe again. Here is a link to what I said in your other thread. In a bit I’ll try to write up my permanent solution to keep this from happening again.

Nasty kernel bug, that was!

You can simply revert to the previous kernel and the machine is restored. Or update to the current, fixed kernel, which came out on Thursday.
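If it is not obvious which kernel counts as "previous", here is a rough sketch for picking it out. It assumes GNU `sort -V` is available (it is on Ubuntu); `previous_kernel` is a name I made up, and the version strings in the example are illustrative, not the actual affected builds.

```shell
# Hypothetical helper: read kernel release strings on stdin and print the
# newest one that sorts strictly below the release given as $1.
# Prints nothing if there is no older kernel in the list.
previous_kernel() {
  current=$1
  { cat; printf '%s\n' "$current"; } |
    sort -V -u |
    awk -v cur="$current" '$0 == cur { if (prev != "") print prev; exit }
                           { prev = $0 }'
}

# Typical use on the server: list installed kernels from /boot and compare
# against the running one.
#   ls /boot/vmlinuz-* | sed 's|^/boot/vmlinuz-||' | previous_kernel "$(uname -r)"
#
# Example with made-up versions:
#   printf '%s\n' 5.8.0-1038-aws 5.4.0-1045-aws 5.8.0-1035-aws |
#     previous_kernel 5.8.0-1038-aws
#   prints 5.8.0-1035-aws
```

This only tells you which installed kernel to boot; actually selecting it still happens in the GRUB menu or GRUB configuration.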

I have written up a tutorial on how to avoid kernel oops! issues like this that crash your server or keep it from coming back up.

I put the tutorial on my Discourse site, since that seemed convenient to me. My site has nothing to do with tech, though. So I unlisted the topic but published it to HTML.

Enjoy.

https://discourse.bluebottlefly.com/pub/hardening-your-server

@RBoy, maybe you in particular will find this useful.

/dr