Runsv hanging on Docker container shutdown


(Andy Balholm) #21

The problem isn’t exclusive to Discourse, since @tgxworld reproduced it with a Docker container containing almost nothing but runit. (Although Discourse might be able to make a workaround.)


(Ted Strauss) #22

I am dealing with the same issue on Ubuntu 14.
kill -9 has no effect on the process.


(Rafael dos Santos Silva) #23

Here comes a new challenger:

http://engineeringblog.yelp.com/2016/01/dumb-init-an-init-for-docker.html


(Luke) #24

For anyone else who needs a working solution, I can confirm that docker kill <container name> successfully stops and kills the process. Not elegant, but it works and you can be in eternal discourse heaven. :wink:
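For example, assuming the default container name app (yours may differ):

docker kill app       # force-stop the hung container
./launcher start app  # bring it back up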


(Thisgeekza) #25

Thanks. I must say, this issue is a pain in the backside when trying to rebuild containers.


(Luke) #26

Couldn’t agree more, hopefully it gets fixed sometime in the near future. Luckily there is a temp fix… kind of annoying having to reboot a node if docker hangs though.


(Sam Saffron) #27

Can you try adding this to your template and rebuilding?

run:
  - exec: cd /sbin && wget https://github.com/krallin/tini/releases/download/v0.8.4/tini-static
  - exec: chmod +x /sbin/tini-static
  - exec: mv /sbin/boot /sbin/boot_image
  - file:
     path: /sbin/boot
     chmod: "+x"
     contents: |
       #!/bin/sh
       exec /sbin/tini-static /sbin/boot_image

Does the issue go away? cc @Luke

Confirm tini is running as PID 1 after that change: ./launcher enter app, then ps aux; you should see tini-static.
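Something like this (assuming the container is named app):

./launcher enter app
ps aux | grep tini    # tini-static should be the process with PID 1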

Note: if this is accepted into tini, we can get rid of our boot shell script. Ability to execute a command on start and exit · Issue #28 · krallin/tini · GitHub


(Luke) #28

I’ll get working on it. Also, I was incorrect: this is occurring in Debian stable (jessie) as well, so I no longer think this is specific to Ubuntu.


(Paolo G. Giarrusso) #29

I’ve tried this, but it seems it didn’t work for me; cron was left hanging, so I rebooted. BTW, I’ve noticed that some processes lack a shutdown script in /etc/runit/3.d/, cron included.

But maybe we need -g to kill whole process groups, since we launch bash which doesn’t propagate signals, as in the example here?

So I’ve tried changing /sbin/boot's invocation of tini to:
exec /sbin/tini-static -g -- /sbin/boot_image. Now I’ve been able to restart the container on 2/2 attempts!
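So the wrapper ends up as something like this (a sketch, using the paths from the template above):

#!/bin/sh
# -g makes tini forward signals to the whole process group, not just its direct child
exec /sbin/tini-static -g -- /sbin/boot_image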


(Sam Saffron) #30

interesting…

the way the boot script works is that it forwards a HUP to runsvdir if it gets a TERM (which is what docker stop sends). If we move to using -g we are going to terminate the processes, potentially out of order … hmmm


(Sam Saffron) #31

this should be simple to fix; maybe all we need is a script to kill the cron daemon in /etc/runit/3.d/. Can you try that?

It’s very hard for me to test this stuff because I cannot repro the issue.

Note: when runsvdir gets a HUP it is meant to send a TERM to each of its children.


(Paolo G. Giarrusso) #32

Seems to work. I’ve added all missing scripts:

docker exec -it info1-discourse env TERM=xterm /bin/bash
cd /etc/runit/
echo -e '#!/bin/bash\nsv stop cron' > 3.d/99-cron
echo -e '#!/bin/bash\nsv stop rsyslog' > 3.d/99-rsyslog
chmod +x 3.d/*

then I’ve removed -g from /sbin/boot, restoring the previous content. I’ve done a few restarts; all seems fine.
But you’ll probably want to change priorities for the other processes: I guess you want cron and rsyslog to shut down after postgres and ssh, so the latter ones should move to priority 98.
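Something along these lines, hypothetically (check what is actually in /etc/runit/3.d/ first; the script names here are guesses):

ls /etc/runit/3.d/
# e.g., if postgres currently has a 99- prefix, move it ahead of the new 99-cron / 99-rsyslog scripts
mv /etc/runit/3.d/99-postgres /etc/runit/3.d/98-postgres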

As @Luke suspected, the kernel is involved, but you need the “wrong” kernel. With Ubuntu 14.04, that’s a kernel from 3.13.0-72 onwards.
I’ve argued why the kernel is involved here, armed with my kernel stack trace:

Furthermore, the AUFS issue is introduced by “mm: make sendfile(2) killable” (mm: make sendfile(2) killable · torvalds/linux@296291c · GitHub), which was backported by Ubuntu into 3.13.0-72, as also mentioned in Docker 1.9.1 hanging at build step “Setting up ca-certificates-java” · Issue #18180 · moby/moby · GitHub. I upgraded from -71 to -74, which explains what triggered this bug for me. @andybalholm reports upgrading to -73, which also fits.
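To check which kernel a host is running:

uname -r   # e.g. 3.13.0-74-generic; on Ubuntu 14.04, -72 and later carry the sendfile backport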

Besides, unkillable processes are always kernel-level bugs.

I think the init problems are just triggering the AUFS bug: if I understand the code correctly, a process that gets a SIGKILL at the right point while writing on AUFS is screwed, so we’re just replacing SIGKILL with SIGTERM. One source of SIGKILL might be docker stop timing out, but maybe there’s more? Right now docker stop takes 3.6 seconds; by default it waits 10s before sending SIGKILL.
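If docker stop’s timeout were the culprit, raising it should help; something like this (container name is just an example):

docker stop -t 600 app   # wait up to 600s before docker falls back to SIGKILL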

Kind-of, but worse. AUFS calls an underlying write “syscall” (from kernel space) in a loop handling EINTR; with newer kernels, if a SIGKILL is pending, each loop iteration will fail with EINTR, while SIGKILL will not be delivered between loop iterations, since the loop is in kernel space.

Looking at the bugfix might explain a bit more (or not). But beware: this part of the AUFS code looks far from pretty:


(Sam Saffron) #33

I see, and thank you for the diagnostics. The thing I find perplexing is that boot has this line:

trap "echo Shutting Down && /etc/runit/3 && kill -HUP $RUNSVDIR && wait $RUNSVDIR" SIGTERM SIGHUP

runsvdir indeed sends a TERM to every process it started (I tested this a lot), so it should not be necessary to add explicit stops for services where stop ordering does not matter (like cron / logrotate).

http://smarden.org/runit/runsvdir.8.html

If runsvdir receives a TERM signal, it exits with 0 immediately.
If runsvdir receives a HUP signal, it sends a TERM signal to each runsv(8) process it is monitoring and then exits with 111.

I tested this and it is indeed the case … which leaves 2 options here …

  • Somehow /etc/runit/3 is returning a non-zero status code (we’ve got to fix the trap so the rest runs no matter what happens), e.g.:
trap "echo Shutting Down && (/etc/runit/3 || echo 'failed to stop' && exit 0) && kill -HUP $RUNSVDIR && wait $RUNSVDIR" SIGTERM SIGHUP
  • Somehow /etc/runit/3 is taking longer than 10 seconds to run (a quick timing check is sketched below)
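For the second possibility, a rough check (run inside the container, and note that it actually stops the services) would be:

time /etc/runit/3   # if "real" exceeds ~10s, docker stop gives up and sends SIGKILL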

Any chance you can help debug this by changing that trap line?


(Fábio Machado De Oliveira) #34

When /sbin/boot runs without a shell, it takes PID 1 and, in my tests, becomes unstoppable.

This is my hypothesis, and my few tests seem to confirm it.

I changed

CMD ["/sbin/boot"]

to

CMD /sbin/boot

The extra shell process prevents /sbin/boot from being PID 1, and the problem didn’t reproduce anymore in my tests.
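A quick way to check which process ended up as PID 1 (container name is just an example):

docker exec app ps -p 1 -o pid,args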


(Sam Saffron) #35

Hmm, but we do not specify CMD anywhere; launcher simply runs it, and our Dockerfiles do not have a CMD.


(Fábio Machado De Oliveira) #36

I still have to test whether the official container is affected; this was in an experiment.


(Fábio Machado De Oliveira) #37

Changing the exec command to

sh -c /sbin/boot

appears to have reduced how often it happens for me, but it still happened when I started and stopped the container several times. It could have been a coincidence; I’m confused by this issue.


(Paolo G. Giarrusso) #38

TL;DR. Looks like that’s never worked properly – the script won’t actually exit unless you add an explicit exit, as in:

trap "echo Shutting Down && /etc/runit/3 && kill -HUP $RUNSVDIR && wait $RUNSVDIR; exit" SIGTERM SIGHUP

Maybe, for extra safety given your concerns about failures, also stop using &&?

trap "echo Shutting Down; /etc/runit/3; kill -HUP $RUNSVDIR; wait $RUNSVDIR; exit" SIGTERM SIGHUP

Using the first line was harmless (reboot still worked); but after rm /etc/runit/3.d/99-rsyslog, restarting the container made runsv unicorn hang at 100% CPU (with the usual symptoms), so I don’t recommend it. I’m playing around a bit to fix this, starting from the 2nd line.

Explanation

I indeed hadn’t bothered checking why SIGKILL was sent, and I can’t experiment with that in production (except at night, which I’d rather avoid). Moreover, there are tons of possible sources of SIGKILL; it’s usually rather harmless, after all, so I thought bugs could go undetected indefinitely. I didn’t expect the latent bug to lie in plain sight, and I didn’t notice it when reading the script.

I figured this is reinventing a wheel we probably shouldn’t be reinventing. So I looked at my GitLab image for inspiration, and was enlightened by reading the line

trap "sigterm_handler; exit" TERM

What’s that exit for? It turns out the exit is not implied (and if it were, how would you disable it?).

To wit, save and run this script, then try killing it with Ctrl-C or by sending the printed PID a SIGTERM or SIGINT. Neither will work (you’ll only see foo appear); you’ll need another signal to kill it (e.g. SIGKILL).

echo $$
trap 'echo foo' TERM INT
while :; do :; done
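Then, from another terminal, using the PID the script printed:

kill -TERM <pid>   # the trap fires and prints foo, but the loop keeps running
kill -KILL <pid>   # only this actually terminates it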

On Ubuntu since 2006, and on Discourse images, /bin/sh is dash, not bash. Signal handling might indeed differ somehow; the docs don’t describe the same behavior, but they are rather incomplete. However, dash is not designed to be a proper init process, so I don’t recommend pursuing that avenue.
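For reference, you can see which shell /bin/sh points to on a given image with:

readlink -f /bin/sh   # typically /bin/dash on Debian- and Ubuntu-based images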


(Paolo G. Giarrusso) #39

Switching to the 2nd line (without &&) and re-adding /etc/runit/3.d/99-rsyslog made Redis crash this time. I’m confused, but for now I’m giving up on debugging on a production machine (even though it’s night and my users are mostly asleep).


(Felix Freiberger) #40

This has started happening to me too, running Ubuntu (3.16.0-59-generic #79~14.04.1-Ubuntu SMP Mon Jan 18 15:41:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux). It seems to happen every time the container is stopped; docker kill works fine. The server had been running fine for a long time; in the meantime, I only installed updates to the system and Discourse.