Restarting server causes docker containers to be unresponsive

docker

(Ghost53574) #1

Needless to say but I am super pissed off at Digital Ocean, not you guys.

*Forward
I was testing a firewall script that is very comprehensive, but it was dropping my ssh connection, so I would use Digital Ocean’s web VNC connection to continue testing etc.

*Come a redirect command
I issued the command /usr/local/bin/iptables-cron.bash > ./log since I do not use a GUI on my webserver, honestly who does, and less was not correctly outputting stdout apparently. This caused the VNC session to loop indefinitely, and when I asked their support via a support ticket if they could do something other than restarting the server he told me to change tty sessions which didn’t work because it was stuck looping.

*Regretted server restart
Since Digital Ocean support where incompetent I hastily restarted, hard reset, my server in an attempt to get ssh access again to continue my testing. I knew that doing any restart while services or programs were running would risk data corruption but I did it anyway seeing that I didn’t have any other option.

*Docker container unresponsive
Thirty minutes after the server restart my website is unreachable and the container is unresponsive. Using docker stop <ID> and /var/discourse/launcher stop app both hang and do not return or execute.

Info
root@dng-dev:/var/discourse# docker info
Containers: 4
Images: 35
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Backing Filesystem: extfs
Dirs: 43
Dirperm1 Supported: true
Execution Driver: native-0.2
Kernel Version: 3.16.0-4-amd64
Operating System: Debian GNU/Linux 8 (jessie)
CPUs: 2
Total Memory: 1.963 GiB
Name: dng-dev
ID: YW7J:D7WJ:ROML:ASQI:2UIV:WWEM:DU36:J63S:CJAP:RCMF:33AK:RAD2
WARNING: No memory limit support
WARNING: No swap limit support

Version
root@dng-dev:/var/discourse# docker --version
Docker version 1.6.2, build 7c8fca2

Process info

root@dng-dev:/var/discourse# docker ps
dd40d0e86938        local_discourse/app:latest   "/sbin/boot"        5 days ago          Up 2 hours           0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp, 0.0.0.0:2222->22/tcp   app

Images

root@dng-dev:/var/discourse# docker images
REPOSITORY             TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
local_discourse/app    latest              86248bd2f86d        5 days ago          1.91 GB
discourse/discourse    1.0.15              0706480e0511        5 weeks ago         1.265 GB
<none>                 <none>              8e76a4ddc349        8 weeks ago         1.943 GB
<none>                 <none>              fda2de8839d1        8 weeks ago         1.943 GB
<none>                 <none>              61c12c52a656        8 weeks ago         1.943 GB
<none>                 <none>              70952b600fff        8 weeks ago         1.943 GB
<none>                 <none>              03824520fcb0        8 weeks ago         1.943 GB
<none>                 <none>              234942006bd7        9 weeks ago         1.943 GB
<none>                 <none>              4c6bd5e91f1b        9 weeks ago         1.941 GB
<none>                 <none>              6ad017614911        9 weeks ago         1.941 GB
<none>                 <none>              e266990f9711        9 weeks ago         1.941 GB
<none>                 <none>              168d00eb4d44        9 weeks ago         1.941 GB
<none>                 <none>              db74005b2f8a        10 weeks ago        1.931 GB
samsaffron/discourse   1.0.13              27f52292c186        3 months ago        1.238 GB
<none>                 <none>              eaf68d116145        4 months ago        1.806 GB
hello-world            latest              af340544ed62        4 months ago        960 B
debian                 latest              9a61b6b1315e        5 months ago        125.2 MB
samsaffron/discourse   1.0.12              2285ff53ab18        7 months ago        1.217 GB

History

root@dng-dev:/var/discourse# docker history 86248bd2f86d
IMAGE               CREATED             CREATED BY                                      SIZE
86248bd2f86d        5 days ago          /bin/bash -c cd /pups && git pull && /pups/bi   645.1 MB
0706480e0511        5 weeks ago         /bin/sh -c #(nop) MAINTAINER Sam Saffron "htt   0 B
3f7b4a652638        5 weeks ago         /bin/sh -c #(nop) MAINTAINER Sam Saffron "htt   0 B
709ef95b65df        5 weeks ago         /bin/sh -c #(nop) CMD ["/bin/bash"]             0 B
93bfdb4fbb87        5 weeks ago         /bin/sh -c #(squash) from 5fe96dd0c297          415.7 MB
5fe96dd0c297        5 weeks ago         /bin/sh -c #(squash) from 0a85502c06c9          661.8 MB
2332d8973c93        6 weeks ago         /bin/sh -c #(nop) ADD file:531ac3e55db4293b8f   187.7 MB

Logs

root@dng-dev:/var/discourse# docker logs dd40d0e86938
...
39:M 26 Dec 19:43:14.016 * Background saving started by pid 23037
23037:C 26 Dec 19:43:14.056 * DB saved on disk
23037:C 26 Dec 19:43:14.057 * RDB: 1 MB of memory used by copy-on-write
39:M 26 Dec 19:43:14.116 * Background saving terminated with success
39:M 26 Dec 19:48:15.037 * 10 changes in 300 seconds. Saving...
39:M 26 Dec 19:48:15.038 * Background saving started by pid 23374
23374:C 26 Dec 19:48:15.084 * DB saved on disk
23374:C 26 Dec 19:48:15.085 * RDB: 1 MB of memory used by copy-on-write
39:M 26 Dec 19:48:15.138 * Background saving terminated with success
39:M 26 Dec 19:53:16.043 * 10 changes in 300 seconds. Saving...
39:M 26 Dec 19:53:16.045 * Background saving started by pid 23713
23713:C 26 Dec 19:53:16.085 * DB saved on disk
23713:C 26 Dec 19:53:16.087 * RDB: 1 MB of memory used by copy-on-write
39:M 26 Dec 19:53:16.145 * Background saving terminated with success
run-parts: executing /etc/runit/1.d/00-ensure-links
run-parts: executing /etc/runit/1.d/anacron
run-parts: executing /etc/runit/1.d/cleanup-pids
Cleaning stale PID files
run-parts: executing /etc/runit/1.d/copy-env
run-parts: executing /etc/runit/1.d/ensure-web-nginx-read
Started runsvdir, PID is 22
sh: echo: I/O error
fail: redis: runsv not running
rsyslogd: invalid or yet-unknown config file command 'KLogPermitNonKernelFacility' - have you forgotten to load a module? [try http://www.rsyslog.com/e/3003 ]
rsyslogd: warning: ~ action is deprecated, consider using the 'stop' statement instead [try http://www.rsyslog.com/e/2307 ]
rsyslogd: Could no open output pipe '/dev/xconsole': No such file or directory [try http://www.rsyslog.com/e/2039 ]
37:M 26 Dec 19:54:33.591 * Redis 3.0.5 (00000000/0) 64 bit, standalone mode, port 6379, pid 37 ready to start.
37:M 26 Dec 19:54:33.592 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
37:M 26 Dec 19:54:33.592 # Server started, Redis version 3.0.5
37:M 26 Dec 19:54:33.592 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
37:M 26 Dec 19:54:33.592 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
Server listening on 0.0.0.0 port 22.
Server listening on :: port 22.
37:M 26 Dec 19:54:33.749 * DB loaded from disk: 0.157 seconds
37:M 26 Dec 19:54:33.750 * The server is now ready to accept connections on port 6379
2015-12-26 19:54:33 UTC [49-1] LOG:  database system was interrupted; last known up at 2015-12-26 19:53:42 UTC
2015-12-26 19:54:33 UTC [49-2] LOG:  database system was not properly shut down; automatic recovery in progress
2015-12-26 19:54:33 UTC [49-3] LOG:  redo starts at 0/30271210
2015-12-26 19:54:33 UTC [49-4] LOG:  record with zero length at 0/302814C0
2015-12-26 19:54:33 UTC [49-5] LOG:  redo done at 0/30281490
2015-12-26 19:54:33 UTC [49-6] LOG:  last completed transaction was at log time 2015-12-26 19:53:59.344201+00
2015-12-26 19:54:33 UTC [49-7] LOG:  MultiXact member wraparound protections are now enabled
2015-12-26 19:54:33 UTC [25-1] LOG:  database system is ready to accept connections
2015-12-26 19:54:33 UTC [53-1] LOG:  autovacuum launcher started
ok: run: redis: (pid 37) 1s
ok: run: postgres: (pid 25) 1s
supervisor pid: 55 unicorn pid: 60
37:M 26 Dec 19:59:34.032 * 10 changes in 300 seconds. Saving...
37:M 26 Dec 19:59:34.033 * Background saving started by pid 455
455:C 26 Dec 19:59:34.069 * DB saved on disk
455:C 26 Dec 19:59:34.069 * RDB: 1 MB of memory used by copy-on-write
37:M 26 Dec 19:59:34.133 * Background saving terminated with success
37:M 26 Dec 20:04:35.094 * 10 changes in 300 seconds. Saving...
37:M 26 Dec 20:04:35.094 * Background saving started by pid 784
784:C 26 Dec 20:04:35.130 * DB saved on disk
784:C 26 Dec 20:04:35.131 * RDB: 1 MB of memory used by copy-on-write
37:M 26 Dec 20:04:35.194 * Background saving terminated with success
37:M 26 Dec 20:09:36.067 * 10 changes in 300 seconds. Saving...
37:M 26 Dec 20:09:36.068 * Background saving started by pid 1111
1111:C 26 Dec 20:09:36.104 * DB saved on disk
1111:C 26 Dec 20:09:36.105 * RDB: 1 MB of memory used by copy-on-write
37:M 26 Dec 20:09:36.168 * Background saving terminated with success
37:M 26 Dec 20:14:37.093 * 10 changes in 300 seconds. Saving...
37:M 26 Dec 20:14:37.094 * Background saving started by pid 1440
1440:C 26 Dec 20:14:37.133 * DB saved on disk
1440:C 26 Dec 20:14:37.134 * RDB: 1 MB of memory used by copy-on-write
37:M 26 Dec 20:14:37.194 * Background saving terminated with success
37:M 26 Dec 20:19:38.058 * 10 changes in 300 seconds. Saving...
37:M 26 Dec 20:19:38.059 * Background saving started by pid 1777
1777:C 26 Dec 20:19:38.098 * DB saved on disk
1777:C 26 Dec 20:19:38.099 * RDB: 1 MB of memory used by copy-on-write
37:M 26 Dec 20:19:38.159 * Background saving terminated with success
37:M 26 Dec 20:24:39.068 * 10 changes in 300 seconds. Saving...
37:M 26 Dec 20:24:39.068 * Background saving started by pid 2101
2101:C 26 Dec 20:24:39.103 * DB saved on disk
2101:C 26 Dec 20:24:39.104 * RDB: 1 MB of memory used by copy-on-write
37:M 26 Dec 20:24:39.168 * Background saving terminated with success
37:M 26 Dec 20:29:40.026 * 10 changes in 300 seconds. Saving...
37:M 26 Dec 20:29:40.026 * Background saving started by pid 2433
2433:C 26 Dec 20:29:40.066 * DB saved on disk
2433:C 26 Dec 20:29:40.066 * RDB: 1 MB of memory used by copy-on-write
37:M 26 Dec 20:29:40.127 * Background saving terminated with success
37:M 26 Dec 20:34:41.056 * 10 changes in 300 seconds. Saving...
37:M 26 Dec 20:34:41.057 * Background saving started by pid 2762
2762:C 26 Dec 20:34:41.094 * DB saved on disk
2762:C 26 Dec 20:34:41.095 * RDB: 1 MB of memory used by copy-on-write
37:M 26 Dec 20:34:41.157 * Background saving terminated with success
37:M 26 Dec 20:39:42.051 * 10 changes in 300 seconds. Saving...
37:M 26 Dec 20:39:42.051 * Background saving started by pid 3094
3094:C 26 Dec 20:39:42.090 * DB saved on disk
3094:C 26 Dec 20:39:42.091 * RDB: 1 MB of memory used by copy-on-write
37:M 26 Dec 20:39:42.152 * Background saving terminated with success
37:M 26 Dec 20:44:43.070 * 10 changes in 300 seconds. Saving...
37:M 26 Dec 20:44:43.070 * Background saving started by pid 3419
3419:C 26 Dec 20:44:43.107 * DB saved on disk
3419:C 26 Dec 20:44:43.107 * RDB: 1 MB of memory used by copy-on-write
37:M 26 Dec 20:44:43.171 * Background saving terminated with success
37:M 26 Dec 20:49:44.099 * 10 changes in 300 seconds. Saving...
37:M 26 Dec 20:49:44.100 * Background saving started by pid 3749
3749:C 26 Dec 20:49:44.135 * DB saved on disk
3749:C 26 Dec 20:49:44.136 * RDB: 1 MB of memory used by copy-on-write
37:M 26 Dec 20:49:44.200 * Background saving terminated with success
37:M 26 Dec 20:54:45.040 * 10 changes in 300 seconds. Saving...
37:M 26 Dec 20:54:45.041 * Background saving started by pid 4075
4075:C 26 Dec 20:54:45.082 * DB saved on disk
4075:C 26 Dec 20:54:45.083 * RDB: 1 MB of memory used by copy-on-write
37:M 26 Dec 20:54:45.141 * Background saving terminated with success
37:M 26 Dec 20:59:46.045 * 10 changes in 300 seconds. Saving...
37:M 26 Dec 20:59:46.045 * Background saving started by pid 4403
4403:C 26 Dec 20:59:46.083 * DB saved on disk
4403:C 26 Dec 20:59:46.084 * RDB: 1 MB of memory used by copy-on-write
37:M 26 Dec 20:59:46.145 * Background saving terminated with success
37:M 26 Dec 21:04:47.060 * 10 changes in 300 seconds. Saving...
37:M 26 Dec 21:04:47.061 * Background saving started by pid 4730
4730:C 26 Dec 21:04:47.097 * DB saved on disk
4730:C 26 Dec 21:04:47.098 * RDB: 1 MB of memory used by copy-on-write
37:M 26 Dec 21:04:47.161 * Background saving terminated with success
37:M 26 Dec 21:09:48.080 * 10 changes in 300 seconds. Saving...
37:M 26 Dec 21:09:48.080 * Background saving started by pid 5058
5058:C 26 Dec 21:09:48.116 * DB saved on disk
5058:C 26 Dec 21:09:48.117 * RDB: 1 MB of memory used by copy-on-write
37:M 26 Dec 21:09:48.181 * Background saving terminated with success
37:M 26 Dec 21:14:49.095 * 10 changes in 300 seconds. Saving...
37:M 26 Dec 21:14:49.095 * Background saving started by pid 5382
5382:C 26 Dec 21:14:49.132 * DB saved on disk
5382:C 26 Dec 21:14:49.132 * RDB: 1 MB of memory used by copy-on-write
37:M 26 Dec 21:14:49.195 * Background saving terminated with success
37:M 26 Dec 21:19:50.020 * 10 changes in 300 seconds. Saving...
37:M 26 Dec 21:19:50.021 * Background saving started by pid 5712
5712:C 26 Dec 21:19:50.072 * DB saved on disk
5712:C 26 Dec 21:19:50.073 * RDB: 1 MB of memory used by copy-on-write
37:M 26 Dec 21:19:50.121 * Background saving terminated with success
37:M 26 Dec 21:24:51.052 * 10 changes in 300 seconds. Saving...
37:M 26 Dec 21:24:51.053 * Background saving started by pid 6035
6035:C 26 Dec 21:24:51.099 * DB saved on disk
6035:C 26 Dec 21:24:51.100 * RDB: 1 MB of memory used by copy-on-write
37:M 26 Dec 21:24:51.153 * Background saving terminated with success
37:M 26 Dec 21:29:52.013 * 10 changes in 300 seconds. Saving...
37:M 26 Dec 21:29:52.014 * Background saving started by pid 6363
6363:C 26 Dec 21:29:52.052 * DB saved on disk
6363:C 26 Dec 21:29:52.052 * RDB: 1 MB of memory used by copy-on-write
37:M 26 Dec 21:29:52.114 * Background saving terminated with success
37:M 26 Dec 21:34:53.091 * 10 changes in 300 seconds. Saving...
37:M 26 Dec 21:34:53.092 * Background saving started by pid 6689
6689:C 26 Dec 21:34:53.137 * DB saved on disk
6689:C 26 Dec 21:34:53.138 * RDB: 1 MB of memory used by copy-on-write
37:M 26 Dec 21:34:53.192 * Background saving terminated with success
Shutting Down
run-parts: executing /etc/runit/3.d/01-nginx
ok: down: nginx: 1s, normally up
run-parts: executing /etc/runit/3.d/02-unicorn
exiting
ok: down: unicorn: 0s, normally up
run-parts: executing /etc/runit/3.d/10-redis
37:signal-handler (1451165695) Received SIGTERM scheduling shutdown...
37:M 26 Dec 21:34:55.397 # User requested shutdown...
37:M 26 Dec 21:34:55.397 * Saving the final RDB snapshot before exiting.
37:M 26 Dec 21:34:55.438 * DB saved on disk
37:M 26 Dec 21:34:55.439 # Redis is now ready to exit, bye bye...
ok: down: redis: 0s, normally up
run-parts: executing /etc/runit/3.d/99-postgres
2015-12-26 21:34:55 UTC [25-2] LOG:  received smart shutdown request
2015-12-26 21:34:55 UTC [53-2] LOG:  autovacuum launcher shutting down
2015-12-26 21:34:55 UTC [50-1] LOG:  shutting down
2015-12-26 21:34:55 UTC [50-2] LOG:  database system is shut down
ok: down: postgres: 1s, normally up
run-parts: executing /etc/runit/3.d/99-ssh
Received signal 15; terminating.
ok: down: sshd: 0s, normally up

Oh and I get this sometimes when I C^ time="2015-12-26T17:31:11-05:00" level=fatal msg="write /dev/stdout: broken pipe". Also the website is dirtnapgaming.com

Thanks and I’m going to work will read and update when I get back D:


(Ghost53574) #2

Solved the problem by restoring my droplet and restoring a recent version of my discourse install (I back up every three days and have the server delete old backups longer than 7 days).

I think a simple solution would to just make a script while shutdown that checks for active containers and waits until they turn off. I’ll probably just make it later on. Will update when completed.


(Tom Hsiung) #3

Same issue, Discourse site had gone after server rebooted.