Nginx fails to start on VM with IPv6 disabled

A Discourse server I’m trying to upgrade is stuck restarting, and because of the constant restarts I can’t get a shell on it. This is what I have from ./launcher logs app:

[Fri 29 Mar 12:51:47 UTC 2019] Run reload cmd: sv reload nginx
warning: nginx: unable to open supervise/ok: file does not exist
[Fri 29 Mar 12:51:47 UTC 2019] Reload error for :
nginx: [error] open() "/run/nginx.pid" failed (2: No such file or directory)
run-parts: /etc/runit/1.d/letsencrypt exited with return code 1

Does anyone have any suggestions regarding what I could try next?

I’d check disk space and rebuild the container.

The virtual server has one partition and it doesn’t seem too bad in terms of usage. Do you think more space might be the issue here?

df -h | egrep '(/$|^Filesystem)'
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda2        64G   44G   17G  72% /

I have tried rebuilding the container several times already…

Do you have plugins installed? What happens when you rebuild? (Space looks fine.)

The rebuild seems to go OK, but then the container keeps restarting, and I can’t get a shell on the container because of this. I want to see if /run exists, since the error logs above appear to be saying that this directory doesn’t exist…

After a rebuild with all the plugins disabled it looks OK to start with:

docker ps 
  CONTAINER ID        IMAGE                 COMMAND             CREATED             STATUS                  PORTS                                      NAMES
  5185f5987314        local_discourse/app   "/sbin/boot"        11 seconds ago      Up Less than a second   0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp   app

But then it gets stuck in a loop that looks like this:


docker ps 
  CONTAINER ID        IMAGE                 COMMAND             CREATED              STATUS                            PORTS               NAMES
  5185f5987314        local_discourse/app   "/sbin/boot"        About a minute ago   Restarting (100) 39 seconds ago                       app

I tried another rebuild with this added to the end of containers/app.yml:

## Any custom commands to run after building
run:
  - exec: echo "Beginning of custom commands"
  - exec: ls -lah /run
  - exec: df -h
  - exec: file /run/nginx.pid

The results of this were:

I, [2019-03-29T13:29:39.736562 #14]  INFO -- : total 36K
drwxr-xr-x 1 root     root     4.0K Mar 29 13:23 .
drwxr-xr-x 1 root     root     4.0K Mar 29 13:23 ..
lrwxrwxrwx 1 root     root       25 Feb 22 10:06 initctl -> /run/systemd/initctl/fifo
-rw-r--r-- 1 root     root        0 Mar 21 00:43 init.upgraded
drwxrwxrwt 3 root     root     4.0K Feb 22 10:06 lock
drwxr-xr-x 2 root     root     4.0K Feb 22 10:06 log
drwxr-xr-x 2 root     root     4.0K Feb 22 10:05 mount
lrwxrwxrwx 1 postgres postgres   20 Mar 29 13:23 postgresql -> /shared/postgres_run
drwxr-xr-x 2 root     root     4.0K Feb 22 10:06 sendsigs.omit.d
lrwxrwxrwx 1 root     root        8 Feb 22 10:06 shm -> /dev/shm
drwx--x--x 3 root     root     4.0K Mar 21 00:42 sudo
drwxr-xr-x 1 root     root     4.0K Mar 21 00:42 systemd
drwxr-xr-x 2 root     root     4.0K Feb 22 10:06 user
-rw-rw-r-- 1 root     utmp        0 Feb 22 10:05 utmp

I, [2019-03-29T13:29:39.737355 #14]  INFO -- : > df -h
I, [2019-03-29T13:29:39.742286 #14]  INFO -- : Filesystem      Size  Used Avail Use% Mounted on
overlay          64G   45G   16G  74% /
tmpfs            64M     0   64M   0% /dev
tmpfs           1.5G     0  1.5G   0% /sys/fs/cgroup
/dev/vda2        64G   45G   16G  74% /shared
shm             512M  8.0K  512M   1% /dev/shm
tmpfs           1.5G     0  1.5G   0% /proc/acpi
tmpfs           1.5G     0  1.5G   0% /sys/firmware

I, [2019-03-29T13:29:39.743254 #14]  INFO -- : > file /run/nginx.pid
I, [2019-03-29T13:29:39.755033 #14]  INFO -- : /run/nginx.pid: cannot open `/run/nginx.pid' (No such file or directory)

So the issue does appear to be that /run/nginx.pid doesn’t exist, and I expect that the cause of this is a problem with the Nginx config. So perhaps I’ll check that now…

Do you have anything in your yml that affects the nginx config?

This plugin was enabled (but isn’t for now):

https://github.com/muhlisbc/discourse-images-guardian

And I’m currently trying to work out if this commit broke the Nginx config:

https://github.com/muhlisbc/discourse-images-guardian/commit/8a79b86d0a0f344504a75a38eb417c820787b3bb

I can’t see anything in the app.yml file that could affect the Nginx config, and the YAML appears to be valid: I just tested it with yamllint containers/app.yml and the only issues were with whitespace and line length.

To try to shed some light on the problem:

## Any custom commands to run after building
run:
  - exec: echo "Beginning of custom commands"
  - exec: ls -lah /etc/nginx/
  - exec: /usr/sbin/nginx -t -c /etc/nginx/nginx.conf
  - exec: /usr/sbin/nginx -t -c /etc/nginx/letsencrypt.conf
  - exec: /usr/sbin/nginx -t -c /etc/nginx/fastcgi.conf

And this resulted in:

I, [2019-03-29T14:28:53.182369 #14]  INFO -- : > /usr/sbin/nginx -t -c /etc/nginx/nginx.conf
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful

Which looks good, but then:

I, [2019-03-29T14:28:53.209517 #14]  INFO -- : > /usr/sbin/nginx -t -c /etc/nginx/letsencrypt.conf
nginx: the configuration file /etc/nginx/letsencrypt.conf syntax is ok
nginx: [emerg] socket() [::]:80 failed (97: Address family not supported by protocol)
nginx: configuration file /etc/nginx/letsencrypt.conf test failed
I, [2019-03-29T14:28:53.218977 #14]  INFO -- : 

...


FAILED
--------------------
Pups::ExecError: /usr/sbin/nginx -t -c /etc/nginx/letsencrypt.conf failed with return #<Process::Status: pid 2355 exit 1>
Location of failure: /pups/lib/pups/exec_command.rb:112:in `spawn'
exec failed with the params "/usr/sbin/nginx -t -c /etc/nginx/letsencrypt.conf"
f42e20c75dda3fa8b80905d1ad3159a30ca16aea69bc2a7f392cf566f33e02da
** FAILED TO BOOTSTRAP ** please scroll up and look for earlier error messages, there may be more than one

So the issue is with the IPv6 address? The server does have the following in /etc/sysctl.conf, as the VM isn’t set up for IPv6:

# https://serverfault.com/a/660985
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
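
To confirm what the kernel actually reports for these settings (on the host, or inside the container via a run: exec line), something like the following could be used. This is only a sketch: ipv6_state is a made-up helper name, and treating a missing /proc/sys/net/ipv6 tree as "no IPv6 stack at all" is my assumption about how the kernel exposes this.

```shell
#!/bin/sh
# Report the IPv6 state for one sysctl scope ("all", "default", "lo", ...).
# ipv6_state is a hypothetical helper, not part of Discourse or Docker.
# The second argument is the sysctl root; it defaults to the real /proc tree
# and is only a parameter so the function can be tried on a scratch directory.
ipv6_state() {
  _scope=${1:-all}
  _root=${2:-/proc/sys/net/ipv6/conf}
  _file="$_root/$_scope/disable_ipv6"
  if [ ! -r "$_file" ]; then
    # Assumption: no readable sysctl entry means no IPv6 stack in this namespace.
    echo "unavailable"
    return 0
  fi
  if [ "$(cat "$_file")" = "1" ]; then
    echo "disabled"
  else
    echo "enabled"
  fi
}

# Check the same three scopes that are set in /etc/sysctl.conf above.
for scope in all default lo; do
  printf '%s: %s\n' "$scope" "$(ipv6_state "$scope")"
done
```

If the output differs between the host and the container, that would at least narrow down where the [::]:80 socket is being refused.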

So perhaps I need to disable IPv6 for Nginx and/or Docker?

Or perhaps sed can save the site?

## Any custom commands to run after building
run:
  - exec: echo "Beginning of custom commands"
  - exec: sed -e 's/listen \[::\]:80;/#listen [::]:80;/' -i /etc/nginx/letsencrypt.conf
  - exec: /usr/sbin/nginx -t -c /etc/nginx/letsencrypt.conf
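
Before putting that into a rebuild, the sed expression can be tried on a scratch file outside the container. A minimal sketch, assuming a made-up two-line server block as a stand-in for the real /etc/nginx/letsencrypt.conf (which will look different):

```shell
#!/bin/sh
# Hypothetical stand-in for /etc/nginx/letsencrypt.conf; the real file differs.
sample=$(mktemp)
cat > "$sample" <<'EOF'
server {
  listen 80;
  listen [::]:80;
}
EOF

# Same expression as in app.yml, but without -i, so the scratch file is left
# untouched and the edited text is only captured and printed.
result=$(sed -e 's/listen \[::\]:80;/#listen [::]:80;/' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

Only the IPv6 listen line should come back commented out; the plain listen 80; line does not match the pattern and is left alone.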

Don’t disable IPv6 support. From the SF answer you linked:

[image: excerpt from the linked Server Fault answer]

But this virtual server doesn’t have IPv6 configured, and it appears that as a result the IPv6 line in /etc/nginx/letsencrypt.conf:

    listen [::]:80;

is the thing that is stopping Nginx from starting. I’m not sure what solution you are suggesting; my solution appears to work:

I, [2019-03-29T14:57:26.021690 #14]  INFO -- : Beginning of custom commands

I, [2019-03-29T14:57:26.023075 #14]  INFO -- : > sed -e 's/listen \[::\]:80;/#listen [::]:80;/' -i /etc/nginx/letsencrypt.conf
I, [2019-03-29T14:57:26.030390 #14]  INFO -- : 
I, [2019-03-29T14:57:26.030529 #14]  INFO -- : > /usr/sbin/nginx -t -c /etc/nginx/letsencrypt.conf
nginx: the configuration file /etc/nginx/letsencrypt.conf syntax is ok
nginx: configuration file /etc/nginx/letsencrypt.conf test is successful
I, [2019-03-29T14:57:26.045517 #14]  INFO -- : 

There’s no reason to disable IPv6 in the VM; you’re just setting yourself up for trouble.

I agree that ideally the VM server and the DNS would be configured for IPv6, but this isn’t the case with this server, and I might not be the only person still running an IPv4-only server. So it appears fairly reasonable to me to disable IPv6 in the Nginx config in the Docker container. It is a simple one-line solution in containers/app.yml and it does the job for now:

## Any custom commands to run after building
run:
  - exec: echo "Beginning of custom commands"
  - exec: sed -e 's/listen \[::\]:80;/#listen [::]:80;/' -i /etc/nginx/letsencrypt.conf

As far as I’m concerned this “solution” is good enough for now…

FWIW, I just did an install on a DigitalOcean droplet that does not have ipv6 configured and there is no problem.

Here’s this from an install from a couple days ago:

root@forum:~# grep ipv6 /etc/sysctl.conf 
#net.ipv6.conf.all.forwarding=1
#net.ipv6.conf.all.accept_redirects = 0
#net.ipv6.conf.all.accept_source_route = 0

This stuff doesn’t hurt anything even though the droplet does not have ipv6 routed to it.

But IPv6 LLAs, localhost, and other addresses are still turned on – that’s the difference.

Note that the lines you quoted from /etc/sysctl.conf appear to all be commented out — they start with a #.
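
A plain grep ipv6 matches commented lines too, which makes this easy to miss. Anchoring the pattern to the start of the line shows only the settings that are actually active. A small sketch, where active_ipv6_settings is a made-up helper name and the sample file is invented from the lines quoted in this thread:

```shell
#!/bin/sh
# Print only the active (uncommented) net.ipv6 settings from a sysctl-style file.
# active_ipv6_settings is a hypothetical helper name.
active_ipv6_settings() {
  grep -E '^[[:space:]]*net\.ipv6\.' "$1"
}

# Invented sample mixing the two cases seen in this thread.
sample=$(mktemp)
cat > "$sample" <<'EOF'
#net.ipv6.conf.all.forwarding=1
#net.ipv6.conf.all.accept_redirects = 0
net.ipv6.conf.all.disable_ipv6 = 1
EOF

# Only the uncommented disable_ipv6 line should be printed.
active_ipv6_settings "$sample"

rm -f "$sample"
```

On a real server the argument would be /etc/sysctl.conf; grep exits non-zero and prints nothing when no active IPv6 settings are present.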

I do need to get my head around IPv6 and start using it, I have been putting it off until I have no choice and perhaps that day is almost here…

Oops! No. I failed to note that! I think that the take-home message is that if you don’t want to deal with ipv6 (I’ve not learned much about it myself) you can leave out anything to do with ipv6 in /etc/sysctl.conf and you’ll be fine. :slight_smile:
