Nginx.http.sock bind failed after reboot

countcb · June 2, 2015, 1:17pm

Hello all,

I have had a Discourse forum up and running for 2 months now.
I installed it via Docker following the official instructions. I have it running with nginx to proxy pass requests to Discourse.

Everything was working fine. Today my hosting company had to reboot my machine. And now I’m getting a 502 Bad Gateway error when trying to access my forum.

I found out that the problem seems to be the following:

 [emerg] 781#0: bind() to unix:/shared/nginx.http.sock failed (98: Address already in use)

So it seems the container can't bind to the socket?

nginx gives the following error, as was to be expected:

[error] 2298#0: *1 connect() to unix:/var/discourse/shared/standalone/nginx.http.sock failed (111: Connection refused) while connecting to upstream, client: xx.yy.zzz.aaa, server: forum.example.com, request: "GET / HTTP/1.1", upstream: "http ://unix:/var/discourse/shared/standalone/nginx.http.sock:/", host: "forum.example.com"

My app.yml looks like this:

##
## After making changes to this file, you MUST rebuild for any changes
## to take effect in your live Discourse instance:
##
## /var/discourse/launcher rebuild app
##
## Make sure to obey YAML syntax! You can use this site to help check:
## http ://www.yamllint.com/

## this is the all-in-one, standalone Discourse Docker container template

# You may add rate limiting by uncommenting the web.ratelimited template.
# Out of the box it allows 12 reqs a second per ip, and 100 per minute per ip
# This is configurable by amending the params in this file

templates:
  - "templates/postgres.template.yml"
  - "templates/redis.template.yml"
  - "templates/web.template.yml"
  - "templates/sshd.template.yml"
  - "templates/web.ratelimited.template.yml"
  - "templates/web.socketed.template.yml"

## which TCP/IP ports should this container expose?
expose:
  - "2222:22" # fwd host port 2222 to container port 22 (ssh)

params:
  db_default_text_search_config: "pg_catalog.english"

  ## Set db_shared_buffers to a max of 25% of the total memory.
  ##
  ## On 1GB installs set to 128MB (to leave room for other processes)
  ## on a 4GB instance you may raise to 1GB
  #db_shared_buffers: "256MB"
  #
  ## Set higher on large instances it defaults to 10MB, for a 3GB install 40MB is a good default
  ## this improves sorting performance, but adds memory usage per-connection
  #db_work_mem: "40MB"
  #
  ## Which Git revision should this container use? (default: tests-passed)
  #version: tests-passed

env:
  LANG: en_US.UTF-8
  # DISCOURSE_DEFAULT_LOCALE: en

  ## TODO: How many concurrent web requests are supported?
  ## With 2GB we recommend 3-4 workers, with 1GB only 2
  #UNICORN_WORKERS: 3

  ## TODO: List of comma delimited emails that will be made admin and developer
  ## on initial signup example 'user1@example.com,user2@example.com'
  DISCOURSE_DEVELOPER_EMAILS: 'mail@example.com'

  ## TODO: The domain name this Discourse instance will respond to
  DISCOURSE_HOSTNAME: 'forum.example.com'

  ## TODO: The mailserver this Discourse instance will use
  DISCOURSE_SMTP_ADDRESS: ***         # (mandatory)
  DISCOURSE_SMTP_PORT: 587                        # (optional)
  DISCOURSE_SMTP_USER_NAME: ***      # (optional)
  DISCOURSE_SMTP_PASSWORD: ***              # (optional)
  #DISCOURSE_SMTP_ENABLE_START_TLS: true           # (optional, default true)

  ## The CDN address for this Discourse instance (configured to pull)
  #DISCOURSE_CDN_URL: //discourse-cdn.example.com

## These containers are stateless, all data is stored in /shared
volumes:
  - volume:
      host: /var/discourse/shared/standalone
      guest: /shared
  - volume:
      host: /var/discourse/shared/standalone/log/var-log
      guest: /var/log

## The docker manager plugin allows you to one-click upgrade Discourse
## http ://discourse.example.com/admin/docker
hooks:
  after_code:
    - exec:
        cd: $home/plugins
        cmd:
          - mkdir -p plugins
          - git clone https ://github.com/discourse/docker_manager.git
          - git clone https ://github.com/discourse/discourse-tagging.git
          - git clone https ://github.com/discourse/discourse-spoiler-alert.git

## Remember, this is YAML syntax - you can only have one block with a name
run:
  - exec: echo "Beginning of custom commands"

  ## If you want to set the 'From' email address for your first registration, uncomment and change:
  #- exec: rails r "SiteSetting.notification_email='info@unconfigured.discourse.org'"
  ## After getting the first signup email, re-comment the line. It only needs to run once.

  ## If you want to configure password login for root, uncomment and change:
  ## Use only one of the following lines:
  #- exec: /usr/sbin/usermod -p 'PASSWORD_HASH' root
  #- exec: /usr/sbin/usermod -p "$(mkpasswd -m sha-256 'RAW_PASSWORD')" root

  ## If you want to authorize additional users, uncomment and change:
  #- exec: ssh-import-id username
  #- exec: ssh-import-id anotherusername

  - exec: echo "End of custom commands"
  - exec: awk -F\# '{print $1;}' ~/.ssh/authorized_keys | awk 'BEGIN { print "Authorized SSH keys for this container:"; } NF>=2 {print $NF;}'

My nginx config looks like this:

server {
    listen 80;
    # change this
    server_name forum.example.com;

    location / {
        proxy_pass http ://unix:/var/discourse/shared/standalone/nginx.http.sock:;
        proxy_set_header Host $http_host;
        proxy_http_version 1.1;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

        access_log /var/log/nginx/discourse-access.log;
        error_log /var/log/nginx/discourse-error.log;
    }
}

I tried the following things:

restart the app
rebuild the app
reboot the host
stop app, stop nginx, start app, start nginx

All still result in the above error message.
When I attach to the app via docker attach, the above error message gets printed repeatedly.

As stated above, this has worked for the past 2 months.

I have no clue why the socket cannot be bound anymore… I mean, the old socket cannot still be up and blocking, because I rebooted the host machine…
Any help would be appreciated.

Cheers, Christopher

PS: I had to put spaces behind all http and https occurrences because otherwise it would not let me create the topic.

countcb · June 2, 2015, 2:28pm

Ok, I had to delete

/var/discourse/shared/standalone/nginx.http.sock

It seems the socket did not get deleted when the hosting company restarted my machine.
Maybe they did not allow time for everything to shut down gracefully.

Sometimes it helps talking/writing about something.
I had been trying to track this down for several hours already… And it turned out to be so simple in the end…

Glad it works again!
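
The fix described above can be sketched as a small shell helper. This is a minimal sketch, not part of Discourse: the function name remove_stale_socket is hypothetical, and the default path is simply the host path quoted in this topic.

```shell
#!/bin/sh
# Hypothetical helper: remove a leftover socket file so nginx can bind again.
# The default path is the host path from this topic; pass another path to override it.
remove_stale_socket() {
    sock="${1:-/var/discourse/shared/standalone/nginx.http.sock}"
    # -e is true for any existing path; -L also catches a dangling symlink.
    if [ -e "$sock" ] || [ -L "$sock" ]; then
        rm -f -- "$sock" && echo "removed $sock"
    else
        echo "nothing to remove at $sock"
    fi
}
```

On the host this would typically be run with the container stopped: /var/discourse/launcher stop app, then remove_stale_socket, then /var/discourse/launcher start app.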

riking · June 2, 2015, 11:19pm

We should have the rebuild delete that file.

KazWolfe · June 3, 2015, 6:44am

Or have stop delete that file.

Or have start delete the file if it already exists.

countcb · June 3, 2015, 10:35am

I like both approaches. Rebuild should in any case delete that file.

But in my opinion it would do no harm to also delete the file when the app starts (only if no instance is already running).
The file is inside /var/discourse, so it is very unlikely that anything other than Discourse is using it (and in any case that would be the result of an error somewhere else).
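
The "delete on start, but only if no instance is running" idea could look roughly like the sketch below. It is only an illustration under assumptions: the helper names are hypothetical, the container is assumed to be named app, and this is not how the launcher actually implements its cleanup.

```shell
#!/bin/sh
# Hypothetical sketch: clean up the socket at start, but only when the
# container is not already running. The container name "app" is an assumption.
container_running() {
    # docker ps -q prints the IDs of running containers that match the filter;
    # the name filter matches substrings, so adjust it for your own setup.
    [ -n "$(docker ps -q --filter "name=app" 2>/dev/null)" ]
}

cleanup_before_start() {
    sock="$1"
    if container_running; then
        # A live instance may own the socket, so do not touch it.
        echo "container is running; leaving $sock alone"
        return 0
    fi
    rm -f -- "$sock"
    echo "removed $sock (if it existed)"
}
```

The running check is kept in its own function so it can be swapped for whatever fits a given setup, for example a check on a differently named container.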

rriemann · June 9, 2015, 1:36pm

I have the same issue. At first I thought that my own nginx had to be started a while after the Docker containers (yet to confirm), but then I found out that the issue is related to the sock file.

From what I found on Stack Overflow, this seems to be an nginx issue: nginx is supposed to delete the sock file on termination. The Discourse container has nginx v1.7.x. Maybe an update to v1.8.x plus changing the init.d script would do the trick?

Have a look:

mr8 · August 26, 2015, 2:21am

I had this issue today, and this saved us!
Thanks so much!

_vincent · August 27, 2018, 10:43am

Bumping this.

When the docker service is stopped, the socket file isn’t removed – it turns into a directory.
I had to rmdir /var/discourse/shared/standalone/nginx.http.sock to be able to start my container.

This happens after a reboot or an upgrade of docker-ce.
Note that in my setup the socket is bound to another container (front nginx).
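
To tell the two failure modes in this topic apart (a stale socket versus a directory sitting at the socket path), a quick check like the one below may help. One possible explanation for the directory, not confirmed in this topic: when Docker bind-mounts a host path with -v and that path does not exist yet, it creates it as a directory. The function name describe_path is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report what currently sits at the socket path.
describe_path() {
    p="$1"
    if [ -S "$p" ]; then
        echo "socket"
    elif [ -d "$p" ]; then
        echo "directory"
    elif [ -e "$p" ]; then
        echo "other"
    else
        echo "missing"
    fi
}
```

For example, describe_path /var/discourse/shared/standalone/nginx.http.sock on the host shows whether rm or rmdir is the right tool.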

sam · August 28, 2018, 12:42am

You know, there is a reason we have this line in our template:

https://github.com/discourse/discourse_docker/blob/master/templates/web.socketed.template.yml#L8-L13

_vincent · August 28, 2018, 10:38am

I get that but is it enough?

I don’t think everyone rebuilds every container after a reboot. Well at least not the people in this topic

sam · August 28, 2018, 11:25am

That is enough, because cleanup runs every time you stop or start the container.

_vincent · August 28, 2018, 12:11pm

Then there is a problem, because as I said:

When I start my Discourse container I get errors in the logs because it won’t bind to the socket. At that point Discourse won’t delete it nor recreate it.

I’ll investigate when this happens again and provide full logs. I’m well aware that the problem may be related to my own Docker configuration

_vincent · September 22, 2018, 12:55pm

It happened again today on my setup

Could you maybe consider adding a failsafe for the above scenario?

rmdir /shared/nginx.http*.sock

sam · September 28, 2018, 12:36am

Still not following why the existing failsafe that runs on boot is not fixing it up for you. Is there anything custom about your template?

_vincent · September 28, 2018, 8:34am

Because rm (without -r) won’t remove a directory.

sam · September 28, 2018, 8:50am

I still don’t understand; the sock files are not directories…

_vincent · September 28, 2018, 8:52am

I’m not sure either. As I said earlier, I suppose nginx.http.sock turns into a directory after a reboot because it is mounted in another Docker container.

sam · September 28, 2018, 8:53am

This is very strange. I guess do a PR that rms those files regardless of whether they are files or directories. I am fine with amending the boot and shutdown scripts with that.
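
A cleanup that works "regardless of them being files or directories" could be sketched as follows. This is an illustration only, not the change that was actually submitted: the function name remove_socket_paths is hypothetical, and /shared is the in-container path used earlier in this topic.

```shell
#!/bin/sh
# Hypothetical helper: clear every nginx.http*.sock entry under a shared
# directory, whether it is a socket, a regular file, or a directory.
remove_socket_paths() {
    shared="${1:-/shared}"
    # -r lets rm remove directories; -f ignores paths that do not exist.
    rm -rf -- "$shared"/nginx.http*.sock
}
```

Plain rm refuses to remove a directory and rmdir refuses to remove a file, so rm -rf is the one form that covers both cases. The glob is narrow enough that other files under the shared directory are left alone.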

_vincent · September 28, 2018, 9:01am

Done!

Thanks for considering it

Lure · November 11, 2023, 9:09pm

I am also experiencing this issue: I have Discourse in one Docker container and Caddy with a proxy in another. It may be related to the order in which app and Caddy are started, but sometimes I end up with a directory instead of a socket.
