502 错误网关 - nginx/1.14.0 (Ubuntu):无法在本地找到镜像 + 守护进程响应错误

最近我将我的戴尔服务器从一个位置迁移到了另一个位置。其主操作系统/虚拟机管理器是 Proxmox VE 5.3。我有一台运行 Nginx 的虚拟机,作为反向代理为其他几台虚拟机提供服务,其中一台是 Discourse 虚拟机。

在配置新路由器后,我成功让多台虚拟机连接到了互联网(甚至无需更新 SSL 证书)。然而,在尝试通过浏览器访问 Discourse 时,我收到了 502 Bad Gateway - nginx/1.14.0 (Ubuntu) 错误。

我之前遇到过这个错误,通常以下方法之一可以解决:

重启 Discourse 虚拟机

浏览器中仍然显示 502 Bad Gateway - nginx/1.14.0 (Ubuntu)

清除浏览器 Cookie 并尝试不同浏览器

以防是本地问题……并非如此。在各种浏览器中仍然出现相同的 502 Bad Gateway - nginx/1.14.0 (Ubuntu) 错误。

Discourse 清理并检查磁盘空间

初次清理时,它删除了约 4 GB 的数据。这让我非常惊讶。也许这才是导致问题的根源?无论如何,现在当我尝试清理 Discourse 时:

> root@forum:/var/discourse# ./launcher cleanup
> WARNING! This will remove all stopped containers.
> Are you sure you want to continue? [y/N] y
> Total reclaimed space: 0B
> WARNING! This will remove all images without at least one container associated to them.
> Are you sure you want to continue? [y/N] y
> Total reclaimed space: 0B

我还要重申,我的磁盘空间并未耗尽:

Git Pull

已是最新版本。

> root@forum:/var/discourse# git pull
> Already up to date.

重启 Discourse

在这里,我发现了两个错误:

> root@forum:/var/discourse# ./launcher restart app
> 
> WARNING: We are about to start downloading the Discourse base image
> This process may take anywhere between a few minutes to an hour, depending on your network speed
> 
> Please be patient
> 
> Unable to find image 'discourse/base:2.0.20191013-2320' locally
> /usr/bin/docker: Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers).
> See '/usr/bin/docker run --help'.
> Your Docker installation is not working correctly
> 
> See: `https://meta.discourse.org/t/docker-error-on-bootstrap/13657/18?u=sam`

我查看了建议的链接。它推荐执行 Git Pull重建 Discourse(稍后你会看到我的结果)。它还建议运行 Docker “Hello World” 命令。该命令可以运行(但似乎存在一些问题?):

> root@forum:/var/discourse# docker run -it --rm hello-world
> Unable to find image 'hello-world:latest' locally
> latest: Pulling from library/hello-world
> 1b930d010525: Pull complete
> Digest: sha256:c3b4ada4687bbaa170745b3e4dd8ac3f194ca95b2d0518b417fb47e5879d9b5f
> Status: Downloaded newer image for hello-world:latest
> 
> Hello from Docker!
> This message shows that your installation appears to be working correctly.
> 
> To generate this message, Docker took the following steps:
>  1. The Docker client contacted the Docker daemon.
>  2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
>     (amd64)
>  3. The Docker daemon created a new container from that image which runs the
>     executable that produces the output you are currently reading.
>  4. The Docker daemon streamed that output to the Docker client, which sent it
>     to your terminal.
> 
> To try something more ambitious, you can run an Ubuntu container with:
>  $ docker run -it ubuntu bash
> 
> Share images, automate workflows, and more with a free Docker ID:
>  https://hub.docker.com/
> 
> For more examples and ideas, visit:
>  https://docs.docker.com/get-started/
> 
> failed to resize tty, using default size

接下来,我尝试了 Ubuntu Bash……结果报错:

> root@forum:/var/discourse# docker run -it ubuntu bash
> Unable to find image 'ubuntu:latest' locally
> docker: Error response from daemon: Get https://registry-1.docker.io/v2/library/ubuntu/manifests/latest: Get https://auth.docker.io/token?scope=repository%3Alibrary%2Fubuntu%3Apull&service=registry.docker.io: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers).
> See 'docker run --help'.

现在尝试其他已知方法:

停止并启动 Discourse

收到的错误与 重启 时相同:

> root@forum:/var/discourse# ./launcher stop app
> 
> WARNING: We are about to start downloading the Discourse base image
> This process may take anywhere between a few minutes to an hour, depending on your network speed
> 
> Please be patient
> 
> Unable to find image 'discourse/base:2.0.20191013-2320' locally
> /usr/bin/docker: Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers).
> See '/usr/bin/docker run --help'.
> Your Docker installation is not working correctly
> 
> See: `https://meta.discourse.org/t/docker-error-on-bootstrap/13657/18?u=sam`
> 
> root@forum:/var/discourse# ./launcher start app
> 
> WARNING: We are about to start downloading the Discourse base image
> This process may take anywhere between a few minutes to an hour, depending on your network speed
> 
> Please be patient
> 
> Unable to find image 'discourse/base:2.0.20191013-2320' locally
> /usr/bin/docker: Error response from daemon: Get https://registry-1.docker.io/v2/discourse/base/manifests/2.0.20191013-2320: Get https://auth.docker.io/token?scope=repository%3Adiscourse%2Fbase%3Apull&service=registry.docker.io: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers).
> See '/usr/bin/docker run --help'.
> Your Docker installation is not working correctly
> 
> See: `https://meta.discourse.org/t/docker-error-on-bootstrap/13657/18?u=sam`

重建 Discourse

再次出现相同的两个错误。

> root@forum:/var/discourse# ./launcher rebuild app
> 
> WARNING: We are about to start downloading the Discourse base image
> This process may take anywhere between a few minutes to an hour, depending on your network speed
> 
> Please be patient
> 
> Unable to find image 'discourse/base:2.0.20191013-2320' locally
> /usr/bin/docker: Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers).
> See '/usr/bin/docker run --help'.
> Your Docker installation is not working correctly
> 
> See: https://meta.discourse.org/t/docker-error-on-bootstrap/13657/18?u=sam

从之前的备份恢复

我有一份在迁移服务器和清理(如前所述,清理了约 4 GB 文件)之前创建的快照。我再次尝试了上述所有操作……结果相同(除了 清理 步骤),浏览器中仍然显示 502 Bad Gateway - nginx/1.14.0 (Ubuntu)。因此,也许这与清理无关?

在我使用 Discourse 的整个过程中,从未遇到过这两个错误。有什么建议可以帮助我解决这些错误,让 Discourse 在浏览器中正常运行吗?

Your server can’t connect to the docker hub in order to download large images, only the small hello world succeeded.

So how do I remedy this? I don’t have much experience using Docker, so I don’t know where to start.

The only Docker stuff I know is the commands the Discourse Install included (along with modifications of course, since Discourse is installed on a VM on a Server of mine as opposed to a Third Party Cloud).

Which docker version are you running (docker info)?

Can you check your network stack for problems stablishing connections to https://registry-1.docker.io/v2/ ?

Here is my Docker info:

root@forum:/var/discourse# docker info
Containers: 1
Running: 1
Paused: 0
Stopped: 0
Images: 3
Server Version: 18.09.5
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: bb71b10fd8f58240ca47fbb579b9d1028eea7c84
runc version: 2b18fe1d885ee5083ef9f0838fee39b62d653e30
init version: fec3683
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 4.15.0-69-generic
Operating System: Ubuntu 18.04.2 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 6.805GiB
Name: forum
ID: 2RRX:ZQIT:R5AK:WNPR:VJ6Z:2EBY:PFOL:W5RD:GL3X:RUQM:YLJ4:2L2X
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine

WARNING: No swap limit support

As for your question:

How do I do this?

@Falco So I’ve been trying to Troubleshoot this problem further, and I’m even more confused now. Good News and Bad News:

Good News

After performing several Discourse Rebuilds (./launcher rebuild app), I no longer receive the Unable to find image or Error response from daemon errors! When I Start/Restart Discourse, I see no errors:

> root@forum:/var/discourse# ./launcher start app
> 
> .+ /usr/bin/docker run --shm-size=512m -d --restart=always -e LANG=en_US.UTF-8 -e RAILS_ENV=production -e UNICORN_WORKERS=4 -e UNICORN_SIDEKIQS=1 -e RUBY_GLOBAL_METHOD_CACHE_SIZE=131072 -e RUBY_GC_HEAP_GROWTH_MAX_SLOTS=40000 -e RUBY_GC_HEAP_INIT_SLOTS=400000 -e RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=1.5 -e DISCOURSE_DB_SOCKET=/var/run/postgresql -e DISCOURSE_DB_HOST= -e DISCOURSE_DB_PORT= -e DISCOURSE_HOSTNAME=discourse.example.com -e DISCOURSE_DEVELOPER_EMAILS=admin@example.com,postmaster@example.com -e DISCOURSE_SMTP_ADDRESS=smtp.sparkpostmail.com -e DISCOURSE_SMTP_PORT=587 -e DISCOURSE_SMTP_USER_NAME=SMTP_Injection -e DISCOURSE_SMTP_PASSWORD=<HIDING-FOR-PRICACY> -e LETSENCRYPT_ACCOUNT_EMAIL=admin@example.com -h forum-app -e DOCKER_HOST_IP=100.17.0.1 --name app -t -p 8080:80 -p 8443:443 -p 2222:22 -v /var/discourse/shared/standalone:/shared -v /var/discourse/shared/standalone/log/var-log:/var/log --mac-address 02:96:f3:e6:e7:14 local_discourse/app /sbin/boot
> cebe89493bc79dab2c1716599629adfe3dc571c8659367e6ffa0d39b0e6d47af
> root@forum:/var/discourse# ./launcher restart app
> .+ /usr/bin/docker stop -t 10 app
> app
> 
> starting up existing container
> .+ /usr/bin/docker start app
> app

Docker is also running:
> root@forum:/var/discourse# systemctl status docker.service
> ● docker.service - Docker Application Container Engine
>    Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
>    Active: active (running) since Thu 2019-11-14 03:00:54 UTC; 17h ago
>      Docs: https://docs.docker.com
>  Main PID: 18721 (dockerd)
>     Tasks: 31
>    CGroup: /system.slice/docker.service
>            ├─ 1375 /usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 8443 -container-ip 172.17.0.2 -container-port 443
>            ├─ 1387 /usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 8080 -container-ip 172.17.0.2 -container-port 80
>            ├─ 1399 /usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 2222 -container-ip 172.17.0.2 -container-port 22
>            └─18721 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
> 
> Nov 14 20:13:26 forum dockerd[18721]: time="2019-11-14T20:13:26.430856242Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
> Nov 14 20:13:28 forum dockerd[18721]: time="2019-11-14T20:13:28.597999379Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
> Nov 14 20:13:30 forum dockerd[18721]: time="2019-11-14T20:13:30.862158413Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
> Nov 14 20:13:32 forum dockerd[18721]: time="2019-11-14T20:13:32.978285148Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
> Nov 14 20:13:35 forum dockerd[18721]: time="2019-11-14T20:13:35.105130149Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
> Nov 14 20:13:37 forum dockerd[18721]: time="2019-11-14T20:13:37.151466214Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
> Nov 14 20:13:39 forum dockerd[18721]: time="2019-11-14T20:13:39.024948159Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
> Nov 14 20:14:05 forum dockerd[18721]: time="2019-11-14T20:14:05.179759938Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
> Nov 14 20:14:16 forum dockerd[18721]: time="2019-11-14T20:14:16.078334393Z" level=info msg="Container cebe89493bc79dab2c1716599629adfe3dc571c8659367e6ffa0d39b0e6d47af failed to exit within
> Nov 14 20:14:16 forum dockerd[18721]: time="2019-11-14T20:14:16.281731176Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"

Bad News

Still receiving the 502 Bad Gateway - nginx/1.14.0 (Ubuntu) error when I try to open the site in various Browser. I’ve cleared their caches and tried multiple Browsers just in case. No go.

One thing I have noticed is if I try a Discourse Cleanup, it wipes quite a few Containers:

> root@forum:/var/discourse# ./launcher cleanup
> WARNING! This will remove all stopped containers.
> Are you sure you want to continue? [y/N] y
> Total reclaimed space: 0B
> WARNING! This will remove all images without at least one container associated to them.
> Are you sure you want to continue? [y/N] y
> Deleted Images:
> untagged: hello-world:latest
> untagged: hello-world@sha256:c3b4ada4687bbaa170745b3e4dd8ac3f194ca95b2d0518b417fb47e5879d9b5f
> deleted: sha256:fce289e99eb9bca977dae136fbe2a82b6b7d4c372474c9235adc1741675f587e
> deleted: sha256:af0b15c8625bb1938f1d7b17081031f649fd14e6b233688eea3c5483994a66a3
> untagged: discourse/base:2.0.20191013-2320
> untagged: discourse/base@sha256:77e010342aa5111c8c3b81d80de7d4bdb229793d595bbe373992cdb8f86ef41f
> deleted: sha256:53b44681b65ee5e9a9cadc6bd34c6aa6f6bcbbbe6270e61669c50bcd655c6898
> deleted: sha256:939a3ac6d5627270ae02a9f9ea05c580589cec0afa019b7f296fdd43157dd3a0
> 
> Total reclaimed space: 452.2MB

Now anytime I try to start/restart … the errors are back! I’ve rolled back to when the Unable to find image or Error response from daemon error messages stopped showing up. Clearly Containers were rebuilt, and clearing the image containers causes issues.

If your docker is broken (aka docker run ubuntu fails) Discourse most certainly won’t work.

To get Docker specific Docker support you may try posting https://forums.docker.com/

@Falco Yeah, I’m getting the Unable to find image locally and Error response from daemon errors as I was getting before when I run that command.

So as per your suggestion, I’ve started a topic over in the Docker forums. No response yet unfortunately.

So some more news:

Good News

I’m not receiving the /usr/bin/docker: Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers). anymore!

docker run ubuntu no longer reports an error. @Falco is there supposed to be any output I should be seeing?

root@forum:/var/discourse# docker run ubuntu
root@forum:/var/discourse#

So what have I done that resolves the error? I replaced the nameserver (DNS) used on this VM by the Google Public DNS service:
echo "nameserver 8.8.8.8">/etc/resolv.conf

I have to run the above command every time after the VM is restarted though (so it seems to just be temporary), or I will get the two errors when I try to run any ./launcher commands. Afterwards though, only one of the errors (Unable to find image 'discourse/base:2.0.20191013-2320' locally) comes up, but the command looks to resolve:

root@forum:/var/discourse# ./launcher start app

WARNING: We are about to start downloading the Discourse base image
This process may take anywhere between a few minutes to an hour, depending on your network speed

Please be patient

Unable to find image 'discourse/base:2.0.20191013-2320' locally
2.0.20191013-2320: Pulling from discourse/base
Digest: sha256:77e010342aa5111c8c3b81d80de7d4bdb229793d595bbe373992cdb8f86ef41f
Status: Downloaded newer image for discourse/base:2.0.20191013-2320

starting up existing container
+ /usr/bin/docker start app
app

I look to also be able to do a ./launcher rebuild app without errors. Here is the last thing the output of the command shows (hiding URLs and MAC Address):

+ /usr/bin/docker run --shm-size=512m -d --restart=always -e LANG=en_US.UTF-8 -e RAILS_ENV=production -e UNICORN_WORKERS=4 -e UNICORN_SIDEKIQS=1 -e RUBY_GLOBAL_METHOD_CACHE_SIZE=131072 -e RUBY_GC_HEAP_GROWTH_MAX_SLOTS=40000 -e RUBY_GC_HEAP_INIT_SLOTS=400000 -e RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=1.5 -e DISCOURSE_DB_SOCKET=/var/run/postgresql -e DISCOURSE_DB_HOST= -e DISCOURSE_DB_PORT= -e DISCOURSE_HOSTNAME=discourse.domain.com -e DISCOURSE_DEVELOPER_EMAILS=admin@domain.com,postmaster@domain.com -e DISCOURSE_SMTP_ADDRESS=smtp.sparkpostmail.com -e DISCOURSE_SMTP_PORT=587 -e DISCOURSE_SMTP_USER_NAME=SMTP_Injection -e DISCOURSE_SMTP_PASSWORD=0d431cd177ce3d35833aa823d498eb57c7c4e99c -e LETSENCRYPT_ACCOUNT_EMAIL=admin@domain.com -h forum-app -e DOCKER_HOST_IP=172.17.0.1 --name app -t -p 8080:80 -p 8443:443 -p 2222:22 -v /var/discourse/shared/standalone:/shared -v /var/discourse/shared/standalone/log/var-log:/var/log --mac-address 00:00:00:00:00:01 local_discourse/app /sbin/boot
abb788d4a6fd301d88f129189a07a19c4a6bfc8554d43c555d3e3cd126374736

Bad News

I am still getting the 502 Bad Gateway - nginx/1.14.0 (Ubuntu) error when I try going to the page.

What else could be the issue? Any suggestions?

I resolved my issue! My Discourse Forum shows up now in a Browser!

Well, to be more specific, Francis Day over on the Nginx Forums resolved my issue! It was indeed Nginx all along. Here is what I did:

  1. Logged into my Nginx VM.
  2. VIMed Discourse CONF file: vim /etc/nginx/sites-available/discourse.conf
  3. Here is what my Discourse CONF file looked like (obviously not using the Domain Name I would normally use, and where 192.168.0.101 = Nginx VM and 192.168.0.104 = Discourse VM).
  4. I only made one change: I changed proxy_pass http://discourse.domainame.com:8080/; to proxy_pass 192.168.0.104:8080/;. So proxy_pass is set to the Discourse VM’s Local IP instead of the Hostname.
  5. Save the CONF file, and then Reloaded (systemctl reload nginx.service) and Restarted (systemctl restart nginx.service) Nginx.
  6. Viola! I refresh the Discourse URL, and it comes up! Didn’t even need to Restart/Rebuild Discourse!

I hope someone finds what I’ve written here useful, as reverse proxying Discourse through Nginx can be really tricky.