502 Bad Gateway - nginx/1.14.0 (Ubuntu): Unable to find image locally + Error response from daemon

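I recently moved my Dell server from one location to another. Its main OS/hypervisor is Proxmox VE 5.3. I have a VM running Nginx that acts as a reverse proxy for several other VMs, one of which is a Discourse VM.

After configuring the new router, I got several VMs back online (without even needing to renew the SSL certificates). However, when I try to access Discourse in a browser, I get a 502 Bad Gateway - nginx/1.14.0 (Ubuntu) error.

I have run into this error before, and usually one of the following fixes it:

Restart the Discourse VM

The browser still shows 502 Bad Gateway - nginx/1.14.0 (Ubuntu).

Clear browser cookies and try different browsers

In case it was a local problem... it wasn't. The same 502 Bad Gateway - nginx/1.14.0 (Ubuntu) error still appears in every browser.

Discourse cleanup and disk space check

The first time I ran the cleanup, it removed about 4 GB of data, which surprised me a lot. Maybe that is the root of the problem? Either way, this is what I get now when I try to clean up Discourse: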

> root@forum:/var/discourse# ./launcher cleanup
> WARNING! This will remove all stopped containers.
> Are you sure you want to continue? [y/N] y
> Total reclaimed space: 0B
> WARNING! This will remove all images without at least one container associated to them.
> Are you sure you want to continue? [y/N] y
> Total reclaimed space: 0B

I would also like to reiterate that I have not run out of disk space:
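A disk-space check of this kind can be scripted. The sketch below is an illustration only, not the command that was run in this thread; `used_pct` is a hypothetical helper that reads `df -P` output on stdin and prints the usage percentage of the first filesystem listed.

```shell
#!/bin/sh
# Hypothetical helper: read `df -P` output on stdin and print the
# "Capacity" value (without the % sign) of the first filesystem listed.
used_pct() {
  awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Example use on the Discourse host (commented out so the block has no side effects):
#   df -P /var/discourse | used_pct   # space used on the filesystem holding Discourse
#   docker system df                  # space used by Docker images, containers, volumes
```

A value close to 100 would point at a full disk; anything well below that suggests looking elsewhere.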

Git Pull

It is already at the latest version.

> root@forum:/var/discourse# git pull
> Already up to date.

Restart Discourse

Here, I found two errors:

> root@forum:/var/discourse# ./launcher restart app
> 
> WARNING: We are about to start downloading the Discourse base image
> This process may take anywhere between a few minutes to an hour, depending on your network speed
> 
> Please be patient
> 
> Unable to find image 'discourse/base:2.0.20191013-2320' locally
> /usr/bin/docker: Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers).
> See '/usr/bin/docker run --help'.
> Your Docker installation is not working correctly
> 
> See: `https://meta.discourse.org/t/docker-error-on-bootstrap/13657/18?u=sam`

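I looked at the suggested link. It recommends doing a Git Pull and rebuilding Discourse (you will see my results further down). It also suggests running the Docker "Hello World" command. That command ran (but there seems to be some problem?):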

> root@forum:/var/discourse# docker run -it --rm hello-world
> Unable to find image 'hello-world:latest' locally
> latest: Pulling from library/hello-world
> 1b930d010525: Pull complete
> Digest: sha256:c3b4ada4687bbaa170745b3e4dd8ac3f194ca95b2d0518b417fb47e5879d9b5f
> Status: Downloaded newer image for hello-world:latest
> 
> Hello from Docker!
> This message shows that your installation appears to be working correctly.
> 
> To generate this message, Docker took the following steps:
>  1. The Docker client contacted the Docker daemon.
>  2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
>     (amd64)
>  3. The Docker daemon created a new container from that image which runs the
>     executable that produces the output you are currently reading.
>  4. The Docker daemon streamed that output to the Docker client, which sent it
>     to your terminal.
> 
> To try something more ambitious, you can run an Ubuntu container with:
>  $ docker run -it ubuntu bash
> 
> Share images, automate workflows, and more with a free Docker ID:
>  https://hub.docker.com/
> 
> For more examples and ideas, visit:
>  https://docs.docker.com/get-started/
> 
> failed to resize tty, using default size

Next, I tried Ubuntu Bash... and got an error:

> root@forum:/var/discourse# docker run -it ubuntu bash
> Unable to find image 'ubuntu:latest' locally
> docker: Error response from daemon: Get https://registry-1.docker.io/v2/library/ubuntu/manifests/latest: Get https://auth.docker.io/token?scope=repository%3Alibrary%2Fubuntu%3Apull&service=registry.docker.io: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers).
> See 'docker run --help'.

Now for the other known methods:

Stop and start Discourse

The errors are the same as with the restart:

> root@forum:/var/discourse# ./launcher stop app
> 
> WARNING: We are about to start downloading the Discourse base image
> This process may take anywhere between a few minutes to an hour, depending on your network speed
> 
> Please be patient
> 
> Unable to find image 'discourse/base:2.0.20191013-2320' locally
> /usr/bin/docker: Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers).
> See '/usr/bin/docker run --help'.
> Your Docker installation is not working correctly
> 
> See: `https://meta.discourse.org/t/docker-error-on-bootstrap/13657/18?u=sam`
> 
> root@forum:/var/discourse# ./launcher start app
> 
> WARNING: We are about to start downloading the Discourse base image
> This process may take anywhere between a few minutes to an hour, depending on your network speed
> 
> Please be patient
> 
> Unable to find image 'discourse/base:2.0.20191013-2320' locally
> /usr/bin/docker: Error response from daemon: Get https://registry-1.docker.io/v2/discourse/base/manifests/2.0.20191013-2320: Get https://auth.docker.io/token?scope=repository%3Adiscourse%2Fbase%3Apull&service=registry.docker.io: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers).
> See '/usr/bin/docker run --help'.
> Your Docker installation is not working correctly
> 
> See: `https://meta.discourse.org/t/docker-error-on-bootstrap/13657/18?u=sam`

Rebuild Discourse

The same two errors again.

> root@forum:/var/discourse# ./launcher rebuild app
> 
> WARNING: We are about to start downloading the Discourse base image
> This process may take anywhere between a few minutes to an hour, depending on your network speed
> 
> Please be patient
> 
> Unable to find image 'discourse/base:2.0.20191013-2320' locally
> /usr/bin/docker: Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers).
> See '/usr/bin/docker run --help'.
> Your Docker installation is not working correctly
> 
> See: https://meta.discourse.org/t/docker-error-on-bootstrap/13657/18?u=sam

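Restore from an earlier backup

I have a snapshot that was created before the server move and before the cleanup (which, as mentioned, removed about 4 GB of files). I tried all of the above again... with the same results (except for the cleanup step), and the browser still shows 502 Bad Gateway - nginx/1.14.0 (Ubuntu). So maybe this has nothing to do with the cleanup?

In all my time using Discourse, I have never run into these two errors. Any suggestions that would help me resolve them and get Discourse working in the browser again?

Your server cannot connect to Docker Hub to download large images; only the small hello world image downloaded successfully.

So how do I fix this? I don't have much experience with Docker and don't know where to start.

What I know about Docker is limited to the commands included in the Discourse install guide (with some modifications, of course, since my Discourse is installed in a VM on my own server rather than on a third-party cloud platform).

Which Docker version are you running (docker info)?

Can you check your network stack to see whether there is a problem establishing a connection to https://registry-1.docker.io/v2/?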
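One way to run such a check from the Discourse VM is sketched below. It is a minimal illustration, not an official Docker diagnostic: `registry_verdict` and `check_registry` are hypothetical helper names, and the assumption that an unauthenticated request to the registry endpoint is normally answered with 401 should be verified against a machine that is known to work.

```shell
#!/bin/sh
# Hypothetical helper: interpret the HTTP status curl reports for the registry endpoint.
# 000 is what curl prints for %{http_code} when no HTTP response was received.
registry_verdict() {
  case "$1" in
    000) echo "no response (DNS, routing, or firewall problem)" ;;
    200|401) echo "registry reachable" ;;
    *) echo "unexpected status $1" ;;
  esac
}

# Probe the endpoint named in the error messages, with a 10 second limit.
check_registry() {
  code=$(curl -s -o /dev/null -m 10 -w '%{http_code}' https://registry-1.docker.io/v2/) || true
  registry_verdict "$code"
}

# Usage on the Discourse VM:
#   getent hosts registry-1.docker.io   # does the name resolve at all?
#   check_registry
```

If the name does not resolve, or resolves only after a long delay, the timeouts in the launcher output are more likely a DNS problem than a Docker problem.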

Here is my Docker info:

root@forum:/var/discourse# docker info
Containers: 1
Running: 1
Paused: 0
Stopped: 0
Images: 3
Server Version: 18.09.5
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: bb71b10fd8f58240ca47fbb579b9d1028eea7c84
runc version: 2b18fe1d885ee5083ef9f0838fee39b62d653e30
init version: fec3683
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 4.15.0-69-generic
Operating System: Ubuntu 18.04.2 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 6.805GiB
Name: forum
ID: 2RRX:ZQIT:R5AK:WNPR:VJ6Z:2EBY:PFOL:W5RD:GL3X:RUQM:YLJ4:2L2X
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine

WARNING: No swap limit support

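Regarding your question:

How would I go about doing that?

@Falco So I have been trying to troubleshoot this further, and now I am even more confused. There is good news and bad news:

Good news

After running a Discourse rebuild (./launcher rebuild app) a few times, I no longer get the Unable to find image and Error response from daemon errors! When I start/restart Discourse, I see no errors: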

> root@forum:/var/discourse# ./launcher start app
> 
> .+ /usr/bin/docker run --shm-size=512m -d --restart=always -e LANG=en_US.UTF-8 -e RAILS_ENV=production -e UNICORN_WORKERS=4 -e UNICORN_SIDEKIQS=1 -e RUBY_GLOBAL_METHOD_CACHE_SIZE=131072 -e RUBY_GC_HEAP_GROWTH_MAX_SLOTS=40000 -e RUBY_GC_HEAP_INIT_SLOTS=400000 -e RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=1.5 -e DISCOURSE_DB_SOCKET=/var/run/postgresql -e DISCOURSE_DB_HOST= -e DISCOURSE_DB_PORT= -e DISCOURSE_HOSTNAME=discourse.example.com -e DISCOURSE_DEVELOPER_EMAILS=admin@example.com,postmaster@example.com -e DISCOURSE_SMTP_ADDRESS=smtp.sparkpostmail.com -e DISCOURSE_SMTP_PORT=587 -e DISCOURSE_SMTP_USER_NAME=SMTP_Injection -e DISCOURSE_SMTP_PASSWORD=<HIDING-FOR-PRICACY> -e LETSENCRYPT_ACCOUNT_EMAIL=admin@example.com -h forum-app -e DOCKER_HOST_IP=100.17.0.1 --name app -t -p 8080:80 -p 8443:443 -p 2222:22 -v /var/discourse/shared/standalone:/shared -v /var/discourse/shared/standalone/log/var-log:/var/log --mac-address 02:96:f3:e6:e7:14 local_discourse/app /sbin/boot
> cebe89493bc79dab2c1716599629adfe3dc571c8659367e6ffa0d39b0e6d47af
> root@forum:/var/discourse# ./launcher restart app
> .+ /usr/bin/docker stop -t 10 app
> app
> 
> starting up existing container
> .+ /usr/bin/docker start app
> app

Docker is also running:
> root@forum:/var/discourse# systemctl status docker.service
> ● docker.service - Docker Application Container Engine
>    Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
>    Active: active (running) since Thu 2019-11-14 03:00:54 UTC; 17h ago
>      Docs: https://docs.docker.com
>  Main PID: 18721 (dockerd)
>     Tasks: 31
>    CGroup: /system.slice/docker.service
>            ├─ 1375 /usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 8443 -container-ip 172.17.0.2 -container-port 443
>            ├─ 1387 /usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 8080 -container-ip 172.17.0.2 -container-port 80
>            ├─ 1399 /usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 2222 -container-ip 172.17.0.2 -container-port 22
>            └─18721 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
> 
> Nov 14 20:13:26 forum dockerd[18721]: time="2019-11-14T20:13:26.430856242Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
> Nov 14 20:13:28 forum dockerd[18721]: time="2019-11-14T20:13:28.597999379Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
> Nov 14 20:13:30 forum dockerd[18721]: time="2019-11-14T20:13:30.862158413Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
> Nov 14 20:13:32 forum dockerd[18721]: time="2019-11-14T20:13:32.978285148Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
> Nov 14 20:13:35 forum dockerd[18721]: time="2019-11-14T20:13:35.105130149Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
> Nov 14 20:13:37 forum dockerd[18721]: time="2019-11-14T20:13:37.151466214Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
> Nov 14 20:13:39 forum dockerd[18721]: time="2019-11-14T20:13:39.024948159Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
> Nov 14 20:14:05 forum dockerd[18721]: time="2019-11-14T20:14:05.179759938Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
> Nov 14 20:14:16 forum dockerd[18721]: time="2019-11-14T20:14:16.078334393Z" level=info msg="Container cebe89493bc79dab2c1716599629adfe3dc571c8659367e6ffa0d39b0e6d47af failed to exit within
> Nov 14 20:14:16 forum dockerd[18721]: time="2019-11-14T20:14:16.281731176Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"

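Bad news

When I try to open the site in different browsers, I still get the 502 Bad Gateway - nginx/1.14.0 (Ubuntu) error. I have cleared the cache and tried several browsers, but the problem persists.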
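A 502 from Nginx means the proxy did not get a valid response from the backend it forwards to, so it helps to compare what the Discourse VM answers locally with what the Nginx VM sees. The sketch below assumes port 8080 (taken from the -p 8080:80 mapping in the output above); `http_status`, `locate_fault`, and the `DISCOURSE_VM_IP` variable are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Print the HTTP status for a URL, or 000 if no response arrives within 10 seconds.
http_status() {
  curl -s -o /dev/null -m 10 -w '%{http_code}' "$1" || true
}

# Hypothetical helper: given the status seen on the Discourse VM itself ($1) and the
# status seen from the Nginx VM ($2), say which side deserves a closer look.
locate_fault() {
  case "$1:$2" in
    000:*) echo "backend: Discourse is not answering on its own VM" ;;
    *:000) echo "path: the Nginx VM cannot reach the backend" ;;
    *)     echo "proxy: backend answers from both sides, check the Nginx config" ;;
  esac
}

# Usage:
#   on the Discourse VM:  http_status http://127.0.0.1:8080/
#   on the Nginx VM:      http_status "http://$DISCOURSE_VM_IP:8080/"
#   then:                 locate_fault <first result> <second result>
```

The Nginx error log (typically /var/log/nginx/error.log on Ubuntu) usually names the upstream address that failed, which complements this check.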

One thing I noticed is that if I run a Discourse cleanup, it removes quite a few images:

> root@forum:/var/discourse# ./launcher cleanup
> WARNING! This will remove all stopped containers.
> Are you sure you want to continue? [y/N] y
> Total reclaimed space: 0B
> WARNING! This will remove all images without at least one container associated to them.
> Are you sure you want to continue? [y/N] y
> Deleted Images:
> untagged: hello-world:latest
> untagged: hello-world@sha256:c3b4ada4687bbaa170745b3e4dd8ac3f194ca95b2d0518b417fb47e5879d9b5f
> deleted: sha256:fce289e99eb9bca977dae136fbe2a82b6b7d4c372474c9235adc1741675f587e
> deleted: sha256:af0b15c8625bb1938f1d7b17081031f649fd14e6b233688eea3c5483994a66a3
> untagged: discourse/base:2.0.20191013-2320
> untagged: discourse/base@sha256:77e010342aa5111c8c3b81d80de7d4bdb229793d595bbe373992cdb8f86ef41f
> deleted: sha256:53b44681b65ee5e9a9cadc6bd34c6aa6f6bcbbbe6270e61669c50bcd655c6898
> deleted: sha256:939a3ac6d5627270ae02a9f9ea05c580589cec0afa019b7f296fdd43157dd3a0
> 
> Total reclaimed space: 452.2MB

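Now, whenever I try to start/restart, the errors are back! I have rolled back to the point where the Unable to find image and Error response from daemon messages no longer appeared. Apparently the container had been rebuilt, and removing the images is what brings the problem back.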
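The launcher output earlier in this thread shows a download attempt for discourse/base whenever that image is missing locally, which fits the errors returning after the cleanup removed it. A small pre-check, sketched here with hypothetical helper names (`image_present`, `needs_pull`), can tell whether a start will need to reach Docker Hub:

```shell
#!/bin/sh
# Return success if the named image exists in the local Docker image store.
# `docker image inspect` exits non-zero when the image is not found.
image_present() {
  docker image inspect "$1" >/dev/null 2>&1
}

# Hypothetical wrapper: report whether a start would need to reach Docker Hub.
needs_pull() {
  if image_present "$1"; then
    echo "local"
  else
    echo "pull required"
  fi
}

# Usage on the Discourse VM:
#   needs_pull discourse/base:2.0.20191013-2320
```

While the registry is unreachable, it is probably safer to skip ./launcher cleanup so that the base image stays available locally.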

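If your Docker has problems (for example, docker run ubuntu fails), Discourse most likely will not work properly.

For Docker-specific support, you can try posting at https://forums.docker.com/.

@Falco Yes, when I run that command I still get the "Unable to find image locally" and "Error response from daemon" errors I was seeing before.

So, as you suggested, I started a new topic on the Docker forums. Unfortunately, there have been no replies yet.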

Some further news:

Good news

I no longer get the /usr/bin/docker: Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers). error!

docker run ubuntu no longer errors out. @Falco Should I be seeing any output?

root@forum:/var/discourse# docker run ubuntu
root@forum:/var/discourse#
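No output is expected from that command: the ubuntu image's default command is bash, and without an attached terminal (-it) it has nothing to read and exits immediately. A smoke test that prints a marker makes success visible. This is a sketch; `docker_smoke` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical smoke test: run a throwaway container that echoes a marker,
# and succeed only if the marker comes back.
docker_smoke() {
  out=$(docker run --rm "$1" echo container-ok 2>/dev/null) || return 1
  [ "$out" = "container-ok" ]
}

# Usage:
#   docker_smoke ubuntu && echo "pull and run both work"
```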

So what did I do to fix this error? I replaced the nameserver (DNS) used on this VM with Google Public DNS:
echo "nameserver 8.8.8.8" > /etc/resolv.conf

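However, I have to run the above command after every reboot of the VM (the change appears to be only temporary); otherwise both errors come back when I run any ./launcher command. Once I have run it, only one message appears (Unable to find image 'discourse/base:2.0.20191013-2320' locally), and the launcher then seems to recover on its own: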

root@forum:/var/discourse# ./launcher start app

WARNING: We are about to start downloading the Discourse base image
This process may take anywhere between a few minutes to an hour, depending on your network speed

Please be patient

Unable to find image 'discourse/base:2.0.20191013-2320' locally
2.0.20191013-2320: Pulling from discourse/base
Digest: sha256:77e010342aa5111c8c3b81d80de7d4bdb229793d595bbe373992cdb8f86ef41f
Status: Downloaded newer image for discourse/base:2.0.20191013-2320

starting up existing container
+ /usr/bin/docker start app
app
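On Ubuntu 18.04, /etc/resolv.conf is usually managed by systemd-resolved, which would explain why a hand-written nameserver line is lost at reboot. The sketch below shows one way to make the setting durable. It assumes this VM does use systemd-resolved, which the thread does not confirm; `list_nameservers` and `write_dns_dropin` are hypothetical helpers, and the drop-in file name dns.conf is arbitrary.

```shell
#!/bin/sh
# Print the nameserver addresses found in a resolv.conf-style file.
list_nameservers() {
  awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Hypothetical helper: write a systemd-resolved drop-in that sets the DNS server.
# $1 = drop-in directory, $2 = DNS server address.
write_dns_dropin() {
  mkdir -p "$1" && printf '[Resolve]\nDNS=%s\n' "$2" > "$1/dns.conf"
}

# Usage (as root), assuming systemd-resolved manages /etc/resolv.conf:
#   write_dns_dropin /etc/systemd/resolved.conf.d 8.8.8.8
#   systemctl restart systemd-resolved
#   list_nameservers /run/systemd/resolve/resolv.conf
```

Whether this is sufficient depends on how the VM's network is configured (netplan, DHCP-supplied DNS from the new router), so the result is worth checking after a reboot.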

I also seem to be able to run ./launcher rebuild app without errors. Here is the last part of that command's output (with the URL and MAC address hidden):

+ /usr/bin/docker run --shm-size=512m -d --restart=always -e LANG=en_US.UTF-8 -e RAILS_ENV=production -e UNICORN_WORKERS=4 -e UNICORN_SIDEKIQS=1 -e RUBY_GLOBAL_METHOD_CACHE_SIZE=131072 -e RUBY_GC_HEAP_GROWTH_MAX_SLOTS=40000 -e RUBY_GC_HEAP_INIT_SLOTS=400000 -e RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=1.5 -e DISCOURSE_DB_SOCKET=/var/run/postgresql -e DISCOURSE_DB_HOST= -e DISCOURSE_DB_PORT= -e DISCOURSE_HOSTNAME=discourse.domain.com -e DISCOURSE_DEVELOPER_EMAILS=admin@domain.com,postmaster@domain.com -e DISCOURSE_SMTP_ADDRESS=smtp.sparkpostmail.com -e DISCOURSE_SMTP_PORT=587 -e DISCOURSE_SMTP_USER_NAME=SMTP_Injection -e DISCOURSE_SMTP_PASSWORD=0d431cd177ce3d35833aa823d498eb57c7c4e99c -e LETSENCRYPT_ACCOUNT_EMAIL=admin@domain.com -h forum-app -e DOCKER_HOST_IP=172.17.0.1 --name app -t -p 8080:80 -p 8443:443 -p 2222:22 -v /var/discourse/shared/standalone:/shared -v /var/discourse/shared/standalone/log/var-log:/var/log --mac-address 00:00:00:00:00:01 local_discourse/app /sbin/boot
abb788d4a6fd301d88f129189a07a19c4a6bfc8554d43c555d3e3cd126374736

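Bad news

When I try to visit the page, I still get the 502 Bad Gateway - nginx/1.14.0 (Ubuntu) error.

What else could be the problem? Any suggestions?

I solved my problem! My Discourse forum now loads in the browser!

More precisely, Francis Day on the Nginx forum solved it for me! The problem was indeed with Nginx. Here are the steps I took: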

  1. Log in to my Nginx VM.
  2. Edit the Discourse config file with VIM: vim /etc/nginx/sites-available/discourse.conf
  3. This is what my Discourse config file looks like (obviously not with the domain name I normally use; 192.168.0.101 = Nginx VM, 192.168.0.104 = Discourse VM).
  4. I made only one change: proxy_pass http://discourse.domainame.com:8080/; became proxy_pass http://192.168.0.104:8080/;. So proxy_pass now points at the Discourse VM's local IP rather than its hostname.
  5. Save the CONF file, then reload (systemctl reload nginx.service) and restart (systemctl restart nginx.service) Nginx.
  6. Success! I refreshed the Discourse URL and it loaded normally! I did not even need to restart or rebuild Discourse!
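The change in step 4 suggests that, after the move, the hostname no longer resolved from the Nginx VM to an address where Discourse listens on port 8080. That is an inference; the thread does not show the lookup. The sketch below lists a few checks around such an edit. `proxy_targets` is a hypothetical helper, and the target is written with an http:// prefix because Nginx expects proxy_pass to carry a scheme.

```shell
#!/bin/sh
# Hypothetical helper: list the proxy_pass targets declared in an Nginx config file.
proxy_targets() {
  sed -n 's/^[[:space:]]*proxy_pass[[:space:]][[:space:]]*\([^;]*\);.*$/\1/p' "$1"
}

# Usage on the Nginx VM:
#   proxy_targets /etc/nginx/sites-available/discourse.conf
#   getent hosts discourse.domainame.com              # what the old hostname resolves to
#   curl -sI http://192.168.0.104:8080/ | head -n 1   # does the backend answer directly?
#   nginx -t && systemctl reload nginx.service        # validate the config, then apply it
```

Running nginx -t before the reload catches syntax mistakes without taking the proxy down for the other VMs it serves.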

I hope this write-up helps someone, because reverse proxying Discourse through Nginx can be really tricky.