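I dug into this again today. It took a while, but I finally got it running. I'll share my config files and some debugging methods, in the hope that they help others who hit the same problem.
Note: for details on how the nginx/acme-companion/discourse containers are "glued" together to automate certificate management and reverse proxying, see my initial post.
If you don't care about the debugging methods, skip to the end of this post for my solution (the docker-compose file and the app.yml file).
Main mistake identified in my config
One of the errors turned out to be my own oversight: I had blindly followed several tutorials and ended up with a muddled configuration. If you look at my initial post, I was able to listen on port 80 of the discourse app container with **netcat -lnvp 80**. That makes no sense: if I want the reverse proxy to forward traffic to a port the discourse container is listening on, I should not be able to bind that port with netcat... it should already be "in use".
The cause was that I was using the following template: web.socketed.template.yml. With it, the nginx inside the discourse container listens on a Unix socket instead of a TCP/IP port. The template, and the changes it makes to the nginx config, are shown below: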
ubuntu@ubuntu-vm-dev:/var/discourse/templates$ cat web.socketed.template.yml
run:
  - file:
      path: /etc/runit/1.d/remove-old-socket
      chmod: "+x"
      contents: |
        #!/bin/bash
        rm -f /shared/nginx.http*.sock
  - file:
      path: /etc/runit/3.d/remove-old-socket
      chmod: "+x"
      contents: |
        #!/bin/bash
        rm -rf /shared/nginx.http*.sock
  - replace:
      filename: "/etc/nginx/conf.d/discourse.conf"
      from: /listen 80;/
      to: |
        listen unix:/shared/nginx.http.sock;
        set_real_ip_from unix:;
  - replace:
      filename: "/etc/nginx/conf.d/discourse.conf"
      from: /listen 443 ssl http2;/
      to: |
        listen unix:/shared/nginx.https.sock ssl http2;
        set_real_ip_from unix:;
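The problem was that the nginx in my nginx-proxy container was forwarding traffic to the discourse container successfully, but it was sending it to port 80 rather than through the shared Unix socket.
As the nginx-proxy documentation shows, the default port is 80 unless a virtual port is specified.
Screenshot source: GitHub - nginx-proxy/nginx-proxy: Automated Nginx Reverse Proxy for Docker
Rather than fiddling with the nginx-proxy container's configuration, I simply removed web.socketed.template.yml. I then also removed every port exposed by the discourse container, since exposing them is unnecessary. As long as the discourse container is on the correct Docker network (the same one as nginx-proxy and acme-companion), it should work.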
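To tell which mode a given discourse container is in, it can help to check whether the socket path from the template actually exists as a Unix socket. The following is only a minimal sketch: the function name `socket_status` is made up for illustration, and the path in the usage comment is the one that appears in web.socketed.template.yml.

```shell
# Hypothetical helper (not part of Discourse or nginx-proxy).
# Reports whether a path is a Unix socket, some other kind of file, or absent.
socket_status() {
    if [ -S "$1" ]; then
        echo "unix-socket"
    elif [ -e "$1" ]; then
        echo "not-a-socket"
    else
        echo "missing"
    fi
}

# Example, run inside the discourse container:
#   socket_status /shared/nginx.http.sock
```

If this reports `unix-socket` while nothing is listening on TCP port 80, the socketed template is most likely still in effect.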
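Debugging tips
1. Network debugging
1.1 Docker network
After starting your Docker app and the nginx/acme containers, you can check that the network is set up correctly with the following command: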
ubuntu@ubuntu-vm-dev:/var/discourse$ docker network inspect nginx-proxy
{
"Name": "nginx-proxy",
"Id": "d2715f513771f002711521838340b879bb9106ef50118fb29be6b0cf2d5f25e7",
"Created": "2022-02-01T20:32:10.021632263Z",
"Scope": "local",
"Driver": "bridge",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": {},
"Config": [
{
"Subnet": "172.18.0.0/16",
"Gateway": "172.18.0.1"
}
]
},
"Internal": false,
"Attachable": false,
"Ingress": false,
"ConfigFrom": {
"Network": ""
},
"ConfigOnly": false,
"Containers": {
"205772214b77a6bf49a76d082eef216aa5fc4ca15a39ae2c8fefc0b3eeb61e00": {
"Name": "nginx-test_webserver2_1",
"EndpointID": "ee3096663b1a90443906652ff3a192243fa1f60081e5bbee190a23f1d23393f7",
"MacAddress": "02:42:ac:12:00:02",
"IPv4Address": "172.18.0.2/16",
"IPv6Address": ""
},
"3050831bf600ae02c18e5045549267b583ec732d6f1550f869ab651dfc7d8dca": {
"Name": "nginx-test_nginx_1",
"EndpointID": "a7f95f2a7589ca3823da01b9d28244bbd81815d6f5cc1f2f89804a488ef2148d",
"MacAddress": "02:42:ac:12:00:03",
"IPv4Address": "172.18.0.3/16",
"IPv6Address": ""
},
"696ce8d129a62ce37ae8373ed4f603fd5d731f5f29166bd6fa889631ef0ae606": {
"Name": "nginx-proxy",
"EndpointID": "99d90539668b97652ffca7e3301aa83c40fe7b31ca8eea2a8a4d9bf7afe19ac0",
"MacAddress": "02:42:ac:12:00:04",
"IPv4Address": "172.18.0.4/16",
"IPv6Address": ""
},
"d3faf6489ca6617dceda4f2907ee6c055a1d81e3590c3eab2768601dfc0b60d7": {
"Name": "app",
"EndpointID": "aed6bfcc61e439e2206615a1569355c529a63a21c356e54da378f630d44fd3bc",
"MacAddress": "02:72:f8:ee:03:32",
"IPv4Address": "172.18.0.6/16",
"IPv6Address": ""
},
"e04f2125001d5f346255312f10cdd0ab176388cddb0819b0022c5988a23303eb": {
"Name": "letsencrypt-proxy",
"EndpointID": "0088bcd942aed41c9351b935ed7b115e49046cc3a8308ca5a8d3828a194ecfee",
"MacAddress": "02:42:ac:12:00:05",
"IPv4Address": "172.18.0.5/16",
"IPv6Address": ""
}
},
Note: change the network name to the one you are using (unless yours is also called nginx-proxy).
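For a shorter view of the same information, `docker network inspect` accepts a Go template through `--format`. This is a sketch under assumptions: the wrapper name `list_network_members` is hypothetical, and the default network name is simply the one used in this post.

```shell
# Hypothetical wrapper: print "name address" for each container on a network.
# Defaults to the nginx-proxy network used in this post.
list_network_members() {
    docker network inspect \
        --format '{{range .Containers}}{{println .Name .IPv4Address}}{{end}}' \
        "${1:-nginx-proxy}"
}

# Example:
#   list_network_members nginx-proxy
```

The discourse container (named app here) should appear in that list next to nginx-proxy and the letsencrypt companion.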
You should then see these IP addresses from the Docker network in the nginx-proxy container's configuration, which you can view as follows:
ubuntu@ubuntu-vm-dev:/var/discourse$ docker exec -it nginx-proxy /bin/bash
root@696ce8d129a6:/app# cd /etc/nginx/conf.d/
root@696ce8d129a6:/etc/nginx/conf.d# cat default.conf
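... NGINX config appears here ...
To test that network communication works, talk to the IP address listed in the upstream block. For example, from the nginx-proxy container, try to reach the discourse container through its assigned Docker network IP address.
For example, my config file contains:
upstream my_forum_domain.com {
    ## Can be connected with "nginx-proxy" network
    # app
    server 172.18.0.6:80;
}
Since the upstream IP of my discourse container is 172.18.0.6, a curl against the discourse container should return a successful response.
root@696ce8d129a6:/etc/nginx/conf.d# curl 172.18.0.6
... curl response output ...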
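Rather than reading the generated config by eye, the upstream addresses can be pulled out with a small filter. This is a minimal sketch: `upstream_servers` is a hypothetical name, and it only recognizes `server IP:PORT;` lines of the shape shown above.

```shell
# Hypothetical filter: read an nginx config on stdin and print every
# "IP:PORT" that appears in a "server IP:PORT;" line.
upstream_servers() {
    sed -n 's/^[[:space:]]*server[[:space:]]\{1,\}\([0-9.]\{1,\}:[0-9]\{1,\}\);.*/\1/p'
}

# Example, run on the Docker host:
#   docker exec nginx-proxy cat /etc/nginx/conf.d/default.conf | upstream_servers
```

Each address it prints is a candidate for the curl test above.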
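1.2 Network troubleshooting tools (netstat, lsof, netcat)
The following tools help with troubleshooting: 'netstat', 'lsof', and 'netcat'. They are not preinstalled in the discourse container, so install them as follows: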
root@ubuntu-vm-dev-app:/# apt-get install lsof
root@ubuntu-vm-dev-app:/# apt-get install net-tools
root@ubuntu-vm-dev-app:/# apt-get install netcat
lsof
With lsof you should see your nginx process listening on port 80.
root@ubuntu-vm-dev-app:/# lsof -i :80
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
nginx 55 root 6u IPv4 254317 0t0 TCP *:http (LISTEN)
root@ubuntu-vm-dev-app:/#
netstat
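The following netstat command also helps you check that the expected processes and ports are in use. For example, you can see the ports for HTTP (80), redis (6379), PostgreSQL (5432), Unicorn (3000), and so on.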
root@ubuntu-vm-dev-app:/# netstat -peanut
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State User Inode PID/Program name
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 0 254317 55/nginx: master pr
tcp 0 0 127.0.0.1:3000 0.0.0.0:* LISTEN 1000 255560 -
tcp 0 0 0.0.0.0:5432 0.0.0.0:* LISTEN 105 255137 -
tcp 0 0 127.0.0.11:46697 0.0.0.0:* LISTEN 0 254194 -
tcp 0 0 0.0.0.0:6379 0.0.0.0:* LISTEN 106 255126 -
tcp 0 0 127.0.0.1:6379 127.0.0.1:37600 ESTABLISHED 106 255583 -
tcp 0 0 127.0.0.1:6379 127.0.0.1:37608 ESTABLISHED 106 255610 -
tcp 0 0 127.0.0.1:37632 127.0.0.1:6379 ESTABLISHED 1000 254766 -
tcp 0 0 127.0.0.1:37626 127.0.0.1:6379 ESTABLISHED 1000 254745 -
tcp 0 0 127.0.0.1:6379 127.0.0.1:37634 ESTABLISHED 106 254773 -
tcp 0 0 127.0.0.1:6379 127.0.0.1:37650 ESTABLISHED 106 255950 -
tcp 0 0 127.0.0.1:6379 127.0.0.1:37668 ESTABLISHED 106 255965 -
tcp 0 0 127.0.0.1:6379 127.0.0.1:37612 ESTABLISHED 106 255618 -
tcp 0 0 127.0.0.1:6379 127.0.0.1:37602 ESTABLISHED 106 255585 -
tcp 0 0 127.0.0.1:37624 127.0.0.1:6379 ESTABLISHED 1000 254739 -
tcp 0 0 127.0.0.1:37594 127.0.0.1:6379 ESTABLISHED 1000 255578 -
tcp 0 0 127.0.0.1:6379 127.0.0.1:37594 ESTABLISHED 106 254527 -
tcp 0 0 127.0.0.1:37618 127.0.0.1:6379 ESTABLISHED 1000 255635 -
tcp 0 0 127.0.0.1:37586 127.0.0.1:6379 ESTABLISHED 1000 255168 -
tcp 0 0 127.0.0.1:6379 127.0.0.1:37622 ESTABLISHED 106 255645 -
tcp 0 0 127.0.0.1:6379 127.0.0.1:37624 ESTABLISHED 106 254740 -
tcp 0 0 127.0.0.1:37600 127.0.0.1:6379 ESTABLISHED 1000 254537 -
tcp 0 0 127.0.0.1:37610 127.0.0.1:6379 ESTABLISHED 1000 254563 -
tcp 0 0 127.0.0.1:3000 127.0.0.1:36678 TIME_WAIT 0 0 -
tcp 0 0 127.0.0.1:6379 127.0.0.1:37620 ESTABLISHED 106 255642 -
tcp 0 0 127.0.0.1:6379 127.0.0.1:37636 ESTABLISHED 106 254779 -
tcp 0 0 127.0.0.1:37628 127.0.0.1:6379 ESTABLISHED 1000 254751 -
tcp 0 0 127.0.0.1:37590 127.0.0.1:6379 ESTABLISHED 1000 254361 -
tcp 0 0 127.0.0.1:6379 127.0.0.1:37626 ESTABLISHED 106 254746 -
tcp 0 0 127.0.0.1:6379 127.0.0.1:37586 ESTABLISHED 106 255169 -
tcp 0 0 127.0.0.1:6379 127.0.0.1:37590 ESTABLISHED 106 255173 -
tcp 0 0 127.0.0.1:37668 127.0.0.1:6379 ESTABLISHED 1000 254849 -
tcp 0 0 127.0.0.1:37622 127.0.0.1:6379 ESTABLISHED 1000 254582 -
tcp 0 0 127.0.0.1:37634 127.0.0.1:6379 ESTABLISHED 1000 254772 -
tcp 0 0 127.0.0.1:37602 127.0.0.1:6379 ESTABLISHED 1000 254541 -
tcp 0 0 127.0.0.1:6379 127.0.0.1:37630 ESTABLISHED 106 254760 -
tcp 0 0 127.0.0.1:37612 127.0.0.1:6379 ESTABLISHED 1000 255617 -
tcp 0 0 127.0.0.1:6379 127.0.0.1:37628 ESTABLISHED 106 254752 -
tcp 0 0 127.0.0.1:6379 127.0.0.1:37610 ESTABLISHED 106 255612 -
tcp 0 0 127.0.0.1:3000 127.0.0.1:36682 TIME_WAIT 0 0 -
tcp 0 0 127.0.0.1:6379 127.0.0.1:37632 ESTABLISHED 106 254767 -
tcp 0 0 127.0.0.1:37650 127.0.0.1:6379 ESTABLISHED 1000 254826 -
tcp 0 0 127.0.0.1:37608 127.0.0.1:6379 ESTABLISHED 1000 254559 -
tcp 0 0 127.0.0.1:37620 127.0.0.1:6379 ESTABLISHED 1000 255641 -
tcp 0 0 127.0.0.1:37636 127.0.0.1:6379 ESTABLISHED 1000 254778 -
tcp 0 0 127.0.0.1:37630 127.0.0.1:6379 ESTABLISHED 1000 254759 -
tcp 0 0 127.0.0.1:6379 127.0.0.1:37618 ESTABLISHED 106 255636 -
tcp6 0 0 :::5432 :::* LISTEN 105 255138 -
tcp6 0 0 :::6379 :::* LISTEN 106 255127 -
udp 0 0 127.0.0.1:54569 127.0.0.1:54569 ESTABLISHED 105 255145 -
udp 0 0 127.0.0.11:44019 0.0.0.0:* 0 254193 -
root@ubuntu-vm-dev-app:/#
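Most of that output is established redis connections, so it can be easier to reduce it to just the listening TCP ports. The following is a sketch, assuming the column layout shown above (local address in the fourth column, state in the sixth); `listening_ports` is a hypothetical name.

```shell
# Hypothetical filter: read netstat output on stdin and print the distinct
# TCP ports that are in the LISTEN state, in numeric order.
listening_ports() {
    awk '$1 ~ /^tcp/ && $6 == "LISTEN" { n = split($4, a, ":"); print a[n] }' \
        | sort -n -u
}

# Example, run inside the discourse container:
#   netstat -peanut | listening_ports
```

If port 80 is missing from the result, nginx inside the container is not listening on TCP, which is the symptom described at the start of this post.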
nc
Use nc to check that the listening ports are configured correctly and actually listening (although netstat should show the same information).
root@ubuntu-vm-dev-app:/# nc -zv localhost 1-10000 2>&1 | grep -i succeeded
Connection to localhost (127.0.0.1) 80 port [tcp/http] succeeded!
Connection to localhost (127.0.0.1) 3000 port [tcp/*] succeeded!
Connection to localhost (127.0.0.1) 5432 port [tcp/postgresql] succeeded!
Connection to localhost (127.0.0.1) 6379 port [tcp/redis] succeeded!
1.3 Processes
Another useful command: ps aux
root@ubuntu-vm-dev-app:/# ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 6772 3252 pts/0 Ss+ 15:20 0:00 /bin/bash /sbin/boot
root 41 0.0 0.0 2340 636 pts/0 S+ 15:20 0:00 /usr/bin/runsvdir -P /etc/service
root 42 0.0 0.0 2188 696 ? Ss 15:20 0:00 runsv cron
root 43 0.0 0.0 2188 636 ? Ss 15:20 0:00 runsv rsyslog
root 44 0.0 0.0 2188 632 ? Ss 15:20 0:00 runsv nginx
root 45 0.0 0.0 2188 636 ? Ss 15:20 0:00 runsv postgres
root 46 0.0 0.0 2188 636 ? Ss 15:20 0:00 runsv redis
root 47 0.0 0.0 2188 636 ? Ss 15:20 0:00 runsv unicorn
root 48 0.0 0.0 151068 3868 ? Sl 15:20 0:00 rsyslogd -n
root 49 0.0 0.0 2336 700 ? S 15:20 0:00 svlogd /var/log/redis
root 50 0.0 0.0 6620 2768 ? S 15:20 0:00 cron -f
redis 51 0.1 0.0 54180 5828 ? Sl 15:20 0:02 /usr/bin/redis-server *:6379
root 52 0.0 0.0 2336 636 ? S 15:20 0:00 svlogd /var/log/postgres
postgres 53 0.0 0.3 213196 28676 ? S 15:20 0:00 /usr/lib/postgresql/13/bin/postmaster -D /etc/postgresql/13/main
discour+ 54 0.0 0.0 15256 3924 ? S 15:20 0:00 /bin/bash config/unicorn_launcher -E production -c config/unicorn.conf.rb
root 55 0.0 0.0 21280 7128 ? S 15:20 0:00 nginx: master process /usr/sbin/nginx
www-data 69 0.0 0.0 22156 5004 ? S 15:20 0:00 nginx: worker process
www-data 70 0.0 0.0 22124 4524 ? S 15:20 0:00 nginx: worker process
www-data 71 0.0 0.0 21656 3768 ? S 15:20 0:00 nginx: cache manager process
postgres 75 0.0 0.0 213296 6344 ? Ss 15:20 0:00 postgres: 13/main: checkpointer
postgres 76 0.0 0.0 213196 5896 ? Ss 15:20 0:00 postgres: 13/main: background writer
postgres 77 0.0 0.1 213196 10028 ? Ss 15:20 0:00 postgres: 13/main: walwriter
postgres 78 0.0 0.1 213736 8492 ? Ss 15:20 0:00 postgres: 13/main: autovacuum launcher
postgres 79 0.0 0.0 67960 5548 ? Ss 15:20 0:00 postgres: 13/main: stats collector
postgres 80 0.0 0.0 213752 6868 ? Ss 15:20 0:00 postgres: 13/main: logical replication launcher
discour+ 81 0.4 3.1 436276 253572 ? Sl 15:20 0:08 unicorn master -E production -c config/unicorn.conf.rb
postgres 99 0.0 0.3 220252 30612 ? Ss 15:20 0:00 postgres: 13/main: discourse discourse [local] idle
discour+ 116 0.6 3.7 831804 306708 ? SNl 15:20 0:12 sidekiq 6.4.1 discourse [0 of 5 busy]
discour+ 125 0.4 3.6 750356 297108 ? Sl 15:20 0:07 unicorn worker[0] -E production -c config/unicorn.conf.rb
discour+ 133 0.3 3.5 737812 289896 ? Sl 15:20 0:07 unicorn worker[1] -E production -c config/unicorn.conf.rb
postgres 147 0.0 0.3 219372 27792 ? Ss 15:20 0:00 postgres: 13/main: discourse discourse [local] idle
root 1858 0.0 0.0 7036 3828 pts/1 Ss 15:46 0:00 /bin/bash
postgres 1982 0.0 0.2 217012 24008 ? Ss 15:47 0:00 postgres: 13/main: discourse discourse [local] idle
discour+ 2236 0.0 0.0 13760 2216 ? S 15:51 0:00 sleep 1
root 2237 0.0 0.0 9636 3292 pts/1 R+ 15:51 0:00 ps aux
root@ubuntu-vm-dev-app:/#
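Note: I include the output above so you can spot anything abnormal or different between your deployment and mine. Alternatively, you can bring up a working deployment of your own (without a reverse proxy) and compare the following:
- Network
- Processes
- File creation/modification
- Log output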
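For the process comparison, one option is to save a normalized process list from each deployment and diff the two files. This is only a sketch: `snapshot_processes` is a hypothetical name, and it assumes the container ships a procps `ps` that accepts `-eo` and `--no-headers` (the `ps aux` output above suggests procps is present).

```shell
# Hypothetical helper: write a sorted, de-duplicated "user command" list
# for a container to a file. PIDs and timings are left out so that two
# snapshots can be compared directly.
snapshot_processes() {
    docker exec "$1" ps -eo user,args --no-headers | sort -u > "$2"
}

# Example:
#   snapshot_processes app mine.txt
#   diff known-good.txt mine.txt
```

Short-lived entries such as `sleep 1` or the `ps` call itself will differ between runs and can be ignored.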
My working configuration files
1. Example log output after running ./launcher rebuild app
+ /usr/bin/docker run --shm-size=512m -d --restart=always -e LANG=en_US.UTF-8 -e RAILS_ENV=production -e UNICORN_WORKERS=2 -e UNICORN_SIDEKIQS=1 -e RUBY_GLOBAL_METHOD_CACHE_SIZE=131072 -e RUBY_GC_HEAP_GROWTH_MAX_SLOTS=40000 -e RUBY_GC_HEAP_INIT_SLOTS=400000 -e RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=1.5 -e DISCOURSE_DB_SOCKET=/var/run/postgresql -e DISCOURSE_DB_HOST= -e DISCOURSE_DB_PORT= -e VIRTUAL_HOST={your_domain}.com -e LETSENCRYPT_HOST={your_domain}.com -e LETSENCRYPT_EMAIL=steve@{your_email_domain}.com -e LC_ALL=en_US.UTF-8 -e LANGUAGE=en_US.UTF-8 -e EMBER_CLI_PROD_ASSETS=1 -e DISCOURSE_HOSTNAME={your_domain}.com -e DISCOURSE_DEVELOPER_EMAILS=steve@{your_email_domain}.com -e DISCOURSE_SMTP_ADDRESS=smtp.mailgun.org -e DISCOURSE_SMTP_PORT=587 -e DISCOURSE_SMTP_USER_NAME=noreply@mail.{your_domain}.com -e DISCOURSE_SMTP_PASSWORD={your_password} -e DISCOURSE_SMTP_DOMAIN=mail.{your_domain}.com -e DISCOURSE_NOTIFICATION_EMAIL=noreply@mail.{your_domain}.com -h ubuntu-vm-dev-app -e DOCKER_HOST_IP=172.17.0.1 --name app -t -v /var/discourse/shared/standalone:/shared -v /var/discourse/shared/standalone/log/var-log:/var/log --mac-address 02:72:f8:ee:03:32 --network nginx-proxy local_discourse/app /sbin/boot
d3faf6489ca6617dceda4f2907ee6c055a1d81e3590c3eab2768601dfc0b60d7
ubuntu@ubuntu-vm-dev:/var/discourse$
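The important part of that command is `--network nginx-proxy`. After a rebuild, a quick way to confirm that the container really joined the network is to ask `docker inspect` for its network names. A minimal sketch, with a hypothetical function name `on_network`:

```shell
# Hypothetical check: succeed only if container $1 is attached to network $2.
on_network() {
    docker inspect \
        --format '{{range $name, $cfg := .NetworkSettings.Networks}}{{println $name}}{{end}}' \
        "$1" | grep -qxF -- "$2"
}

# Example:
#   on_network app nginx-proxy && echo "app is on nginx-proxy"
```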
2. My docker-compose file
ubuntu@ubuntu-vm-dev:~/nginx-test$ cat docker-compose.yml
version: '3'
services:
  nginx-proxy:
    image: jwilder/nginx-proxy
    container_name: nginx-proxy
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /var/run/docker.sock:/tmp/docker.sock:ro
      - letsencrypt-certs:/etc/nginx/certs
      - letsencrypt-vhost-d:/etc/nginx/vhost.d
      - letsencrypt-html:/usr/share/nginx/html
  letsencrypt-proxy:
    image: jrcs/letsencrypt-nginx-proxy-companion
    container_name: letsencrypt-proxy
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - letsencrypt-certs:/etc/nginx/certs
      - letsencrypt-vhost-d:/etc/nginx/vhost.d
      - letsencrypt-html:/usr/share/nginx/html
    environment:
      - DEFAULT_EMAIL=steve@{your_email_domain}.com
      - NGINX_PROXY_CONTAINER=nginx-proxy
networks:
  default:
    external:
      name: nginx-proxy
volumes:
  letsencrypt-certs:
  letsencrypt-vhost-d:
  letsencrypt-html:
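Because the compose file declares nginx-proxy as an external network, Compose will not create it, so it has to exist before the stack is started. The following is a minimal sketch; the function name `ensure_network` is hypothetical, and the default name is the one used in this post.

```shell
# Hypothetical helper: create the Docker network only if it does not exist yet.
ensure_network() {
    net=${1:-nginx-proxy}
    if docker network inspect "$net" >/dev/null 2>&1; then
        echo "network $net already exists"
    else
        docker network create "$net"
    fi
}

# Example, run before starting the compose stack:
#   ensure_network nginx-proxy
```

The same network name must then be used in the `--network` entry under docker_args in app.yml.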
3. My discourse app.yml file
ubuntu@ubuntu-vm-dev:/var/discourse$ cat containers/app.yml
templates:
  - "templates/postgres.template.yml"
  - "templates/redis.template.yml"
  - "templates/sshd.template.yml"
  - "templates/web.template.yml"
docker_args:
  - "--network nginx-proxy"
params:
  db_default_text_search_config: "pg_catalog.english"
  db_shared_buffers: "128MB"
env:
  VIRTUAL_HOST: {your_domain}.com
  LETSENCRYPT_HOST: example.com
  LETSENCRYPT_EMAIL: steve@{your_email_domain}.com
  LC_ALL: en_US.UTF-8
  LANG: en_US.UTF-8
  LANGUAGE: en_US.UTF-8
  EMBER_CLI_PROD_ASSETS: 1
  # DISCOURSE_DEFAULT_LOCALE: en
  UNICORN_WORKERS: 2
  DISCOURSE_HOSTNAME: example.com
  DISCOURSE_DEVELOPER_EMAILS: 'steve@{email_domain}.com'
  DISCOURSE_SMTP_ADDRESS: smtp.mailgun.org
  DISCOURSE_SMTP_PORT: 587
  DISCOURSE_SMTP_USER_NAME: noreply@mail.{domain.com}.com
  DISCOURSE_SMTP_PASSWORD: "{password}"
  #DISCOURSE_SMTP_ENABLE_START_TLS: true # (optional, default true)
  DISCOURSE_SMTP_DOMAIN: mail.{domain}.com
  DISCOURSE_NOTIFICATION_EMAIL: noreply@mail.{domain}.com
volumes:
  - volume:
      host: /var/discourse/shared/standalone
      guest: /shared
  - volume:
      host: /var/discourse/shared/standalone/log/var-log
      guest: /var/log
hooks:
  after_code:
    - exec:
        cd: $home/plugins
        cmd:
          - git clone https://github.com/discourse/docker_manager.git
          - git clone https://github.com/discourse/discourse-spoiler-alert.git
          - git clone https://github.com/discourse/discourse-solved.git
          - git clone https://github.com/discourse/discourse-cakeday.git
run:
  - exec: echo "Beginning of custom commands"
  - exec: echo "End of custom commands"
I hope this helps others who are stuck at the same step.
