Installation leaves a broken certificate: the file is zero bytes

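Say my site is example.com. It uses an FQDN and was working fine before. Because I had been debugging mail issues over the past 24 hours, I redeployed and rebuilt many times.

I installed Discourse following discourse/docs/INSTALL-cloud.md in the discourse/discourse repository on GitHub.

Now my site does not load (no response on port 80 or 443). The nginx log shows: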

2019/05/11 14:49:14 [emerg] 7866#7866: cannot load certificate "/shared/ssl/example.com.cer": PEM_read_bio_X509_AUX() failed (SSL: error:0906D06C:PEM routines:PEM_read_bio:no start line:Expecting: TRUSTED CERTIFICATE)

Entering the app container and looking at that file shows it is empty, 0 bytes:

-rw-r--r-- 1 root root    0 May 11 13:59 /shared/ssl/example.com.cer
-rw------- 1 root root 3243 May 11 13:59 /shared/ssl/example.com.key
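
A check along these lines could flag this state before nginx is started. It is only a sketch in Python and not part of Discourse or its launcher; the function name `looks_like_pem_cert` is hypothetical, and it only looks for a PEM header rather than actually parsing the certificate.

```python
from pathlib import Path

# PEM headers that nginx/OpenSSL would accept for a certificate file.
PEM_MARKERS = (
    b"-----BEGIN CERTIFICATE-----",
    b"-----BEGIN TRUSTED CERTIFICATE-----",
)


def looks_like_pem_cert(path):
    """Return True if the file exists, is non-empty, and contains a PEM certificate header."""
    p = Path(path)
    if not p.is_file() or p.stat().st_size == 0:
        return False
    data = p.read_bytes()
    return any(marker in data for marker in PEM_MARKERS)
```

For the listing above, `looks_like_pem_cert("/shared/ssl/example.com.cer")` would return False because the file is 0 bytes.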

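I am stuck and cannot find a solution, so I am asking here:

Can I trigger a certificate renewal with the built-in tooling of the Discourse Docker setup? If not, can I fix this with a one-time action so that later renewals are handled automatically by the setup, as designed?

Is there an installation log? I searched but found nothing. I expected to see errors related to Let’s Encrypt and would like to investigate further. Maybe I hit some limit.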

Are both the Let’s Encrypt and SSL templates loaded in your app.yml?

The easiest thing is usually to delete (or rename) your app.yml, make sure the container isn’t running, and run discourse-setup again.

Are you behind a reverse proxy or something like Cloudflare?

Yes.

Did that, same result. :frowning:

Nope, just a normal cloud server.

It’s possible that you hit the Let’s Encrypt rate limits, though if the cert were already there it wouldn’t be trying again.

You might try removing the ssl and letsencrypt directories in shared/standalone.

Done, rebuilt, same.

Is there really no log of the initial installation stored somewhere?

Did you unpublish :80 in some way? Either by commenting out the line in the expose: block, altering the firewall on the server, or something along those lines?

Nope, I did nothing manually to the configuration. The server is a plain, updated Ubuntu 18.04.

Ha! I did not know that ./launcher logs app would show much more than the production or nginx log.

Look at this beauty, I did indeed hit the rate limit:

run-parts: executing /etc/runit/1.d/letsencrypt
[Sat May 11 22:58:13 UTC 2019] Create account key ok.
[Sat May 11 22:58:13 UTC 2019] Registering account
[Sat May 11 22:58:15 UTC 2019] Registered
[Sat May 11 22:58:15 UTC 2019] ACCOUNT_THUMBPRINT='STRIPPED'
[Sat May 11 22:58:15 UTC 2019] Creating domain key
[Sat May 11 22:58:15 UTC 2019] The domain key is here: /shared/letsencrypt/example.com/example.com.key
[Sat May 11 22:58:15 UTC 2019] Single domain='example.com'
[Sat May 11 22:58:15 UTC 2019] Getting domain auth token for each domain
[Sat May 11 22:58:16 UTC 2019] Getting webroot for domain='example.com'
[Sat May 11 22:58:16 UTC 2019] Verifying: example.com
[Sat May 11 22:58:19 UTC 2019] Success
[Sat May 11 22:58:19 UTC 2019] Verify finished, start to sign.
[Sat May 11 22:58:19 UTC 2019] Lets finalize the order, Le_OrderFinalize: https://acme-v02.api.letsencrypt.org/acme/finalize/STRIPPED/STRIPPED
[Sat May 11 22:58:20 UTC 2019] Sign failed, finalize code is not 200.
[Sat May 11 22:58:20 UTC 2019] {
  "type": "urn:ietf:params:acme:error:rateLimited",
  "detail": "Error finalizing order :: too many certificates already issued for exact set of domains: example.com: see https://letsencrypt.org/docs/rate-limits/",
  "status": 429
}
[Sat May 11 22:58:20 UTC 2019] Please check log file for more details: /shared/letsencrypt/acme.sh.log
Error loading file ca.cer
140536865126040:error:02001002:system library:fopen:No such file or directory:bss_file.c:175:fopen('ca.cer','r')
140536865126040:error:2006D080:BIO routines:BIO_new_file:no such file:bss_file.c:178:
140536865126040:error:0B084002:x509 certificate routines:X509_load_cert_crl_file:system lib:by_file.c:253:
usage: verify [-verbose] [-CApath path] [-CAfile file] [-purpose purpose] [-crl_check] [-no_alt_chains] [-attime timestamp] [-engine e] cert1 cert2 ...
recognized usages:
	sslclient 	SSL client
	sslserver 	SSL server
	nssslserver	Netscape SSL server
	smimesign 	S/MIME signing
	smimeencrypt	S/MIME encryption
	crlsign   	CRL signing
	any       	Any Purpose
	ocsphelper	OCSP helper
	timestampsign	Time Stamp signing
[Sat May 11 22:58:21 UTC 2019] Single domain='example.com'
[Sat May 11 22:58:21 UTC 2019] Getting domain auth token for each domain
[Sat May 11 22:58:23 UTC 2019] Getting webroot for domain='example.com'
[Sat May 11 22:58:23 UTC 2019] example.com is already verified, skip http-01.
[Sat May 11 22:58:23 UTC 2019] Verify finished, start to sign.
[Sat May 11 22:58:23 UTC 2019] Lets finalize the order, Le_OrderFinalize: https://acme-v02.api.letsencrypt.org/acme/finalize/STRIPPED/STRIPPED
[Sat May 11 22:58:24 UTC 2019] Sign failed, finalize code is not 200.
[Sat May 11 22:58:24 UTC 2019] {
  "type": "urn:ietf:params:acme:error:rateLimited",
  "detail": "Error finalizing order :: too many certificates already issued for exact set of domains: example.com: see https://letsencrypt.org/docs/rate-limits/",
  "status": 429
}
[Sat May 11 22:58:24 UTC 2019] Please check log file for more details: /shared/letsencrypt/acme.sh.log
[Sat May 11 22:58:24 UTC 2019] Installing key to:/shared/ssl/example.com.key
[Sat May 11 22:58:24 UTC 2019] Installing full chain to:/shared/ssl/example.com.cer
cat: /shared/letsencrypt/example.com/fullchain.cer: No such file or directory
Started runsvdir, PID is 1928
ok: run: redis: (pid 1940) 0s
ok: run: postgres: (pid 1937) 0s
nginx: [emerg] cannot load certificate "/shared/ssl/example.com.cer": PEM_read_bio_X509_AUX() failed (SSL: error:0906D06C:PEM routines:PEM_read_bio:no start line:Expecting: TRUSTED CERTIFICATE)

/shared/letsencrypt/acme.sh.log is a bit more verbose but hey, this problem is clear enough now. I will salvage a previous cert from a backup and see if Discourse will pick it up on a rebuild.
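
The JSON body in the log above is a standard ACME problem document, so the rate-limit case can be recognized mechanically. The following is a minimal Python sketch, not something acme.sh or Discourse ships; `classify_acme_error` is a hypothetical helper, and the sample is the body copied from the log.

```python
import json

# Error type Let's Encrypt returned in the log above.
RATE_LIMITED = "urn:ietf:params:acme:error:rateLimited"


def classify_acme_error(body):
    """Parse an ACME problem document and return (is_rate_limited, detail)."""
    problem = json.loads(body)
    limited = problem.get("type") == RATE_LIMITED or problem.get("status") == 429
    return limited, problem.get("detail", "")


sample = """{
  "type": "urn:ietf:params:acme:error:rateLimited",
  "detail": "Error finalizing order :: too many certificates already issued for exact set of domains: example.com: see https://letsencrypt.org/docs/rate-limits/",
  "status": 429
}"""

limited, detail = classify_acme_error(sample)
if limited:
    print("Rate limited, do not retry yet:", detail)
```

A wrapper that saw `limited` come back True could stop instead of retrying the order a second time, as happened in the log.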

However, these lines hint that errors are not handled cleanly and instead bleed into the commands that follow:

usage: verify [-verbose] [-CApath path] [-CAfile file] [-purpose purpose] [-crl_check] [-no_alt_chains] [-attime timestamp] [-engine e] cert1 cert2 ...

and

cat: /shared/letsencrypt/example.com/fullchain.cer: No such file or directory

That should probably get some error handling?
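
As an illustration of the kind of guard meant here, the sketch below refuses to install a full chain that is missing or empty and replaces the destination atomically, so a failed issuance would leave the previous certificate in place. It is written in Python purely for illustration; the actual setup is a shell script driving acme.sh, and `install_fullchain` is a hypothetical name, not an existing function.

```python
import os
import shutil
import tempfile


def install_fullchain(src, dst):
    """Copy src to dst only if src exists and is non-empty.

    Returns False, leaving dst untouched, when there is nothing valid to install.
    """
    if not os.path.isfile(src) or os.path.getsize(src) == 0:
        return False
    dst_dir = os.path.dirname(os.path.abspath(dst))
    # Write to a temporary file in the same directory, then rename over dst,
    # so dst is never left truncated. Permissions stay at mkstemp's default.
    fd, tmp = tempfile.mkstemp(dir=dst_dir)
    os.close(fd)
    try:
        shutil.copyfile(src, tmp)
        os.replace(tmp, dst)
    except OSError:
        if os.path.exists(tmp):
            os.remove(tmp)
        raise
    return True
```

With a guard like this, the missing /shared/letsencrypt/example.com/fullchain.cer would have produced a False return rather than a zero-byte /shared/ssl/example.com.cer.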
