Apparently acme.sh, the script that discourse uses to obtain and renew letsencrypt certificates, changed its default certificate authority as of August 1st: it now uses a service called ZeroSSL by default instead of letsencrypt. My certificate didn’t expire until today, so I was only affected by the change today. Unfortunately, ZeroSSL apparently requires registering an account with an email address before it will work, so when my existing certificate expired this morning, the renewal failed and the site stopped working.
After quite a bit of investigation I figured out what the issue was, and while I’ve sort of circumvented it, I don’t think what I did to fix it is necessarily the “right” fix. First, here are the error messages that I was getting in the log:
[Wed 01 Sep 2021 05:33:58 PM UTC] Reload error for :
[Wed 01 Sep 2021 05:34:03 PM UTC] Using CA: https://acme.zerossl.com/v2/DV90
[Wed 01 Sep 2021 05:34:04 PM UTC] No EAB credentials found for ZeroSSL, let's get one
[Wed 01 Sep 2021 05:34:04 PM UTC] acme.sh is using ZeroSSL as default CA now.
[Wed 01 Sep 2021 05:34:04 PM UTC] Please update your account with an email address first.
[Wed 01 Sep 2021 05:34:04 PM UTC] acme.sh --register-account -m my@example.com
[Wed 01 Sep 2021 05:34:04 PM UTC] See: https://github.com/acmesh-official/acme.sh/wiki/ZeroSSL.com-CA
[Wed 01 Sep 2021 05:34:04 PM UTC] Please check log file for more details: /shared/letsencrypt/acme.sh.log
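To confirm which CA acme.sh actually contacted on a given run, the "Using CA" lines can be pulled out of the log. This is only a minimal sketch: last_ca_used is a hypothetical helper name, and the default path is simply the log file named in the error above.

```shell
# Hypothetical helper: print the CA URL from the most recent "Using CA:" line
# in an acme.sh log. The default path is the log file named in the error above.
last_ca_used() {
  local log="${1:-/shared/letsencrypt/acme.sh.log}"
  [ -r "$log" ] || { echo "cannot read $log" >&2; return 1; }
  grep 'Using CA: ' "$log" | tail -n 1 | sed 's/.*Using CA: //'
}
```

Run inside the container as last_ca_used with no arguments, it prints the URL from the latest such line, or nothing if the log has no "Using CA" entries.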
I tried manually registering the account from within the container, and while it said it registered, I got the same error after restarting the container. So I tracked down the script responsible, which is inside the container at /etc/runit/1.d/letsencrypt. Here is the original script:
#!/bin/bash
/usr/sbin/nginx -c /etc/nginx/letsencrypt.conf
issue_cert() {
  LE_WORKING_DIR="${LETSENCRYPT_DIR}" /shared/letsencrypt/acme.sh --issue $2 -d mudtoemanor.baronshire.org --keylength $1 -w /var/www/discourse/public
}
cert_exists() {
  [[ "$(cd /shared/letsencrypt/mudtoemanor.baronshire.org$1 && openssl verify -CAfile ca.cer fullchain.cer | grep "OK")" ]]
}
########################################################
# RSA cert
########################################################
issue_cert "4096"
if ! cert_exists ""; then
  # Try to issue the cert again if something goes wrong
  issue_cert "4096" "--force"
fi
LE_WORKING_DIR="${LETSENCRYPT_DIR}" /shared/letsencrypt/acme.sh \
  --installcert \
  -d mudtoemanor.baronshire.org \
  --fullchainpath /shared/ssl/mudtoemanor.baronshire.org.cer \
  --keypath /shared/ssl/mudtoemanor.baronshire.org.key \
  --reloadcmd "sv reload nginx"
########################################################
# ECDSA cert
########################################################
issue_cert "ec-256"
if ! cert_exists "_ecc"; then
  # Try to issue the cert again if something goes wrong
  issue_cert "ec-256" "--force"
fi
LE_WORKING_DIR="${LETSENCRYPT_DIR}" /shared/letsencrypt/acme.sh \
  --installcert --ecc \
  -d mudtoemanor.baronshire.org \
  --fullchainpath /shared/ssl/mudtoemanor.baronshire.org_ecc.cer \
  --keypath /shared/ssl/mudtoemanor.baronshire.org_ecc.key \
  --reloadcmd "sv reload nginx"
if cert_exists "" || cert_exists "_ecc"; then
  grep -q 'force_https' "/var/www/discourse/config/discourse.conf" || echo "force_https = 'true'" >> "/var/www/discourse/config/discourse.conf"
fi
/usr/sbin/nginx -c /etc/nginx/letsencrypt.conf -s stop
Since the email registration didn’t work, I decided to make acme.sh go back to using the original letsencrypt service instead of ZeroSSL. Instructions for doing that are provided in this thread: https://community.letsencrypt.org/t/the-acme-sh-will-change-default-ca-to-zerossl-on-august-1st-2021/144052
First I tried putting this as the third line in the script right after the “/usr/sbin/nginx -c /etc/nginx/letsencrypt.conf” statement:
/shared/letsencrypt/acme.sh --set-default-ca --server letsencrypt
The log showed a message that the default was changed, but right after that I got the same error about no email being registered for ZeroSSL. There was a delay of about 6 seconds in the log between that message and the error messages, which makes me think that acme.sh must be maintaining state information via environment variables, and that the subsequent executions of the command were running in a different context and thus lost the variable. So what I ended up having to do was change all the invocations of acme.sh in the letsencrypt script to add the operand “--server letsencrypt”. When I did that and restarted the container, a new certificate was generated by letsencrypt instead of ZeroSSL and the site came back up.
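Another possible explanation, untested here: the inserted line does not set LE_WORKING_DIR, unlike every other acme.sh call in the script, so the default may have been saved in a different directory from the one the later calls read. The sketch below compares the two candidate locations. It rests on several assumptions: that acme.sh records the setting as DEFAULT_ACME_SERVER in a file called account.conf, that LETSENCRYPT_DIR points at /shared/letsencrypt, and that the fallback location is ~/.acme.sh. show_default_ca is a hypothetical helper name.

```shell
# Hypothetical helper: show which default CA, if any, is recorded in an
# acme.sh home directory.
# Assumption: acme.sh stores the setting as DEFAULT_ACME_SERVER in account.conf.
show_default_ca() {
  local conf="$1/account.conf"
  if [ ! -f "$conf" ]; then
    echo "$1: no account.conf"
  elif ! grep '^DEFAULT_ACME_SERVER=' "$conf"; then
    echo "$1: no default CA recorded"
  fi
}

# Compare the directory the script passes as LE_WORKING_DIR (assumed here to be
# /shared/letsencrypt) with the assumed per-user fallback location.
show_default_ca "${LETSENCRYPT_DIR:-/shared/letsencrypt}"
show_default_ca "${HOME:-/root}/.acme.sh"
```

If the letsencrypt URL shows up only under the second directory, that would point to a working-directory mismatch rather than a lost environment variable.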
Here is the modified version of the letsencrypt script that I used:
#!/bin/bash
/usr/sbin/nginx -c /etc/nginx/letsencrypt.conf
/shared/letsencrypt/acme.sh --set-default-ca --server letsencrypt
issue_cert() {
  LE_WORKING_DIR="${LETSENCRYPT_DIR}" /shared/letsencrypt/acme.sh --server letsencrypt --issue $2 -d mudtoemanor.baronshire.org --keylength $1 -w /var/www/discourse/public
}
cert_exists() {
  [[ "$(cd /shared/letsencrypt/mudtoemanor.baronshire.org$1 && openssl verify -CAfile ca.cer fullchain.cer | grep "OK")" ]]
}
########################################################
# RSA cert
########################################################
issue_cert "4096"
if ! cert_exists ""; then
  # Try to issue the cert again if something goes wrong
  issue_cert "4096" "--force"
fi
LE_WORKING_DIR="${LETSENCRYPT_DIR}" /shared/letsencrypt/acme.sh \
  --installcert \
  -d mudtoemanor.baronshire.org \
  --fullchainpath /shared/ssl/mudtoemanor.baronshire.org.cer \
  --keypath /shared/ssl/mudtoemanor.baronshire.org.key \
  --server letsencrypt \
  --reloadcmd "sv reload nginx"
########################################################
# ECDSA cert
########################################################
issue_cert "ec-256"
if ! cert_exists "_ecc"; then
  # Try to issue the cert again if something goes wrong
  issue_cert "ec-256" "--force"
fi
LE_WORKING_DIR="${LETSENCRYPT_DIR}" /shared/letsencrypt/acme.sh \
  --installcert --ecc \
  -d mudtoemanor.baronshire.org \
  --fullchainpath /shared/ssl/mudtoemanor.baronshire.org_ecc.cer \
  --keypath /shared/ssl/mudtoemanor.baronshire.org_ecc.key \
  --server letsencrypt \
  --reloadcmd "sv reload nginx"
if cert_exists "" || cert_exists "_ecc"; then
  grep -q 'force_https' "/var/www/discourse/config/discourse.conf" || echo "force_https = 'true'" >> "/var/www/discourse/config/discourse.conf"
fi
/usr/sbin/nginx -c /etc/nginx/letsencrypt.conf -s stop
I also left the original command I had inserted to try to set the default service just in case, but it may not be necessary.
So this script needs one of three changes: explicitly specify the letsencrypt service on every invocation of acme.sh; figure out how acme.sh saves its state information so that a single invocation of the set-default command takes effect; or add support for ZeroSSL, including collecting and saving the email address it requires.
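For the second of those options, one thing that may be worth trying is giving the --set-default-ca call the same LE_WORKING_DIR prefix as the other calls. This is an untested sketch: set_default_ca and ACME_SH are hypothetical names, and it assumes acme.sh stores the default CA per working directory.

```shell
# Path to acme.sh as used in the Discourse script; can be overridden.
ACME_SH="${ACME_SH:-/shared/letsencrypt/acme.sh}"

# Hypothetical wrapper: set the default CA using the same LE_WORKING_DIR prefix
# that the --issue and --installcert calls use.
set_default_ca() {
  LE_WORKING_DIR="${LETSENCRYPT_DIR}" "$ACME_SH" --set-default-ca --server "$1"
}
```

Calling set_default_ca letsencrypt once near the top of the script would take the place of the bare --set-default-ca line. Whether that makes the per-command --server operands unnecessary has not been verified here.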
If none of that is done, I’m assuming my edits will be overwritten the next time I upgrade versions and I’ll have to make them again.
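A quick way to tell after an upgrade whether the edits survived is to look for the added operand in the script. This is a minimal sketch; has_letsencrypt_pin is a hypothetical helper name, and the default path is the script location given earlier.

```shell
# Hypothetical check: succeed if the runit script still contains the
# "--server letsencrypt" operand that was added by hand.
has_letsencrypt_pin() {
  local script="${1:-/etc/runit/1.d/letsencrypt}"
  grep -q -e '--server letsencrypt' "$script"
}
```

Inside the container, something like has_letsencrypt_pin || echo "edits are gone, reapply them" would flag an overwritten script before the next renewal fails.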
If I’ve missed anything here or this was somehow addressed in another way that doesn’t require a change to the scripts, please let me know.