I followed the advice above and set the hostname to the name on the certificate.
It’s worth noting that, in this case, the problem only seems to have occurred after a laucher-initiated rebuild, rather than merely on an upgrade. Perhaps a problem with the launcher scripts?
I may not be able to help you then, since my configuration doesn’t require TLS. I think the only thing to do is either use a third-party email provider that provides valid certs, or wait for a fix that allows bypassing this issue.
I have the exact same problem. It started yesterday, when I upgraded (via rebuild) to 2.9.0.beta4 (a5779a7d0b). I made NO changes to app.yml, or anything else. Just a rebuild.
I now have over 1,300 failed jobs.
I’m seeing SSL errors in the logs (see below for screenshots), and I’m wondering if the rebuild is suddenly ignoring the DISCOURSE_SMTP_ENABLE_START_TLS flag?
This is what I’ve “always” had in my app.yml file: (again, no changes have been made)
EDIT: This is what I see in the email logs for the host (the email server). The error messages are new, starting after the rebuild.
The last message regarding Discourse in the email logs before the rebuild:
May 23 17:16:02 localhost postfix/smtpd[5247]: connect from discourse-docker[172.17.0.2]
May 23 17:16:02 localhost postfix/smtpd[5247]: 0D803B67FB: client=discourse-docker[172.17.0.2]
May 23 17:16:02 localhost postfix/cleanup[5279]: 0D803B67FB: message-id=<topic/421230/2413438.f609f9d756c226a154de43f4@forums.jag-lovers.com>
May 23 17:16:02 localhost postfix/smtpd[5247]: disconnect from discourse-docker[172.17.0.2] ehlo=1 mail=1 rcpt=1 data=1 quit=1 commands=5
The first entry in the email logs on the server after the rebuild:
May 23 17:22:48 localhost postfix/smtpd[10929]: connect from discourse-docker[172.17.0.2]
May 23 17:22:48 localhost postfix/smtpd[10929]: SSL_accept error from discourse-docker[172.17.0.2]: -1
May 23 17:22:48 localhost postfix/smtpd[10929]: warning: TLS library problem: error:14094412:SSL routines:ssl3_read_bytes:sslv3 alert bad certificate:../ssl/record/rec_layer_s3.c:1528:SSL alert number 42:
May 23 17:22:48 localhost postfix/smtpd[10929]: lost connection after STARTTLS from discourse-docker[172.17.0.2]
May 23 17:22:48 localhost postfix/smtpd[10929]: disconnect from discourse-docker[172.17.0.2] ehlo=1 starttls=0/1 commands=1/2
After that time the entries for Discourse in the email logs all look like that.
I tried sending a message from inside the Discourse Docker container using curl. Once I made sure to specify plaintext SMTP and port 25, I can send email via the host just fine:
$ cd /var/discourse/
$ sudo ./launcher enter app
x86_64 arch detected.
root@discourse-app:/var/www/discourse# curl smtp://172.17.0.1 --mail-from discourse@mydomain.com --mail-rcpt myname@gmail.com --upload-file README.md
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 7077 0 0 100 7077 0 575k --:--:-- --:--:-- --:--:-- 575k
root@discourse-app:/var/www/discourse#
And this is what that test looked like in the host’s email logs:
May 24 16:53:49 localhost postfix/smtpd[25494]: connect from discourse-docker[172.17.0.2]
May 24 16:53:49 localhost postfix/smtpd[25494]: EB62CB5FCD: client=discourse-docker[172.17.0.2]
May 24 16:53:49 localhost postfix/cleanup[26008]: EB62CB5FCD: message-id=<>
May 24 16:53:49 localhost opendkim[1365]: EB62CB5FCD: can't determine message sender; accepting
May 24 16:53:49 localhost postfix/smtpd[25494]: disconnect from discourse-docker[172.17.0.2] ehlo=1 mail=1 rcpt=1 data=1 quit=1 commands=5
Given that I have specified no TLS and port 25 in my app.yml, and this worked until the rebuild yesterday, it’s looking more and more like the latest Discourse is ignoring my SMTP configuration in app.yml.
@gunnar I moved your post here since this is the email issue you’re describing.
I am not sure if the “post has already been taken” error is also being caused by this, but the details you gave about your email belong to this issue.
That is frustrating, but it looks to me like some gem no longer supports ignoring b invalid certs and/or unencrypted transport. It may just be the case that the days of being able to send mail that way are over. But I’m not experiencing the problem myself, so I haven’t looked carefully enough to know if I’m right.