Email Hostname Certificate Mismatch Causing Sidekiq Queue Overload, Severe Site Instability

I followed the advice above and set the hostname to the name on the certificate.

It’s worth noting that, in this case, the problem only seems to have occurred after a launcher-initiated rebuild, rather than merely on an upgrade. Perhaps a problem with the launcher scripts?

2 Likes

Can you please tell me how you did it?
I’m going crazy: I can’t use an SMTP server on port 25 or 587 without SSL and TLS.

Thanks

1 Like

I may not be able to help you then, since my configuration doesn’t require TLS. I think the only thing to do is either use a third-party email provider that provides valid certs, or wait for a fix that allows bypassing this issue.

1 Like

Did you try Richard’s dig command to find a hostname for your SMTP server for which it has a certificate?

1 Like

Mine is also without TLS and SSL 🙂

1 Like

Similar issue here: Can't Send Emails - #14 by sukria.
Did something change in the base image or in an external library or gem?

6 Likes

Yes that’s right, it’s the same problem … it started about two weeks ago.

1 Like

Can you try both

DISCOURSE_SMTP_ENABLE_START_TLS: false 
DISCOURSE_SMTP_OPENSSL_VERIFY_MODE: none

?

2 Likes

Those are the first things I tried, but I still get the same error:

SSL_connect returned=1 errno=0 state=error: certificate verify failed (Hostname mismatch)

1 Like

Hey, I tried it with both the options. It still doesn’t work:

  DISCOURSE_SMTP_ADDRESS: REDACTED
  DISCOURSE_SMTP_PORT: 25
  DISCOURSE_SMTP_USER_NAME: REDACTED
  DISCOURSE_SMTP_PASSWORD: REDACTED
  DISCOURSE_SMTP_ENABLE_START_TLS: false           # (optional, default true)
  DISCOURSE_SMTP_OPENSSL_VERIFY_MODE: none
  DISCOURSE_SMTP_AUTHENTICATION: "login"

I still get certificate verify failed (self signed certificate).

2 Likes

For me it has been a blocking bug for a long time …
I recommend creating a new temporary email address that has SMTP TLS support.

Could this be related to this gem?

4 Likes

I have the exact same problem. It started yesterday, when I upgraded (via rebuild) to 2.9.0.beta4 (a5779a7d0b). I made NO changes to app.yml, or anything else. Just a rebuild.

I now have over 1,300 failed jobs.

I’m seeing SSL errors in the logs (see below for screenshots), and I’m wondering if the rebuild is suddenly ignoring the DISCOURSE_SMTP_ENABLE_START_TLS flag?

This is what I’ve “always” had in my app.yml file (again, no changes have been made):

  DISCOURSE_SMTP_ADDRESS: 172.17.0.1
  DISCOURSE_SMTP_PORT: 25
  DISCOURSE_SMTP_AUTHENTICATION: none
  DISCOURSE_SMTP_ENABLE_START_TLS: false           # (optional, default true)

EDIT: This is what I see in the email logs for the host (the email server). The error messages are new, starting after the rebuild.

The last message regarding Discourse in the email logs before the rebuild:

May 23 17:16:02 localhost postfix/smtpd[5247]: connect from discourse-docker[172.17.0.2]
May 23 17:16:02 localhost postfix/smtpd[5247]: 0D803B67FB: client=discourse-docker[172.17.0.2]
May 23 17:16:02 localhost postfix/cleanup[5279]: 0D803B67FB: message-id=<topic/421230/2413438.f609f9d756c226a154de43f4@forums.jag-lovers.com>
May 23 17:16:02 localhost postfix/smtpd[5247]: disconnect from discourse-docker[172.17.0.2] ehlo=1 mail=1 rcpt=1 data=1 quit=1 commands=5

The first entry in the email logs on the server after the rebuild:

May 23 17:22:48 localhost postfix/smtpd[10929]: connect from discourse-docker[172.17.0.2]
May 23 17:22:48 localhost postfix/smtpd[10929]: SSL_accept error from discourse-docker[172.17.0.2]: -1
May 23 17:22:48 localhost postfix/smtpd[10929]: warning: TLS library problem: error:14094412:SSL routines:ssl3_read_bytes:sslv3 alert bad certificate:../ssl/record/rec_layer_s3.c:1528:SSL alert number 42:
May 23 17:22:48 localhost postfix/smtpd[10929]: lost connection after STARTTLS from discourse-docker[172.17.0.2]
May 23 17:22:48 localhost postfix/smtpd[10929]: disconnect from discourse-docker[172.17.0.2] ehlo=1 starttls=0/1 commands=1/2

After that time the entries for Discourse in the email logs all look like that.

Screenshots:

4 Likes

I tried sending a message from inside the Discourse Docker container using curl. Once I made sure to specify plaintext SMTP and port 25, I can send email via the host just fine:

$ cd /var/discourse/
$ sudo ./launcher enter app
x86_64 arch detected.
root@discourse-app:/var/www/discourse# curl smtp://172.17.0.1 --mail-from discourse@mydomain.com --mail-rcpt myname@gmail.com --upload-file README.md
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  7077    0     0  100  7077      0   575k --:--:-- --:--:-- --:--:--  575k
root@discourse-app:/var/www/discourse#

And this is what that test looked like in the host’s email logs:

May 24 16:53:49 localhost postfix/smtpd[25494]: connect from discourse-docker[172.17.0.2]
May 24 16:53:49 localhost postfix/smtpd[25494]: EB62CB5FCD: client=discourse-docker[172.17.0.2]
May 24 16:53:49 localhost postfix/cleanup[26008]: EB62CB5FCD: message-id=<>
May 24 16:53:49 localhost opendkim[1365]: EB62CB5FCD: can't determine message sender; accepting
May 24 16:53:49 localhost postfix/smtpd[25494]: disconnect from discourse-docker[172.17.0.2] ehlo=1 mail=1 rcpt=1 data=1 quit=1 commands=5

Given that I have specified no TLS and port 25 in my app.yml, and this worked until the rebuild yesterday, it’s looking more and more like the latest Discourse is ignoring my SMTP configuration in app.yml.
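One way to check what the library itself does, without sending any mail: build a Net::SMTP session object (without opening a connection) and inspect its STARTTLS mode. This is only a rough sketch. The helper name starttls_mode is made up for illustration, the host and port are the ones from my app.yml above, and I am assuming that Net::SMTP#starttls? and Net::SMTP#disable_starttls behave this way in whichever net-smtp version ships in the container.

```ruby
require "net/smtp"

# Hypothetical helper: reports the STARTTLS mode a Net::SMTP session would use.
# Net::SMTP.new only builds the object; nothing is sent until #start is called.
def starttls_mode(host, port, force_plaintext: false)
  smtp = Net::SMTP.new(host, port)
  # Explicitly opt out of STARTTLS instead of relying on the library default.
  smtp.disable_starttls if force_plaintext
  # Returns false, :auto, or :always depending on library version and settings.
  smtp.starttls?
end

puts "library default:        #{starttls_mode('172.17.0.1', 25).inspect}"
puts "after disable_starttls: #{starttls_mode('172.17.0.1', 25, force_plaintext: true).inspect}"
```

If the first line prints anything other than false, the library will attempt STARTTLS on its own unless something explicitly turns it off, which would match what the postfix logs show.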

2 Likes

Bump, @pfaffman and/or @codinghorror.

Do you think we may have a bug here, or something else?

1 Like

@gunnar I moved your post here since this is the email issue you’re describing.
I am not sure if the “post has already been taken” error is also being caused by this, but the details you gave about your email belong to this issue.

2 Likes

It seems absurd to me that this problem still exists after 30 days…
I had to change my email provider to get my forum working again.

1 Like

That is frustrating, but it looks to me like some gem no longer supports ignoring invalid certs and/or unencrypted transport. It may just be the case that the days of being able to send mail that way are over. But I’m not experiencing the problem myself, so I haven’t looked carefully enough to know if I’m right.

2 Likes

Is there a way to “downgrade” discourse to an older working version (say 2.8.0 stable or 2.9.0 beta3) until this is worked out?

1 Like

I decided to spend another half hour digging into this, and I think I found the cause.

This seems to be related to the move to Rails 7, which updated net-smtp from 0.1.0 to 0.3.1, which changed the defaults.

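The way the smtp gem calls net-smtp never explicitly disables enable_starttls_auto or openssl_verify_mode; it only passes them on when they are enabled. Setting them to false or none therefore leaves the new net-smtp defaults in effect.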
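To illustrate, here is a toy model of that logic. It is not the actual code of either gem: effective_starttls and the two constants are names I made up, and the default values are my reading of the version change described above (no STARTTLS unless asked in the old version, automatic STARTTLS in the new one).

```ruby
# Toy model only: NOT the real code of the calling gem or of net-smtp.
# It models a caller that can switch STARTTLS on but has no branch to switch it off.
def effective_starttls(library_default, settings)
  mode = library_default # what the SMTP library does when never told otherwise
  if settings[:enable_starttls]
    mode = :always
  elsif settings[:enable_starttls_auto]
    mode = :auto
  end
  # There is no branch for `false`, so a disabled setting leaves the default untouched.
  mode
end

OLD_DEFAULT = false # assumed behaviour of net-smtp 0.1.0
NEW_DEFAULT = :auto # assumed behaviour of net-smtp 0.3.1

settings = { enable_starttls_auto: false } # what DISCOURSE_SMTP_ENABLE_START_TLS: false amounts to

puts effective_starttls(OLD_DEFAULT, settings).inspect # prints false
puts effective_starttls(NEW_DEFAULT, settings).inspect # prints :auto
```

With the old default, "not enabling" and "disabling" gave the same result, so nobody noticed the missing branch. With the new default, the same app.yml silently ends up with STARTTLS attempted against a server whose certificate then fails verification.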

Related: SMTP: allow disabling starttls_auto since it's now true by default in Ruby 3 by jeremy · Pull Request #1435 · mikel/mail · GitHub

10 Likes