Some context here: Emails have stopped sending - end of file error
Roughly a week ago (Jan 13, 2021), emails started failing to send through Google’s smtp-relay.gmail.com server (an allowed and intended use for Google Apps users).
Sidekiq reported the failures with EOFErrors:
Jobs::HandledExceptionWrapper: Wrapped EOFError: end of file reached
And /logs reported the failed tasks as well:
Job exception: end of file reached
Backtrace available in the other post.
===================
Investigation revealed that up-to-date Discourse installs connect to SMTP relays with ‘EHLO localhost’, and Google started rejecting that greeting roughly a week ago.
From tcpdump on a production instance:
0x0030: d10f f8e4 4548 4c4f 206c 6f63 616c 686f ....EHLO.localho
0x0040: 7374 0d0a st..
...
0x0030: de62 f0c3 3432 3120 342e 372e 3020 5472 .b..421.4.7.0.Tr
0x0040: 7920 6167 6169 6e20 6c61 7465 722c 2063 y.again.later,.c
0x0050: 6c6f 7369 6e67 2063 6f6e 6e65 6374 696f losing.connectio
0x0060: 6e2e 2028 4548 4c4f 2920 6a31 3673 6d34 n..(EHLO).j16sm4
0x0070: 3831 3932 3976 736d 2e31 202d 2067 736d 81929vsm.1.-.gsm
0x0080: 7470 0d0a tp..
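To double-check the capture without relying on tcpdump's ASCII column, the payload bytes can be decoded directly. This is a minimal sketch: the hex strings are copied from the dump above with the leading TCP header bytes dropped, and `decode_hex` is a helper name made up for this example. The two strings decode to the `EHLO localhost` line and the `421` reply.

```ruby
# Payload bytes copied from the capture above. The first four bytes of each
# 0x0030 row belong to the TCP header and are left out here.
REQUEST_HEX = "4548 4c4f 206c 6f63 616c 686f 7374 0d0a"

RESPONSE_HEX = <<~HEX
  3432 3120 342e 372e 3020 5472
  7920 6167 6169 6e20 6c61 7465 722c 2063
  6c6f 7369 6e67 2063 6f6e 6e65 6374 696f
  6e2e 2028 4548 4c4f 2920 6a31 3673 6d34
  3831 3932 3976 736d 2e31 202d 2067 736d
  7470 0d0a
HEX

# Hypothetical helper: strip all whitespace from a hex dump and turn the
# remaining hex digits back into the raw bytes they encode.
def decode_hex(dump)
  [dump.gsub(/\s+/, "")].pack("H*")
end

puts decode_hex(REQUEST_HEX).inspect
puts decode_hex(RESPONSE_HEX).inspect
```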
And replicating with telnet gives the same result:
root@conversation:~# telnet smtp-relay.gmail.com 587
Trying 74.125.137.28...
Connected to smtp-relay.gmail.com.
Escape character is '^]'.
220 smtp-relay.gmail.com ESMTP ls8sm507258pjb.6 - gsmtp
ehlo localhost.localdomain
421 4.7.0 Try again later, closing connection. (EHLO) ls8sm507258pjb.6 - gsmtp
Connection closed by foreign host.
However, a domain-specific ehlo works properly:
root@conversation:~# telnet smtp-relay.gmail.com 587
Trying 74.125.137.28...
Connected to smtp-relay.gmail.com.
Escape character is '^]'.
220 smtp-relay.gmail.com ESMTP p10sm668563uaw.3 - gsmtp
ehlo conversation.sevarg.net
250-smtp-relay.gmail.com at your service, [64.227.96.27]
250-SIZE 157286400
250-8BITMIME
250-STARTTLS
250-ENHANCEDSTATUSCODES
250-PIPELINING
250-CHUNKING
250 SMTPUTF8
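The two telnet sessions differ only in the reply to EHLO: a single 421 line for the generic hostname, and a multi-line 250 reply for the real one. As a rough sketch of how such reply lines are read (three-digit code, then `-` if more lines follow or a space on the last line), assuming nothing about how Net::SMTP or the mail gem parse them internally; `parse_reply_line` and `reply_class` are names invented for this example:

```ruby
# Split one SMTP reply line into its numeric code, a flag saying whether more
# lines follow ("-" after the code), and the remaining text.
def parse_reply_line(line)
  m = line.chomp.match(/\A(\d{3})([ -])(.*)\z/)
  return nil unless m
  { code: m[1].to_i, more: m[2] == "-", text: m[3] }
end

# Classify a reply code by its first digit.
def reply_class(code)
  case code / 100
  when 2 then :success
  when 3 then :intermediate
  when 4 then :transient_failure
  when 5 then :permanent_failure
  else :unknown
  end
end

rejected = parse_reply_line("421 4.7.0 Try again later, closing connection. (EHLO) ls8sm507258pjb.6 - gsmtp")
accepted = parse_reply_line("250-SIZE 157286400")

puts reply_class(rejected[:code])
puts reply_class(accepted[:code])
```

The 421 falls in the transient-failure class, which fits Google's "Try again later" wording even though retrying with the same EHLO string fails every time.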
======
Based on the logs, I identified the file to modify to test the fix (in the docker image):
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/mail-2.7.1/lib/mail/network/delivery_methods/smtp.rb
Changing
DEFAULTS = {
:address => 'localhost',
:port => 25,
:domain => 'localhost.localdomain',
to
DEFAULTS = {
:address => 'conversation.sevarg.net',
:port => 25,
:domain => 'conversation.sevarg.net',
resolved the issue (after an instance restart). The EHLO is now sent with the domain string, and emails now send properly from my instance.
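A less invasive route than editing the gem may be to pass :domain explicitly from the application's SMTP settings. The sketch below assumes the gem merges caller-supplied settings over DEFAULTS, which is worth verifying against the installed version. The hash here is a cut-down stand-in with only the keys quoted above, and `effective_smtp_settings` is a hypothetical helper, not part of the mail gem.

```ruby
# Stand-in for the gem's DEFAULTS hash, limited to the keys quoted above.
DEFAULTS = {
  :address => 'localhost',
  :port    => 25,
  :domain  => 'localhost.localdomain',
}.freeze

# Hypothetical helper: caller-supplied settings take precedence over defaults
# (assumed behavior, modelled here with a plain Hash#merge).
def effective_smtp_settings(overrides)
  DEFAULTS.merge(overrides)
end

settings = effective_smtp_settings(
  :address => 'smtp-relay.gmail.com',
  :port    => 587,
  :domain  => 'conversation.sevarg.net'
)

# The :domain value is what ends up in the EHLO greeting.
puts "EHLO #{settings[:domain]}"
```

If that assumption holds, an unset :domain falls back to 'localhost.localdomain', while any explicit value replaces it without touching the vendored file.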
================
Desired behavior: When sending email, the default Discourse install uses the configured domain name in the EHLO it sends to the SMTP server. Alternatively, a configuration option exists to override the domain sent. If such an option already exists, I was unable to find it by searching.