Reducing backscatter in email interface?

This is a bit of a braindump, under the assumption there isn’t a “this already works if you click this checkbox” solution out there.

I’m experimenting with Discourse and its Mailing List Mode (which are both pretty great!), and I was thinking about some minor improvements with email backscatter.

I’m using the mail-receiver Docker container that Discourse offers, not POP3 polling. What I’m going to describe here only applies to handling one’s own incoming SMTP traffic.

The gist is this: Discourse is smart enough to reject incoming mail if the user isn’t allowed to post somewhere, or if the user wrote to a bogus email address. It sends a polite bounce email, but this has unintended consequences:

  • Spammers hitting the mail server will generate bounces, either for bogus addresses or real people that didn’t actually try to interact with Discourse, so you can get a flurry of backscatter, which is bad.
  • Legit email from legit users that made legit mistakes will get a legit bounce, and Gmail will quietly put it in the user’s spam folder (at least, it did for me).

A better solution would be to reject mail during the SMTP transaction, when reasonable to do so. Making that work will:

  • Reduce unnecessary email bounce traffic due to joe-job spammers, etc.
  • Prevent risk of damage to a domain’s spam reputation.
  • Give users a more visible and immediate notice that something went wrong (email providers will tend to put a THIS DIDN’T SEND email in the user’s inbox).

Even if we just use the mail-receiver Docker container that Discourse offers, this still needs some support from the Discourse instance to make this happen. I couldn’t find any API to do this, but it would be useful if we could query with the system API key for something like this:

  • Can a@b.c send mail to x@y.z?

And this returns a boolean (yes or no on whether mail is acceptable) and maybe a simple string of why (“x@y.z isn’t a valid reply-to-thread slug,” “x@y.z isn’t a create-new-topic address,” “a@b.c isn’t allowed to post here,” etc.). It could even just be the existing “Email::Receiver::StrangersNotAllowedError” style error strings.

This could be more fine-grained (“is a@b.c a valid user?” “is x@y.z an available email target?”), but ultimately we want an API that says “should I drop this email right now or send it through to the system?” It doesn’t have to, and probably shouldn’t, deal with rejections for things that need further processing (“email is too short” rejections, etc.). We really just need to catch the obvious spam and malware vectors here.
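To make the shape of that question concrete, here is a minimal sketch of the yes/no-plus-reason check. It is not Discourse code: `Verdict`, `should_reject`, and the two lookup sets are hypothetical names, and a real implementation would consult the forum’s own users, categories, and reply keys instead of in-memory sets.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class Verdict:
    """Answer to "should this mail be dropped during the SMTP transaction?"."""
    reject: bool
    reason: Optional[str] = None


def should_reject(sender: str, recipient: str,
                  valid_targets: Set[str],
                  allowed_senders: Set[str]) -> Verdict:
    """Hypothetical check: only the cheap, obvious rejections are decided here.

    Anything that needs the message body (too short, etc.) is left for
    normal processing after the mail has been accepted.
    """
    if recipient.lower() not in valid_targets:
        return Verdict(True, f"{recipient} isn't a valid reply or new-topic address")
    if sender.lower() not in allowed_senders:
        return Verdict(True, f"{sender} isn't allowed to post here")
    return Verdict(False)
```

The recipient is checked before the sender so that mail to a bogus address is refused without saying anything about whether the sender has an account.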

Then that API call can be hooked up to /usr/local/bin/receive-mail in the mail-receiver Docker container, which can optionally return a permanent fail to the SMTP client instead of sending the email content to Discourse at all.

Does this sound like a reasonable approach? Does something like this exist already?


Nothing like that exists already. It wouldn’t be impossible to add, but it wouldn’t be trivial, either. Postfix has some excellent hooks for allowing SMTP-time checks to occur; we’re just not using them because mail-receiver was very much a “simplest thing that could possibly work” implementation.

If you’d like to work on this, I’m happy to provide guidance, code review, and testing. I’d like to see it happen, it’s just not something that I think anyone here at CDCK World HQ will have any bandwidth to work on any time soon.


I’ve posted a PR for the new API inside of Discourse here.

If I’m understanding Postfix correctly, we’d need something that talks over the Milter protocol or SMTP to properly reject email (it’s too late when /usr/local/bin/receive-mail runs). That seemed non-trivial, at least for the moment, so I’ll wait for feedback on the PR before worrying about that part.


I’ll leave the core PR to those more experienced in the app, but for the Postfix integration, yes, you need something that will talk milter, SMTP or Postfix policy protocol to Postfix and HTTP to Discourse, to translate. I’d strongly recommend going the Postfix policy route, as the protocol is easy to parse, and you can avoid the need to manage spawning your own daemons by having the Postfix master process do it for you.
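For anyone who hasn’t seen the policy protocol: Postfix sends a block of `name=value` lines terminated by an empty line, and expects `action=...` followed by an empty line in return. A minimal sketch of that loop follows. The function names and the `decide` callback are made up for illustration, and this is not the script discussed in this thread.

```python
def read_request(stream):
    """Read one policy request: name=value lines ended by an empty line.

    Returns a dict of attributes, or None when the stream hits EOF
    before a complete request has been read.
    """
    attrs = {}
    for line in stream:
        line = line.rstrip("\n")
        if line == "":
            return attrs
        name, sep, value = line.partition("=")
        if sep:
            attrs[name] = value
    return None


def format_reply(action):
    """A policy reply is a single action attribute plus an empty line."""
    return f"action={action}\n\n"


def serve(stdin, stdout, decide):
    """Answer requests until Postfix closes the connection.

    `decide` is a hypothetical callback that maps the attribute dict to an
    access(5)-style action such as "DUNNO" or "REJECT some text".
    """
    while True:
        attrs = read_request(stdin)
        if attrs is None:
            break
        stdout.write(format_reply(decide(attrs)))
        stdout.flush()

# When run under Postfix's spawn service, the socket is attached to
# stdin/stdout, so the entry point would be roughly:
#     serve(sys.stdin, sys.stdout, my_decide_function)
```

Replying `DUNNO` means “no opinion”, which lets any later restrictions decide; that makes it a reasonable default for anything the script can’t judge.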


or Postfix policy protocol

Oh wow, this policy protocol thing is so much less painful! Thanks for mentioning that. :)

Ok, I wrote a script to handle the SMTP portions of all this magic.

First, you’ll need the patch in the PR applied, and a sv restart unicorn or reboot or whatever to pick up the changes.

Then get this script:

discourse-smtp-fast-rejection.rb.gz (1.3 KB)

In the mail-receiver Docker container, you’ll want to put that file at /usr/local/bin/discourse-smtp-fast-rejection.rb, perhaps with a less verbose name if you want.

/etc/postfix/mail-receiver-environment.json in the mail-receiver Docker container needs a DISCOURSE_SMTP_SHOULD_REJECT_ENDPOINT entry, or just hardcode it in the script appropriately. It should look like this:

https://discourse.example.com/admin/email/smtp_should_reject.json

I don’t know how/where that JSON file gets generated. I ended up hardcoding it in my copy for now.

Add this to the end of /etc/postfix/master.cf

policy  unix  -       n       n       -       -       spawn
    user=nobody argv=/usr/local/bin/discourse-smtp-fast-rejection.rb

…and this to the end of /etc/postfix/main.cf

smtpd_recipient_restrictions = check_policy_service unix:private/policy

And run postfix reload inside the container, or just reboot or whatever.

Note that there’s a bug in my PR about reply-to addresses, but once that’s sorted out, this work might be complete. The mail-receiver container’s SMTP server is rejecting email appropriately before sending it on to Discourse. :)
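The glue between the two halves is small: take whatever the HTTP endpoint says and turn it into a policy action. The sketch below assumes a JSON body with a boolean `reject` and an optional `reason` string. That response shape is an assumption for illustration, not something confirmed in this thread, and `action_from_response` is a made-up name.

```python
import json


def action_from_response(status, body):
    """Map an HTTP status and body to a Postfix policy action.

    Assumed (not confirmed) response shape: {"reject": bool, "reason": str}.
    Anything unexpected fails open with DUNNO, so an outage or a bad
    response never causes legitimate mail to be refused.
    """
    if status != 200:
        return "DUNNO"
    try:
        data = json.loads(body)
    except ValueError:
        return "DUNNO"
    if isinstance(data, dict) and data.get("reject") is True:
        reason = str(data.get("reason") or "Message rejected")
        # Policy replies are line-oriented, so collapse the reason to one line.
        reason = " ".join(reason.split())
        return f"REJECT {reason}"
    return "DUNNO"
```

Failing open is a design choice: the worst case is that a message goes through to Discourse and gets the old-style bounce, which is no worse than the behaviour without fast rejection.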


A PR against mail-receiver would be greatly appreciated, once the core changes have gone in.


Okay, this is all sitting in various PRs now. You’ve probably gotten 12 emails with the links at this point :) but here they are one more time for the general public:

https://github.com/discourse/discourse/pull/4793

https://github.com/discourse/mail-receiver/pull/2

https://github.com/discourse/discourse_docker/pull/344


This looks pretty slick, nice work.

There is mention of testing… I’ve got a new Discourse instance that’s up and running now, but not scheduled to be in production for a week or so. I can offer it for real-world testing (and can grant access as needed).

As for the setup of this, looking at the new variable in the mail-receiver, I wonder:

If DISCOURSE_SMTP_SHOULD_REJECT_ENDPOINT is going to be optional, should that be commented out by default?

If it should be on by default, maybe there should be just one variable to set, like

DISCOURSE_ADMIN_URL

I’m suggesting an admin URL because it would have to take into account any subfolder. Using this, the paths to /email/handle_mail and /email/smtp_should_reject.json could be implied by the mail-receiver itself.


I was pondering that myself: there’s no need to list a whole bunch of URLs all over the place when the internal structure is known and fixed. @icculus, if you’re feeling frisky, that would be a nice change to make, although in the interests of getting things merged I wouldn’t consider it a blocker; it can always be done in the future.


This is a good idea, I’ll fix this up in the morning.

I’ve got a live instance running this already, and I’m trying to decide if there’s any reason I can’t just blow away my existing mail-receiver and build a new one to make sure I didn’t screw up the new parts of the Docker container configuration.

@mpalmer, I assume this line in mail-receiver.yml pulls in the contents of the discourse/mail-receiver GitHub repo from somewhere?

base_image: discourse/mail-receiver:1.0.0

Is there an easy way to override that, so it picks up my changes? Or is that literally just cloning from the GitHub repo when it sees that?


That’s a reference to the docker hub image that comes pre-built. You can build your own image based on your changes, by running docker build -t my-funky-mail-receiver . in the mail-receiver repo, and then change base_image to my-funky-mail-receiver in the mail-receiver.yml, and everything will Just Work.


If it should be on by default, maybe there should be just one variable to set, like
DISCOURSE_ADMIN_URL

I went further and changed it to DISCOURSE_BASE_URL; we can tack on the “admin” part too. :)

The PRs are updated with that now.
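As a rough illustration of what “tacking on the admin part” means, the sketch below derives both endpoints from a single base URL. The function name is hypothetical, and the two paths are taken from the URLs mentioned earlier in this thread rather than from the merged code.

```python
def endpoints_from_base(base_url):
    """Build both mail-receiver endpoints from one base URL.

    Plain string concatenation is used on purpose so that a subfolder
    install such as https://example.com/forum keeps its prefix.
    """
    base = base_url.rstrip("/")
    return {
        "handle_mail": f"{base}/admin/email/handle_mail",
        "smtp_should_reject": f"{base}/admin/email/smtp_should_reject.json",
    }
```

Concatenation is used here instead of `urllib.parse.urljoin` because joining an absolute path like `/admin/...` with `urljoin` would discard the subfolder part of the base URL.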


change base_image to my-funky-mail-receiver in the mail-receiver.yml, and everything will Just Work.

This worked like a champ, thank you; discourse.libsdl.org is up and running with a freshly built mail-receiver container, so we can say with some confidence that the latest PR is safe to merge.

You can send email to any address @discourse.libsdl.org to get a rejection to see it working on your end. :)


Should I work on this, or do you have it @icculus?


If you have time now, go for it!

Okay – just lemme know if you do before I do…I might do it if I get bored…maybe…

Hey, reviving an old thread here, because some form of the never-applied mail-receiver patch did eventually make it into revision control, and as things bitrot over time, my patched container stopped working at some point.

I blew away my mail-receiver container, set up a fresh container and API key, and I have incoming mail working on my Discourse instance again…but the system seems to be backscattering rejected mail like it’s 2017 again.

Sure enough, these rejection replies both came back to my inbox, even though the BadDestinationAddress one should have been rejected at the SMTP level before being sent on to Discourse for further processing and a reply email. If a spammer hits this server with a bogus email address, this would generate backscatter.

Talking directly to the SMTP server, I can see it doesn’t make any attempt to reject bogus emails.

root@discourse:/var/discourse# telnet localhost 25
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
220 ESMTP server
HELO sdfsdfsdf
250 discourse-mail-receiver.localdomain
MAIL FROM: sdfsdf@example.com
250 2.1.0 Ok
RCPT TO: sdfsdfsdf@discourse.libsdl.org
250 2.1.5 Ok

…all this to say: I can see the default mail-receiver image has a fast-rejection script hooked up, but it doesn’t seem to be rejecting things…?
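For anyone who wants to repeat that check without typing into telnet, here is a small sketch using Python’s standard `smtplib`. It stops after RCPT TO, so no message is ever sent. The function names and the HELO name are placeholders.

```python
import smtplib


def classify_rcpt(code):
    """Turn an SMTP reply code for RCPT TO into a plain-language verdict."""
    if 200 <= code < 300:
        return "accepted"
    if 400 <= code < 500:
        return "deferred"
    if 500 <= code < 600:
        return "rejected"
    return "unknown"


def probe(host, sender, recipient, port=25, timeout=10):
    """Run HELO / MAIL FROM / RCPT TO and report how RCPT TO was answered.

    No DATA command is issued; leaving the `with` block sends QUIT.
    """
    with smtplib.SMTP(host, port, timeout=timeout) as smtp:
        smtp.helo("probe.example.com")  # placeholder HELO name
        smtp.mail(sender)
        code, message = smtp.rcpt(recipient)
        return classify_rcpt(code), code, message.decode(errors="replace")
```

With fast rejection working, something like `probe("localhost", "sdfsdf@example.com", "sdfsdfsdf@discourse.libsdl.org")` would be expected to come back as "rejected" instead of the "accepted" that the 250 in the transcript above corresponds to.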

/etc/postfix/master.cf:

policy     unix  -       n       n       -       -       spawn user=nobody
    argv=/usr/local/bin/discourse-smtp-fast-rejection

/etc/postfix/main.cf:

smtpd_recipient_restrictions = check_policy_service unix:private/policy

Is there something I need to tweak to make this work, or is there some way to dig deeper into why it isn’t working? Is it working for other people?

Thanks!


Bump…is anyone else having this problem?

I am also seeing this. I noticed it when bogus spam recently used the reply-to address (replies+abc123@instance) as its From address, which triggered Discourse to send a series of automatic responses to itself.

From what I can tell, all of the fast rejection functionality was merged, and the readme for mail-receiver talks about using BLACKLISTED_SENDER_DOMAINS, which is part of fast rejection. However, fast rejection is disabled by default.

https://github.com/discourse/discourse_docker/pull/344/files

The above change was never merged, citing a need to investigate backwards compatibility. In the absence of DISCOURSE_BASE_URL, fast rejection is disabled by the following line.

The mail receiver app is written such that it handles either DISCOURSE_BASE_URL or DISCOURSE_MAIL_ENDPOINT, as shown below, and the readme doesn’t mention the latter. So I think it would be safe to apply the change from that pull request, using https rather than http to reflect a more recent change to the sample.

The change as proposed by the pull request will mean that fast rejection is enabled by default for new operators of mail-receiver, that is, operators who copy the sample as a starting point.

Existing operators will continue to have it disabled until the yml is manually updated to use DISCOURSE_BASE_URL. If my assertions here are accurate, perhaps something about this should be posted to #feature:announcements after the sample is updated.

It would also be straightforward to handle the fast rejection endpoint much like the handle_mail endpoint, that is, to infer it from DISCOURSE_MAIL_ENDPOINT if DISCOURSE_BASE_URL doesn’t exist.

Fast rejection can then be enabled by default for both new and existing operators. I would be happy to make a pull request for this change if enabling it for everyone (who rebuilds mail-receiver after the change) is desirable.
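To sketch what that inference could look like: prefer DISCOURSE_BASE_URL when it is set, and otherwise derive the endpoint from DISCOURSE_MAIL_ENDPOINT. This assumes the mail endpoint ends in `/admin/email/handle_mail`, which is an assumption on my part, and `reject_endpoint` is a made-up name, not the actual mail-receiver code.

```python
HANDLE_MAIL_SUFFIX = "/admin/email/handle_mail"
REJECT_SUFFIX = "/admin/email/smtp_should_reject.json"


def reject_endpoint(env):
    """Work out the fast-rejection URL from the container environment.

    Returns None when neither variable gives enough information, in which
    case fast rejection would simply stay disabled.
    """
    base = env.get("DISCOURSE_BASE_URL")
    if base:
        return base.rstrip("/") + REJECT_SUFFIX

    mail_endpoint = (env.get("DISCOURSE_MAIL_ENDPOINT") or "").rstrip("/")
    if mail_endpoint.endswith(HANDLE_MAIL_SUFFIX):
        # Swap the handle_mail path for the rejection path, keeping any
        # scheme, host, and subfolder that came before it.
        return mail_endpoint[: -len(HANDLE_MAIL_SUFFIX)] + REJECT_SUFFIX
    return None
```

Returning None for an unrecognised mail endpoint keeps existing setups working exactly as they do today, which seems like the safe side to err on for backwards compatibility.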
