Reducing backscatter in email interface?

This is a bit of a braindump, under the assumption there isn’t a “this already works if you click this checkbox” solution out there.

I’m experimenting with Discourse and its Mailing List Mode (which are both pretty great!), and I was thinking about some minor improvements with email backscatter.

I’m using the mail-receiver Docker container that Discourse offers, not POP3 polling. What I’m going to describe here only applies to handling one’s own incoming SMTP traffic.

The gist is this: Discourse is smart enough to reject incoming mail if the user isn’t allowed to post somewhere, or if the user wrote to a bogus email address. It sends a polite bounce email, but this has unintended consequences:

  • Spammers hitting the mail server will generate bounces, either for bogus addresses or real people that didn’t actually try to interact with Discourse, so you can get a flurry of backscatter, which is bad.
  • Legit email from legit users that made legit mistakes will get a legit bounce, and Gmail will quietly put it in the user’s spam folder (at least, it did for me).

A better solution would be to reject mail during the SMTP transaction, when reasonable to do so. Making that work will:

  • Reduce unnecessary email bounce traffic due to joe-job spammers, etc.
  • Prevent risk of damage to a domain’s spam reputation.
  • Give users a more visible and immediate notice that something went wrong (email providers will tend to put a THIS DIDN’T SEND email in the user’s inbox).

Even with the mail-receiver Docker container that Discourse offers, this needs some support from the Discourse instance. I couldn’t find any API to do this, but it would be useful if we could query, using the system API key, something like this:

  • Can a@b.c send mail to x@y.z?

And this returns a boolean (yes or no on whether mail is acceptable) and maybe a simple string explaining why (“x@y.z isn’t a valid reply-to-thread slug,” “x@y.z isn’t a create-new-topic address,” “a@b.c isn’t allowed to post here,” etc.). It could even just be the existing “Email::Receiver::StrangersNotAllowedError” style error strings.

This could be more fine-grained (“is a@b.c a valid user?” “is x@y.z an available email target?”), but ultimately we want an API that says “should I drop this email right now or send it through to the system?” …it doesn’t have to, and probably shouldn’t, deal with rejections for things that need further processing (email is too short rejections, etc). We really just need to catch the obvious spam and malware vectors here.

Then that API call can be hooked up to /usr/local/bin/receive-mail in the mail-receiver Docker container, which can optionally return a permanent fail to the SMTP client instead of sending the email content to Discourse at all.
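As a rough sketch of the decision step described above, the snippet below maps a hypothetical answer from the proposed API (a boolean plus an optional reason string) to an SMTP reply line. The function name, the reply wording, and the choice of 550 as the permanent-failure code are assumptions for illustration; none of this is an existing Discourse API.

```python
def smtp_reply(should_reject: bool, reason: str = "") -> str:
    """Turn a hypothetical should-reject answer into an SMTP reply line.

    5xx replies are permanent failures, so the sending server reports the
    problem to its own user instead of Discourse mailing a bounce later.
    """
    if should_reject:
        # Collapse whitespace so the reason stays on a single reply line.
        text = " ".join(reason.split()) or "Message rejected"
        return f"550 {text}"
    return "250 Ok"


print(smtp_reply(True, "a@b.c isn't allowed to post here"))
print(smtp_reply(False))
```

Because the rejection happens inside the SMTP transaction, no separate bounce message is generated, which is what avoids the backscatter.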

Does this sound like a reasonable approach? Does something like this exist already?

9 likes

Nothing like that exists already. It wouldn’t be impossible to add, but it wouldn’t be trivial, either. Postfix has some excellent hooks for allowing SMTP-time checks to occur; we’re just not using them because mail-receiver was very much a “simplest thing that could possibly work” implementation.

If you’d like to work on this, I’m happy to provide guidance, code review, and testing. I’d like to see it happen, it’s just not something that I think anyone here at CDCK World HQ will have any bandwidth to work on any time soon.

9 likes

I’ve posted a PR for the new API inside of Discourse here.

If I’m understanding Postfix correctly, we’d need something that talks over the Milter protocol or SMTP to properly reject email (it’s too late when /usr/local/bin/receive-mail runs). That seemed non-trivial, at least for the moment, so I’ll wait for feedback on the PR before worrying about that part.

3 likes

I’ll leave the core PR to those more experienced in the app, but for the Postfix integration, yes, you need something that will talk milter, SMTP or Postfix policy protocol to Postfix and HTTP to Discourse, to translate. I’d strongly recommend going the Postfix policy route, as the protocol is easy to parse, and you can avoid the need to manage spawning your own daemons by having the Postfix master process do it for you.
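To show why the policy protocol is easy to parse, here is a minimal sketch of its framing: Postfix sends `name=value` lines ended by an empty line, and the policy service answers with a single `action=...` line followed by an empty line. The helper names are hypothetical, and this is not the script discussed in this thread; a real service would also ask Discourse over HTTP before choosing the action.

```python
import io


def read_policy_request(stream):
    """Read one request: name=value lines terminated by an empty line."""
    attrs = {}
    for line in stream:
        line = line.rstrip("\n")
        if line == "":
            return attrs
        name, _, value = line.partition("=")
        attrs[name] = value
    return None  # stream ended before a complete request arrived


def policy_response(reject, reason=""):
    """Format the reply: REJECT with optional text, or DUNNO (no opinion)."""
    if reject:
        return f"action=REJECT {reason}".rstrip() + "\n\n"
    return "action=DUNNO\n\n"


# Simulated request, using the placeholder addresses from earlier in the thread.
request = io.StringIO(
    "request=smtpd_access_policy\n"
    "protocol_state=RCPT\n"
    "sender=a@b.c\n"
    "recipient=x@y.z\n"
    "\n"
)
attrs = read_policy_request(request)
print(attrs["sender"], attrs["recipient"])
print(policy_response(True, "bad destination address"), end="")
```

When the service is started through the Postfix `spawn` daemon, the request arrives on standard input and the reply goes to standard output, so no socket handling is needed in the script itself.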

4 likes

or Postfix policy protocol

Oh wow, this policy protocol thing is so much less painful! Thanks for mentioning that. :slight_smile:

Ok, I wrote a script to handle the SMTP portions of all this magic.

First, you’ll need the patch in the PR applied, and a sv restart unicorn or reboot or whatever to pick up the changes.

Then get this script:

discourse-smtp-fast-rejection.rb.gz (1.3 KB)

In the mail-receiver Docker container, you’ll want to put that file at /usr/local/bin/discourse-smtp-fast-rejection.rb, perhaps with a less verbose name if you want.

/etc/postfix/mail-receiver-environment.json in the mail-receiver Docker container needs a DISCOURSE_SMTP_SHOULD_REJECT_ENDPOINT entry, or just hardcode it in the script appropriately. It should look like this:

https://discourse.example.com/admin/email/smtp_should_reject.json

I don’t know how/where that JSON file gets generated. I ended up hardcoding it in my copy for now.

Add this to the end of /etc/postfix/master.cf

policy  unix  -       n       n       -       -       spawn
    user=nobody argv=/usr/local/bin/discourse-smtp-fast-rejection.rb

…and this to the end of /etc/postfix/main.cf

smtpd_recipient_restrictions = check_policy_service unix:private/policy

And run postfix reload inside the container, or just reboot or whatever.

Note that there’s a bug in my PR about reply-to addresses, but once that’s sorted out, this work might be complete. The mail-receiver container’s SMTP server is rejecting email appropriately before sending it on to Discourse. :slight_smile:

4 likes

A PR against mail-receiver would be greatly appreciated, once the core changes have gone in.

6 likes

Okay, this is all sitting in various PRs now. You’ve probably gotten 12 emails with the links at this point, :), but here they are one more time for the general public:

4 likes

This looks pretty slick, nice work.

There is mention of testing. I’ve got a new Discourse instance that’s up and running now, but not scheduled to be in production for a week or so. I can offer it for real-world testing (and can grant access as needed).

As for the setup of this, looking at the new variable in the mail-receiver, I wonder:

If DISCOURSE_SMTP_SHOULD_REJECT_ENDPOINT is going to be optional, should that be commented out by default?

If it should be on by default, maybe there should be just one variable to set, like

DISCOURSE_ADMIN_URL

I’m suggesting admin URL as it would have to take into account any subfolder. Using this, the paths to /email/handle_mail and email/smtp_should_reject.json could be implied by the mail-receiver itself.

2 likes

I was pondering that myself; rather than having to list a whole bunch of URLs all over the place, when the internal structure is known and fixed. @icculus, if you’re feeling frisky that would be a nice change to make, although in the interests of getting things merged I wouldn’t consider it a blocker, it can always be done in the future.

3 likes

This is a good idea, I’ll fix this up in the morning.

I’ve got a live instance running this already, and I’m trying to decide if there’s any reason I can’t just blow away my existing mail-receiver and build a new one to make sure I didn’t screw up the new parts of the Docker container configuration.

@mpalmer, I assume this line in mail-receiver.yml pulls in the contents of the discourse/mail-receiver GitHub repo from somewhere?

base_image: discourse/mail-receiver:1.0.0

Is there an easy way to override that, so it picks up my changes? Or is that literally just cloning from the github repo when it sees that?

1 like

That’s a reference to the docker hub image that comes pre-built. You can build your own image based on your changes, by running docker build -t my-funky-mail-receiver . in the mail-receiver repo, and then change base_image to my-funky-mail-receiver in the mail-receiver.yml, and everything will Just Work.

1 like

If it should be on by default, maybe there should be just one variable to set, like
DISCOURSE_ADMIN_URL

I went further and changed it to DISCOURSE_BASE_URL; we can tack on the “admin” part too. :slight_smile:

The PRs are updated with that now.
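A minimal sketch of the idea, assuming the two paths mentioned in this thread (`/admin/email/handle_mail` and `/admin/email/smtp_should_reject.json`) are simply appended to the base URL. The function name is hypothetical and this is not the code from the PRs.

```python
def discourse_endpoints(base_url: str) -> dict:
    """Derive both mail-receiver endpoints from a single base URL."""
    # Strip trailing slashes so a subfolder install such as
    # https://example.com/forum/ still yields clean URLs.
    base = base_url.rstrip("/")
    return {
        "handle_mail": f"{base}/admin/email/handle_mail",
        "smtp_should_reject": f"{base}/admin/email/smtp_should_reject.json",
    }


endpoints = discourse_endpoints("https://discourse.example.com/")
print(endpoints["handle_mail"])
print(endpoints["smtp_should_reject"])
```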

3 likes

change base_image to my-funky-mail-receiver in the mail-receiver.yml, and everything will Just Work.

This worked like a champ, thank you; discourse.libsdl.org is up and running with a freshly built mail-receiver container, so we can say with some confidence that the latest PR is safe to merge.

You can send email to any address @discourse.libsdl.org to get a rejection to see it working on your end. :slight_smile:

4 likes

Should I work on this, or do you have it @icculus?

2 likes

If you have time now, go for it!

Okay – just lemme know if you do before I do…I might do it if I get bored…maybe…

Hi, I’m reviving an old thread, because some version of the mail-receiver patch that never got applied did eventually make it into revision control. Over time, as the code bitrotted, my modified container stopped working.

I deleted my mail-receiver container, built a new one, and created a new API key. Incoming mail to my Discourse instance works again… but the system seems to be sending bounce emails again, like it’s 2017.

Sure enough, both rejection replies came back to my inbox, even though the mail that triggered the BadDestinationAddress error should have been rejected at the SMTP level, before it was ever handed to Discourse for further processing and a reply. If a spammer sends mail to a nonexistent address on this server, it will generate backscatter.

Talking to the SMTP server directly, I can see that it doesn’t try to reject bogus mail.

root@discourse:/var/discourse# telnet localhost 25
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
220 ESMTP server
HELO sdfsdfsdf
250 discourse-mail-receiver.localdomain
MAIL FROM: sdfsdf@example.com
250 2.1.0 Ok
RCPT TO: sdfsdfsdf@discourse.libsdl.org
250 2.1.5 Ok
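The manual check above could also be scripted. The sketch below uses Python’s standard `smtplib` to run the same HELO / MAIL FROM / RCPT TO sequence and report how the server answered the recipient. The function names and the HELO name are assumptions for illustration, and the probe function needs a reachable SMTP server, so only the reply classifier is exercised here.

```python
import smtplib


def classify_reply(code: int) -> str:
    """Classify an SMTP reply code by its first digit."""
    if 200 <= code < 300:
        return "accepted"
    if 400 <= code < 500:
        return "temporary failure"
    if 500 <= code < 600:
        return "rejected"
    return "unexpected"


def probe_rcpt(host: str, sender: str, recipient: str, port: int = 25) -> str:
    """Send HELO, MAIL FROM and RCPT TO, then quit without sending any data."""
    with smtplib.SMTP(host, port, timeout=10) as smtp:
        smtp.helo("probe.example.com")  # hypothetical HELO name
        smtp.mail(sender)
        code, _message = smtp.rcpt(recipient)
    return classify_reply(code)


# The session above ended with "250 2.1.5 Ok" for a bogus recipient:
print(classify_reply(250))
```

With fast rejection working, a bogus recipient would be expected to produce a 5xx reply at the RCPT TO step instead.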

…all of which is to say that I can see the fast-rejection script is hooked up in the stock mail-receiver image, but it doesn’t seem to be rejecting anything…?

/etc/postfix/master.cf:

policy     unix  -       n       n       -       -       spawn user=nobody
    argv=/usr/local/bin/discourse-smtp-fast-rejection

/etc/postfix/main.cf:

smtpd_recipient_restrictions = check_policy_service unix:private/policy

Is there something I need to configure to make this work, or a way to dig deeper into why it isn’t working? Is this working for other people?

Thanks!

1 like

Bump… Is anyone else seeing this problem?

I’m seeing this problem too. I noticed it when recent spam used a reply address (replies+abc123@instance) in the From field, which caused Discourse to generate a series of automated replies to itself.

As far as I can tell, all of the fast-rejection functionality was merged, and the README for mail-receiver mentions using BLACKLISTED_SENDER_DOMAINS, which is part of fast rejection; however, fast rejection is disabled by default.

The change mentioned above was never merged, citing a need to check backwards compatibility, and because DISCOURSE_BASE_URL is missing, fast rejection is disabled by the following line:

The mail receiver is written so that it handles either DISCOURSE_BASE_URL or DISCOURSE_MAIL_ENDPOINT, as shown below, and the README doesn’t mention the latter, so I believe it would be safe to apply the change from that pull request, using https instead of http to reflect a more recent change to the sample.

The change proposed in the pull request means that fast rejection will be enabled by default for new mail-receiver operators, that is, for those who copy the sample as a starting point.

Existing operators will continue to have it disabled until the yml file is manually updated to use DISCOURSE_BASE_URL. If my claims here are accurate, this may be worth a post in #feature:announcements after the sample is updated.

It would also be fairly easy to do something very similar to the handle_mail endpoint for the fast-rejection endpoint, that is, to derive the endpoint from DISCOURSE_MAIL_ENDPOINT if DISCOURSE_BASE_URL is missing.

Then fast rejection could be enabled by default for both new and existing operators. I’d be happy to prepare a pull request for that change if enabling it for everyone (who rebuilds mail-receiver after the change) is desirable.
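The fallback described above could look roughly like the sketch below. It assumes that DISCOURSE_MAIL_ENDPOINT ends in `/handle_mail` and that the fast-rejection endpoint sits next to it as `smtp_should_reject.json`, as the URLs earlier in this thread suggest. The function name is hypothetical and this is not the mail-receiver’s actual code.

```python
def reject_endpoint(env: dict):
    """Pick the fast-rejection endpoint, preferring DISCOURSE_BASE_URL."""
    base = env.get("DISCOURSE_BASE_URL")
    if base:
        return base.rstrip("/") + "/admin/email/smtp_should_reject.json"

    # Fallback (assumption): swap the last path segment of the mail endpoint.
    mail_endpoint = env.get("DISCOURSE_MAIL_ENDPOINT", "")
    suffix = "/handle_mail"
    if mail_endpoint.endswith(suffix):
        return mail_endpoint[: -len(suffix)] + "/smtp_should_reject.json"

    return None  # neither variable is usable: fast rejection stays off


legacy = {"DISCOURSE_MAIL_ENDPOINT": "https://discourse.example.com/admin/email/handle_mail"}
print(reject_endpoint(legacy))
```

Returning `None` when neither variable is usable keeps the current behavior for that case, so the fallback only changes things for operators who already have a mail endpoint configured.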

2 likes