We’ve noticed on a couple of forums that use Discourse to mirror a public mailing list that some posts are being attributed to the wrong user:
from: [ruby-talk:444110] exif - photo metadata - ruby-talk - Ruby Mailing List Mirror
In this case, Discourse first staged a user with the name “Austin Ziegler via ruby-talk” with an email address matching the list submission address and that’s what shows up for every post like this.
from: txt.att.net outage? - #4 by Mailman - Mailman List Mirror (Read Only) - NANOG
In this case, Discourse first staged a user with the name “Mailman” with an email address matching the list submission address.
Upon investigation, our mail parsing is sometimes incorrect. The cause is that, for DMARC compliance, Mailman will sometimes change the From header to the list address and put the original sender into Reply-To (or, as in the Mailman 3 example below, leave it only in X-MailFrom and Cc):
To: Ryan Davis via ruby-talk
X-MailFrom: tom@tomsdomain.com
X-Mailman-Version: 3.3.3
Reply-To: Ruby users <ruby-talk@ml.ruby-lang.org>
From: Tom Reilly via ruby-talk <ruby-talk@ml.ruby-lang.org>
Cc: Tom Reilly <tom@tomsdomain.com>

To: Jared Mauch <jared@jaredsdomain.com>
X-BeenThere: nanog@nanog.org
X-Mailman-Version: 2.1.39
From: Owen DeLong via NANOG <nanog@nanog.org>
Reply-To: Owen DeLong <owen@owensdomain.com>
Cc: nanog <nanog@nanog.org>
but leave it when it doesn’t need to change:
To: Jon Lewis <jlewis@jonsdomain.org>
X-BeenThere: nanog@nanog.org
X-Mailman-Version: 2.1.39
From: William Herrin <bill@billsdomain.us>
Cc: nanog@nanog.org
There seem to be a lot of different options for behaviour here, so we’d like to come up with an algorithm that properly parses what Mailman sends out in every case.
There are potentially other options, for instance Mailman could post the unchanged message directly to a Discourse instance, but those are more complex to set up and may not be available to everyone.
Here’s the start of one:
- if mailman-version < 3
  - if any of:
    - From address matches List-Id
    - From address matches List-Post
    - From address matches X-BeenThere
  - then use Reply-To as From
- if mailman-version >= 3
  - if X-MailFrom exists
    - Use name from From header, stripping /via .*/
    - Use email from X-MailFrom
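To make the draft above easier to poke holes in, here is a rough sketch of it in Python using only the standard library `email` module. This is not Discourse's actual parser (which is Ruby); the function names `is_list_address` and `original_sender` are made up for illustration. It also assumes that "matches List-Id" means the From address with `@` swapped for `.` equals the List-Id value, and that a Reply-To containing several addresses should yield the first one that is not the list itself.

```python
import re
from email import message_from_string
from email.utils import getaddresses, parseaddr

# Slightly stricter than /via .*/ so that a name such as "Olivia" is untouched.
VIA_SUFFIX = re.compile(r"\s+via\s.*$")


def is_list_address(msg, addr):
    """Return True if addr looks like the list's own posting address."""
    addr = addr.lower()
    if not addr:
        return False
    for header in ("List-Post", "X-BeenThere"):
        # List-Post usually looks like "<mailto:list@example.org>".
        value = msg.get(header, "").replace("mailto:", "")
        if parseaddr(value)[1].lower() == addr:
            return True
    # Assumption: List-Id "<list.example.org>" corresponds to list@example.org.
    match = re.search(r"<([^>]+)>", msg.get("List-Id", ""))
    return bool(match) and match.group(1).lower() == addr.replace("@", ".")


def original_sender(raw):
    """Return (name, email) of the person who most likely wrote the message."""
    msg = message_from_string(raw)
    name, addr = parseaddr(msg.get("From", ""))
    version = msg.get("X-Mailman-Version", "").strip()
    major = int(version.split(".")[0]) if version[:1].isdigit() else None

    if major is not None and major < 3:
        if is_list_address(msg, addr):
            # Mailman 2.1 DMARC munging: the real sender is in Reply-To.
            for r_name, r_addr in getaddresses([msg.get("Reply-To", "")]):
                if r_addr and not is_list_address(msg, r_addr):
                    return r_name, r_addr
    elif major is not None and major >= 3:
        mail_from = parseaddr(msg.get("X-MailFrom", ""))[1]
        if mail_from:
            # Mailman 3: keep the display name, take the address from X-MailFrom.
            return VIA_SUFFIX.sub("", name), mail_from

    # Not rewritten (or not recognisably Mailman): trust From as-is.
    return name, addr
```

Run against the three header samples above, this should give Tom Reilly, Owen DeLong and William Herrin with their own addresses rather than the list address. It falls back to the plain From header whenever none of the rules apply, so a message that was never rewritten is left alone.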
Also, when all this is said and done, is it possible to have a rake task re-process existing posts (probably only the ones matching the erroneous user) with this new logic?