Discourse email messages are incorrectly threaded

cameron-simpson · July 21, 2022, 6:50am

Over at discuss.python.org we’re discussing the email side of Discourse. The biggest gripe is the lack of threading. I did a bit of digging in the headers and it seems that:

the Message-ID header is at least unique
the In-Reply-To and References headers do not refer to the Message-IDs of other messages, let alone to the message id of the message to which they are a reply
they instead refer to some fictitious message id based on the topic number

This means that people using email see (a) totally flat unthreaded discussions and (b) the root message is apparently missing, because the In-Reply-To and References headers refer to a message id which never actually appears on any message.

This is bad, and in violation of RFC 5322. And it makes the email experience far poorer than it could easily be.

As an example, there’s a thread over there whose first message has these headers:

Message-ID: <topic/17208.dc83577b18fc3ecc438ed42a@discuss.python.org>
References: <topic/17208@discuss.python.org>

It is the first message. It should not have a References header, because there’s no message anywhere with that id.

The second message has these headers:

Message-ID: <topic/17208/60568.898edf234f56cf6f3a661c1a@discuss.python.org>
In-Reply-To: <topic/17208@discuss.python.org>
References: <topic/17208@discuss.python.org>

Again, an ok Message-ID, but completely nonsensical In-Reply-To and References.

This should be easy to fix. The first message should have neither In-Reply-To nor References headers. The second message should have the first message’s Message-ID in the In-Reply-To and References headers.
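
As a rough sketch of the headers described above, here is how the first three messages of a topic could be built with Python's standard `email.message.EmailMessage`. The message-ids and the `make_message` helper are hypothetical and only shaped like the ones quoted earlier; this is not Discourse's code.

```python
from email.message import EmailMessage

def make_message(message_id, parent=None):
    """Build a message; reply headers are set only when a parent exists."""
    msg = EmailMessage()
    msg["Message-ID"] = message_id
    if parent is not None:
        parent_id = str(parent["Message-ID"])
        # References = the parent's References (if any) + the parent's id.
        chain = str(parent["References"] or "").split() + [parent_id]
        msg["In-Reply-To"] = parent_id
        msg["References"] = " ".join(chain)
    return msg

# Hypothetical ids, shaped like the ones quoted above.
first = make_message("<topic/17208/1@discuss.example.org>")
second = make_message("<topic/17208/2@discuss.example.org>", parent=first)
third = make_message("<topic/17208/3@discuss.example.org>", parent=second)
```

The first message ends up with neither In-Reply-To nor References, and each reply cites only ids that really exist on earlier messages.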

Please see RFC5322 section 3.6.4 for specifics:
https://tools.ietf.org/rfcmarkup/5322#section-3.6.4

As things are, email users see flat unstructured discussions. With these fixes, they can have sensible easy to follow threaded display.

ericsnowcurrently · July 21, 2022, 3:26pm

In case anyone is interested, the archive of the discussion to which Cameron is referring is found at https://mail.python.org/archives/list/python-dev@python.org/message/VHFLDK43DSSLHACT67X4QA3UZU73WYYJ/.

RGJ · July 21, 2022, 3:59pm

That seems to be a regression, see this old topic and the fix.

cameron-simpson · July 21, 2022, 10:50pm

Just having a look at the diff between HEAD and that fix.

It seems to me that the current code still always sets References, even if there’s no antecedent - the topic_canonical_reference_id is used as a fallback. I still think that’s wrong, because there is no email message with that id.

The In-Reply-To is a little more correct, in that it is only set if post.post_number!=1, but it still falls back to topic_canonical_reference_id:

@message.header['In-Reply-To'] = referenced_post_message_ids[0] || topic_canonical_reference_id

This seems to have 2 problems to my eye:

the fallback should be the Message-ID of post #1 if there are no referenced_post_message_ids, and not topic_canonical_reference_id
something in the receipt-of-reply-emails code must be dropping the In-Reply-To header of the reply messages, because they should have correctly populated the referenced_post_message_ids array (“list”? I’m new to Ruby)

martin · July 22, 2022, 12:25am

Cameron, thanks for opening up this topic for discussion and providing a lot of detail in your posts. I am responsible for this can of worms, from these two commits:

https://github.com/discourse/discourse/commit/3b13f1146b2a406238c50d6b45bc9aa721094f46

https://github.com/discourse/discourse/commit/82cb67e67b83c444f068fd6b3006d8396803454f

We have been aware of some issues around threading for a little while now in email clients such as Thunderbird but it has not represented a large number of consumers of email threading from Discourse so it’s been punted on, but now this is coming to light we need to spend some time reexamining the issue and working on a fix.

Interestingly, we added this References header to the first sent email and every subsequent one at the time since it makes threading work correctly in Gmail, but I agree it’s not ideal and is likely causing the threading issues along with not using the original Message-ID in the subsequent email In-Reply-To and References headers.

Please bear with me as I look through old discussions and the code and work through this. In the meantime, are you aware of other email clients that are being used and are experiencing issues? For example I know that this is an issue in Thunderbird, but what about any others? Thanks.

cameron-simpson · July 22, 2022, 3:27am

Wrote a long reply, but got:

We're sorry, but your email message to 
["incoming+8349bd9eb1f2b582df4f32dbe85c3363@meta.discoursemail.com"] 
(titled Re: [Discourse Meta] [bug] Discourse email messages are
incorrectly threaded) didn't work.

Reason:
Sorry, new users can only put 2 links in a post.
If you can correct the problem, please try again.

I’ll go put it in the forum where I can catch and revise…

cameron-simpson · July 22, 2022, 3:34am

Cameron, thanks for opening up this topic for discussion and providing
a lot of detail in your posts. I am responsible for this can of worms,
from these two commits:

3b13f1146b2a406238c50d6b45bc9aa721094f46

This looks fine. Does it save this id with the db record so that inbound
replies can be tied to the antecedent forum message?

Also, do you want me to vet that the suffix is syntactically legal for
RFC5322, in terms of permitted characters?
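
A minimal sketch of such a syntax check, assuming only the common `<dot-atom-text@dot-atom-text>` form of the RFC 5322 `msg-id` (the no-fold-literal and obsolete forms are deliberately ignored). The names `ATEXT`, `MSG_ID_RE` and `is_valid_msg_id` are made up for illustration.

```python
import re

# atext from RFC 5322 section 3.2.3: letters, digits and these specials.
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"
DOT_ATOM_TEXT = rf"{ATEXT}+(?:\.{ATEXT}+)*"
# Only the dot-atom form on both sides of the "@".
MSG_ID_RE = re.compile(rf"<{DOT_ATOM_TEXT}@{DOT_ATOM_TEXT}>")

def is_valid_msg_id(value: str) -> bool:
    """True if value looks like <id-left@id-right> built from dot-atoms."""
    return MSG_ID_RE.fullmatch(value) is not None
```

Under this check the `/` and `.` characters in ids such as `<topic/17208.dc83577b18fc3ecc438ed42a@discuss.python.org>` are accepted, since both are permitted in a dot-atom.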

82cb67e67b83c444f068fd6b3006d8396803454f

This second commit seems to address another problem we have seen: if a
post comes from an email, the outbound message-id sent to email users is
not the message-id of the source message from the author. This results
in two different messages from the point of view of a mail client, and
probably breaks replies made to the original as opposed to the
forum-sent copy. For example:

To: the forum
CC: one of the participants

The participant will (well, may) receive a copy from the forum and a
direct copy from the author, and these will be distinct messages at
their end because they will have different message-ids.

I was going to make a second bug report about this issue after sorting
the in-reply-to and references headers issue, which is far more
important.

We have been aware of some issues around threading for a little while now in email clients such as Thunderbird but it has not represented a large number of consumers of
email threading from Discourse so it’s been punted on, but now this is coming to light we need to spend some time reexamining the issue and working on a fix.

I and several others use mutt. I’m happy to do whatever is needed to aid
in debugging this and reviewing code. I’ve also been a mail sysadmin for
yonks in former lives.

[quote=“Cameron Simpson, post:1, topic:233499,
username:cameron-simpson”]
It is the first message. It should not have a References header, because there’s no message anywhere with that id.
[/quote]

Interestingly, we added this References header to the first sent email and every subsequent one at the time since it makes threading work correctly in Gmail,

I think a correct References header (absent in the first post, like
in-reply-to in replies) should also work. But GMail has a rather loose
relationship with mail standards at times. I have a gmail account; I can
do some debugging there too. And in principle we can use this very
discussion as the test bed, maybe.

but I agree it’s not ideal and is likely causing the threading issues
along with not using the original Message-ID in the subsequent email
In-Reply-To and References headers.

Please bear with me as I look through old discussions and the code and work through this.

No worries.

In the meantime, are you aware of other email clients that are being
used and are experiencing issues? For example I know that this is an
issue in Thunderbird, but what about any others? Thanks.

Definitely mutt. At least with mutt it is very easy to see the headers
and also to see the reply tree chain, which is often obscured in other
clients.

Mail threading is entirely defined by the Message-ID and In-Reply-To
headers. The References header started with USENET for followups, and
supported (there) multiple message-ids; the In-Reply-To supports just
one. It looks like References is now also present in RFC5322, and I’ll
check into its semantics.
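
To illustrate how a client threads purely on Message-ID and In-Reply-To, here is a small sketch. The `build_threads` helper and the short labels standing in for message-ids are hypothetical, and real clients are more elaborate.

```python
from collections import defaultdict

def build_threads(messages):
    """messages maps message-id -> In-Reply-To id (or None for no parent)."""
    children = defaultdict(list)
    roots = []
    for msg_id, parent in messages.items():
        if parent is None:
            roots.append(msg_id)
        else:
            children[parent].append(msg_id)
    # Parents cited by In-Reply-To that are absent from the mailbox.
    missing = [p for p in children if p not in messages]
    return roots, dict(children), missing

# As observed: every message replies to a "topic" id that was never sent.
observed = {"post1": "topic", "post2": "topic", "post3": "topic"}
# As proposed: the OP has no parent and replies cite real messages.
proposed = {"post1": None, "post2": "post1", "post3": "post2"}
```

With `observed`, all three posts hang flat under a parent that is missing from the mailbox; with `proposed`, `post1` is the root and the replies form a chain.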

martin · July 22, 2022, 4:41am

I am just collecting my thoughts in a big post about this for later today, thank you for the extra information so far!

martin · July 22, 2022, 6:24am

Okay this is kind of huge, please bear with me. First, thanks for another detailed reply and the offer of debugging / review, it is really helpful. I’ve actually been looking into this this morning and, surprisingly, the threading in a unified view works in Thunderbird for most cases, and I think the References header consistently pointing to the OP helps with that (for example the topic Reference in this chain which is always present is <topic/53@discoursehosted.martin-brennan.com>).

The case where the threading does not work as intended is:

A post is created within discourse and an email is sent out to those watching the topic then
Someone else replies to that post and an email is sent out to those watching the topic

In the case of the second email, it gets an incorrect In-Reply-To and References header since it generates one on this line (discourse/lib/email/sender.rb at 98bacbd2c6b9fe57167cd32af5eb4839b4a5d1f6 in discourse/discourse on GitHub) rather than using an existing one. It should be using the Message-ID for the email that was sent first. In the screenshot, this is where the messages following this pattern should be placed:

The answer is – it depends. If a post is created in Discourse from an inbound email, such as this one of yours, we use that post’s original inbound Message-ID when someone replies to it for the In-Reply-To and References headers as per:

https://github.com/discourse/discourse/blob/98bacbd2c6b9fe57167cd32af5eb4839b4a5d1f6/lib/email/sender.rb#L146-L147

    if referenced_post.incoming_email&.message_id.present?
      "<#{referenced_post.incoming_email.message_id}>"

Otherwise we are just using the topic OP reference and just generating a new reference, which obviously is what is causing all the issues. In all cases we generate a new Message-ID every time an outbound email is sent, which seems correct and on par with other mail clients.

I think I see what you mean, does it go like this:

cameron sends email to Discourse from mutt which gets Message-ID: 74398756983476983@mail.com
Discourse creates a post and stores the Message-ID against the post with an IncomingEmail record
johndoe is watching the topic, so they get sent an email from Discourse with a Message-ID: topic/222/44@discourse.com and no reference to the original Message-ID: 74398756983476983@mail.com

Does that sound correct, that we should just “pass on” that Message-ID to those watching the topic instead of generating our own since it’s already unique? What then happens in johndoe’s mail client if cameron also CC’d him on that original outbound message? This does sound like a separate issue so it would be good to open another bug topic for it.

I will set up a mutt client locally to see what you are also seeing, I have never tested this functionality in a text-based client (only Gmail and Thunderbird) so I am keen to see how it looks anyway.

My line of thinking to address these issues this morning was to dispense with the randomly generated suffixes generated when we send Message-ID headers in emails and instead change to a scheme where we use the user_id of both the sending and receiving user. The benefit of this is that there is no need to store the Message-ID anywhere (apart from when an inbound email creates a post) and so References and In-Reply-To headers will always be consistent. Let me give an example. Say we have these users:

martin - user_id 25
cameron - user_id 44
sam - user_id 78
bob - user_id 999

And then we have this topic, topic_id 233499, with posts starting from post_id 100 as the OP. The format would become topic/#{topic_id}/#{post_id}.s#{sender_user_id}r#{receiver_user_id}. The order of operations would look like this:

martin creates the OP

cameron is sent an email with these headers:
- Message-ID: topic/233499.s25r44@meta.discourse.org
- References: topic/233499@meta.discourse.org
sam is sent an email with these headers:
- Message-ID: topic/233499.s25r78@meta.discourse.org
- References: topic/233499@meta.discourse.org

cameron replies via email

discourse is sent an email with these headers from mutt:
- Message-ID: 43585349859734@test.com
- References: topic/233499@meta.discourse.org topic/233499.s25r44@meta.discourse.org
- In-Reply-To: topic/233499.s25r44@meta.discourse.org

discourse (as cameron, from the above email) creates post 101

sam is sent an email from discourse with these headers:
- Message-ID: topic/233499/101.s44r78@meta.discourse.org
- References: 43585349859734@test.com topic/233499@meta.discourse.org
- In-Reply-To: 43585349859734@test.com

sam replies via email to cameron

discourse is sent an email with these headers from gmail:
- Message-ID: 5346564746574@gmail.com
- References: topic/233499/101.s44r78@meta.discourse.org topic/233499@meta.discourse.org
- In-Reply-To: topic/233499/101.s44r78@meta.discourse.org

discourse (as sam, from the above email) creates post 102

cameron is sent an email from discourse with these headers:
- Message-ID: topic/233499/102.s78r44@meta.discourse.org
- References: 5346564746574@gmail.com topic/233499@meta.discourse.org
- In-Reply-To: 5346564746574@gmail.com

bob creates post 103 in the topic, not in reply to anyone (note that the References here includes the Message-ID sent to both users for the OP email)

cameron is sent an email with these headers:
- Message-ID: topic/233499/103.s999r44@meta.discourse.org
- References: topic/233499@meta.discourse.org topic/233499.s25r44@meta.discourse.org
sam is sent an email with these headers:
- Message-ID: topic/233499/103.s999r78@meta.discourse.org
- References: topic/233499@meta.discourse.org topic/233499.s25r78@meta.discourse.org

cameron replies via email

discourse is sent an email with these headers from mutt:
- Message-ID: 6759850728742572@test.com
- References: topic/233499@meta.discourse.org topic/233499/103.s999r44@meta.discourse.org
- In-Reply-To: topic/233499/103.s999r44@meta.discourse.org

cameron’s inbox

martin - topic OP
- SENT → to: discourse, RE: topic OP
  - sam - reply to second post
- bob - reply in topic not to any particular post
  - SENT → to: discourse, RE: bob’s post

sam’s inbox

martin - topic OP
- cameron - second post
  - SENT → to: discourse, RE: second post
- bob - reply in topic not to any particular post

I think this is correct, can you just take a look over what I have written in these headers and verify that is what you would expect from this scenario? The only thing I am a little unsure about is whether I have covered all the References, and of course I would be testing this on a live set of emails in a dev branch before rolling it out. I have not tested anything in mutt yet either.

As a side note, I also looked into what GitHub do with their notification emails, and noticed they do a similar thing where they have an ever-present Reference (discourse/discourse/pull/252@github.com) that is used in all the emails related to that “topic” which in this case is a GitHub pull request:

References: <discourse/discourse/pull/252@github.com> <discourse/discourse/pull/252/issue_event/7042100517@github.com>
In-Reply-To: <discourse/discourse/pull/252/issue_event/7042100517@github.com>

cameron-simpson · July 23, 2022, 12:44am

By Martin Brennan via Discourse Meta at 22Jul2022 06:34:

Okay this is kind of huge, please bear with me. First, thanks for
another detailed reply and the offer of debugging / review, it is
really helpful. I’ve actually been looking into this this morning
and, surprisingly, the threading in a unified view works in Thunderbird
for most cases, and I think the References header consistently
pointing to the OP helps with that (for example the topic Reference
in this chain which is always present is
<topic/53@discoursehosted.martin-brennan.com>).

I’ve just reread RFC5322 section 3.6.4 closely. It has moved on from
earlier versions (822 and 2822), and has merged the email In-Reply-To
headers, USENET References headers and the modern practice of a reply
citing more than one previous message.

The short summary:

- The Message-ID is a single persistent identifier for a message
- The In-Reply-To contains all the message-ids of which this message
  is a direct reply, so if I reply to a pair of messages it will have
  those 2 message-ids
- The References is a reply chain of antecedent message-ids from the
  OP to the preceding message. So indeed it should always start with
  the OP message-id.

So for a discussion like this, pretending that labels are message-ids:

OP
  -> reply1
    -> reply2 ---+
  -> reply3      |
    -> reply4    |
      -> reply5 <+

The reply5 would have:

message-id=reply5
in-reply-to=“reply2 reply4”
references=“OP reply3 reply4”

It is also legal to include “reply1 reply2” in the references (the
other chain to reply5) but the RFC explicitly recommends against that
because some clients expect the references to be a single linear chain
of replies, not some flattened digraph.

So my recommendation for constructing the references is to use the
references of the “primary” antecedent message with the primary
antecedent message’s message-id appended. That way you always get a
linear chain in the correct order.
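
The rule above can be sketched as follows, reusing the OP/reply1..reply5 labels from the diagram. The `Msg` class is hypothetical, and treating the last listed parent as the “primary” antecedent is an assumption made only for this example.

```python
from dataclasses import dataclass, field

@dataclass
class Msg:
    message_id: str
    # Direct antecedents; by assumption the last one is the "primary".
    parents: list = field(default_factory=list)

    @property
    def in_reply_to(self):
        # Every message this one directly replies to.
        return [p.message_id for p in self.parents]

    @property
    def references(self):
        if not self.parents:
            return []  # the OP carries no References at all
        primary = self.parents[-1]
        # Primary antecedent's chain with its own id appended.
        return primary.references + [primary.message_id]

op = Msg("OP")
reply1 = Msg("reply1", [op])
reply2 = Msg("reply2", [reply1])
reply3 = Msg("reply3", [op])
reply4 = Msg("reply4", [reply3])
reply5 = Msg("reply5", [reply2, reply4])
```

Here `reply5.in_reply_to` lists both `reply2` and `reply4`, while `reply5.references` stays a single linear chain: `OP`, `reply3`, `reply4`.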

[Screenshot: the thread as displayed in Thunderbird]

Interestingly there seems to be some threading there.

But notice: the top post has a little “is a reply” arrow. Even though it
is post 1. I expect that is because of the “topic” references entry,
which makes TB think there was an earlier message (which of course there
was not).

In mutt-land we see almost no threading at all:

23Jul2022 06:24 Olha via Discus - ┌>[Py] [Users] I need an advise  discuss-users 5.7K
22Jul2022 17:12 Paul Jurczak vi - ├>[Py] [Users] I need an advise  discuss-users 5.5K
22Jul2022 13:21 Rob via Discuss - ├>[Py] [Users] I need an advise  discuss-users 6.8K
22Jul2022 12:53 vasi-h via Disc - ├>[Py] [Users] I need an advise  discuss-users 5.5K
22Jul2022 11:38 Cameron Simpson - ├>[Py] [Users] I need an advise  discuss-users  14K
22Jul2022 10:27 Rob via Discuss - ├>[Py] [Users] I need an advise  discuss-users 6.6K
22Jul2022 06:14 vasi-h via Disc r ┴>[Py] [Users] I need an advise  discuss-users 6.5K

which is because every message’s In-Reply-To points directly at the
fictitious “topic” message-id. Mutt probably ignores the References
because it is a mail reader, and References originates in USENET news.
Maybe Thunderbird is using the references or augmenting the in-reply-to
with references information.

You only need to consult one of In-Reply-To or References to do
threading; the former comes from email and the latter from USENET.
You’re supporting both (which is great!) so we need to make them
consistent.

(Aside: there’s also discussion about USENET mirroring, because several
python people consume the lists via a USENET interface. Again, a
separate topic.)

[…]

[quote=“Cameron Simpson, post:8, topic:233499,
username:cameron-simpson”]
This looks fine. Does it save this id with the db record so that inbound
replies can be tied to the antecedent forum message?
[/quote]

The answer is – it depends. If a post is created in Discourse from an inbound email, such as this one of yours, we use that post’s original inbound Message-ID when someone replies to it for the In-Reply-To and References headers as per:

discourse/lib/email/sender.rb at 98bacbd2c6b9fe57167cd32af5eb4839b4a5d1f6 · discourse/discourse · GitHub

Otherwise we are just using the topic OP reference and just generating a new reference, which obviously is what is causing all the issues. In all cases we generate a new Message-ID every time an outbound email is sent, which seems correct and on par with other mail clients.

Alas, not quite. If you’re the origin of the message (i.e. authored in
Discourse), generating the message-id is fine. If there’s no message-id
(illegal) generating one is standard practice (usually by MTAs). But if
you’re passing a message on (authored in email), the existing message-id
should be preserved.

To my mind you need to be doing 3 things:

1. having a stable message-id and not replacing the message-id from an
   inbound message
2. generating correct In-Reply-To, which is easily computed from the
   immediate antecedent message(s) i.e. antecedent(s)-Message-ID
3. generating correct References, which is easily computed as
   antecedent-References + antecedent-Message-ID

For point 1, looking at the code you cite, you probably want the email
message id to be (Pythonish syntax, sorry):

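def message_id(post):
    # Prefer the Message-ID of the inbound email that created the post.
    incoming = getattr(post, "incoming_email", None)
    return (incoming and incoming.message_id) or discourse_message_id(post)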

i.e. to be the post’s email message-id if it originated from email,
otherwise the Discourse message-id using something like the algorithm
you outline later in this message: anything (a) stable and (b)
syntactically valid.

Then computing the In-Reply-To and References fields is simple
mechanical stuff as in points 2 and 3.

Cameron Simpson:

This second commit seems to address another problem we have seen: if a
post comes from an email, the outbound message-id sent to email users is
not the message-id of the source message from the author. This results
in two different messages from the point of view of a mail client, and
probably breaks replies made to the original as opposed to the
forum-sent copy.

I think I see what you mean, does it go like this:

cameron sends email to Discourse from mutt which gets Message-ID: 74398756983476983@mail.com

Discourse creates a post and stores the Message-ID against the post with an IncomingEmail record

Correct.

johndoe is watching the topic, so they get sent an email from Discourse with a Message-ID: topic/222/44@discourse.com and no reference to the original Message-ID: 74398756983476983@mail.com

No. You really want to pass through IncomingEmail.message_id as the
Message-ID in the email to johndoe. It’s the same message.

Does that sound correct, that we should just “pass on” that Message-ID to those watching the topic instead of generating our own since it’s already unique? What then happens in johndoe’s mail client if cameron also CC’d him on that original outbound message? This does sound like a separate issue so it would be good to open another bug topic for it.

By passing it on, the original message (cameron->cc:johndoe) and the
Discourse forwarded message (cameron->Discourse->johndoe) have the same
message-id and the same message contents. The receiving mail system
stores both. The mail reader sees both, and either presents both or
keeps just one (this is a policy decision of the mail reader - keeping
just one is common). Because they’re the same message, in general it
does not matter which is kept.

The same holds if we ignore Discourse and consider an ordinary mailing
list: a recipient may get one copy of a message via the list and another
via direct email. They’re the same message, with the same message-id.
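
A small sketch of the “keep just one” policy described above, assuming a mail reader that deduplicates on Message-ID. The `dedupe_by_message_id` helper, the route labels and the rewritten id are hypothetical.

```python
def dedupe_by_message_id(messages):
    """Keep the first copy seen of each Message-ID.

    messages is an iterable of (message_id, route) pairs.
    """
    kept = {}
    for message_id, route in messages:
        kept.setdefault(message_id, route)
    return kept

# Same id on both copies: the reader can collapse them into one message.
passed_through = [
    ("<74398756983476983@mail.com>", "direct CC from cameron"),
    ("<74398756983476983@mail.com>", "forwarded by Discourse"),
]
# Id rewritten by the forum: the reader sees two unrelated messages.
rewritten = [
    ("<74398756983476983@mail.com>", "direct CC from cameron"),
    ("<topic/222/44@discourse.com>", "forwarded by Discourse"),
]
```

Only the pass-through case collapses to a single entry; with a rewritten id johndoe ends up with two distinct messages.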

Cameron Simpson:

I and several others use mutt.

I will set up a mutt client locally to see what you are also seeing, I have never tested this functionality in a text-based client (only Gmail and Thunderbird) so I am keen to see how it looks anyway.

Happy to help with settings. For threaded view you need to set the
sorting to threaded. Mutt is very configurable.

My line of thinking to address these issues this morning was to dispense
with the randomly generated suffixes generated when we send
Message-ID headers in emails and instead change to a scheme where we
use the user_id of both the sending and receiving user. The benefit
of this is that there is no need to store the Message-ID anywhere
(apart from when an inbound email creates a post) and so References
and In-Reply-To headers will always be consistent.

Yes, that is much better. Noting that the inbound email message-id
should override the Discourse derived message-id for the outbound email.

(Most mail systems use random strings because there’s no surrounding
context such as the discourse topic message structure - messages are
considered alone; but the only real requirement is persistent
uniqueness.)

Let me give an example. Say we have these users:

martin - user_id 25

cameron - user_id 44

sam - user_id 78

bob - user_id 999

And then we have this topic, topic_id 233499, with posts starting from post_id 100 as the OP. The format would become topic/#{topic_id}/#{post_id}.s#{sender_user_id}r#{receiver_user_id}.

The order of operations would look like this:

martin creates the OP

cameron is sent an email with these headers:

Message-ID: topic/233499.s25r44@meta.discourse.org

References: topic/233499@meta.discourse.org

sam is sent an email with these headers:

Message-ID: topic/233499.s25r78@meta.discourse.org

References: topic/233499@meta.discourse.org

There should not be a References header in the OP. It isn’t
needed for threading and effectively pretends there’s some “post 0”
which doesn’t exist. It means every OP (a) looks like a reply, which it
is not and (b) looks like the thing to which it is a reply is missing
from the reader’s mailbox.

This makes different message-ids for each outbound copy of the OP.
That’s bad. They need to be the same. Suppose sam CCs cameron
directly in a reply. The In-Reply-To will cite a message-id cameron
has never received.

You can just drop the sender_user_id and receiver_user_id from the
message-id field and get a single unique id which every receiver sees.

The uniqueness constraint is the post itself, not the individual
email-level “message” object.
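
To make the difference concrete, here is a sketch comparing the per-recipient scheme proposed earlier with a per-post id. Both helper functions are hypothetical, and the id format is only an illustration built from the topic_id and post_id used in the example.

```python
def per_recipient_id(topic_id, post_id, sender, receiver, host):
    # Scheme proposed earlier in the thread: differs for each recipient.
    return f"<topic/{topic_id}/{post_id}.s{sender}r{receiver}@{host}>"

def per_post_id(topic_id, post_id, host):
    # One id per post, identical for every recipient.
    return f"<topic/{topic_id}/{post_id}@{host}>"

HOST = "meta.discourse.org"
# sam (78) replies to the OP (post 100 by martin, 25) and CCs cameron (44).
sam_cites = per_recipient_id(233499, 100, 25, 78, HOST)
cameron_has = per_recipient_id(233499, 100, 25, 44, HOST)
shared = per_post_id(233499, 100, HOST)
```

With the per-recipient scheme `sam_cites` and `cameron_has` differ, so sam's In-Reply-To names a message cameron never received; with the per-post id both see the same value.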

Re the References, the OP should not have one. TB and everything else
will be fine. If they’re threading using References instead of
In-Reply-To, the References in the reply messages are enough.

Here’s the start of a mailing list discussion thread in Mutt:

16Jul2022 01:09 Rob Boehne      - │├>[Python-Dev] Re: [SPAM] Re: Swit python-dev 9.2K
16Jul2022 01:33 Peter Wang      - │├>                                 python-dev 3.0K
16Jul2022 00:24 Skip Montanaro  - ├>[Python-Dev] Re: Switching to Dis python-dev 4.2K
16Jul2022 04:49 Erlend Egeberg  - ├>[Python-Dev] Re: Switching to Dis python-dev  10K
16Jul2022 04:20 Mariatta        - ├>[Python-Dev] Re: Switching to Dis python-dev  10K
15Jul2022 21:18 Petr Viktorin   - [Python-Dev] Switching to Discourse python-dev 4.2K

Ignore that I sort my email newest-on-top. See that there’s no arrow on
the initial post (at the bottom). That message has no References and
no In-Reply-To. All the others have In-Reply-To (and possibly
References, but this is an email mailing list so not necessarily; as I
mentioned before they’re complementary.)

If I repeat my Discourse example from earlier:

23Jul2022 06:24 Olha via Discus - ┌>[Py] [Users] I need an advise  discuss-users 5.7K
22Jul2022 17:12 Paul Jurczak vi - ├>[Py] [Users] I need an advise  discuss-users 5.5K
22Jul2022 13:21 Rob via Discuss - ├>[Py] [Users] I need an advise  discuss-users 6.8K
22Jul2022 12:53 vasi-h via Disc - ├>[Py] [Users] I need an advise  discuss-users 5.5K
22Jul2022 11:38 Cameron Simpson - ├>[Py] [Users] I need an advise  discuss-users  14K
22Jul2022 10:27 Rob via Discuss - ├>[Py] [Users] I need an advise  discuss-users 6.6K
22Jul2022 06:14 vasi-h via Disc r ┴>[Py] [Users] I need an advise  discuss-users 6.5K

See they all have a leading arrow? That is because the mail client
believes they are all replies to a common (and missing) root message,
which is because of the “topic” message-id in the References header.
Whereas post 1 is actually the bottom message displayed above.

Summary:

- your plan is good, provided you drop the sender and receiver from the
  message-id - they’re unnecessary and in fact the receiver will cause
  trouble (the sender is just redundant).
- drop the “topic” pseudo-message-id from the References - it misleads
  email clients (including TB, even if it isn’t visually evident)

cameron replies via email

discourse is sent an email with these headers from mutt:

Message-ID: 43585349859734@test.com

References: topic/233499@meta.discourse.org topic/233499.s25r44@meta.discourse.org

In-Reply-To: topic/233499.s25r44@meta.discourse.org

Yes, again with the caveat that there should not be a “topic” reference.
As expected, there is a reference to the OP message-id. Though it should
be the same message-id that sam sees for the OP.

discourse (as cameron, from the above email) creates post 101

sam is sent an email from discourse with these headers:

Message-ID: topic/233499/101.s44r78@meta.discourse.org

References: 43585349859734@test.com topic/233499@meta.discourse.org

In-Reply-To: 43585349859734@test.com

And here it goes wrong. The Message-ID should be
43585349859734@test.com from the .incoming_email.message_id field.
(Well, in my mind this is post.message_id(), which returns
post.incoming_email.message_id for an email generated post and your
Discourse generated one otherwise).

Consider: I compose and send my reply with message-id
43585349859734@test.com. For continuity reasons, I keep a copy of that
in my local folder, where it shows as a reply to the OP. Ideally
Discourse also sends me a copy of my own post (this is a policy setting
on many mailing lists), so I get Discourse’s version also. That should
have the same message-id, because it is the same message, just via a
different route.

Discourse’s message is not “in reply to” my message. It is my
message, just forwarded.

This effect cascades through your following examples. The actual process
should be simpler than you’ve made it.

Think of it this way. If I reply to a post from email, it effectively is
like me emailing sam (and the others) via Discourse. Discourse
forwards my message to the email-receiving subscribers, and
“incidentally” keeps a copy on the forum.

As a side note, I also looked into what GitHub do with their
notification emails, and noticed they do a similar thing where they
have an ever-present Reference
(discourse/discourse/pull/252@github.com) that is used in all the
emails related to that “topic” which in this case is a GitHub pull
request:
References: <discourse/discourse/pull/252@github.com> <discourse/discourse/pull/252/issue_event/7042100517@github.com>
In-Reply-To: <discourse/discourse/pull/252/issue_event/7042100517@github.com>

Hoo, github. What a disaster their issue emails are.

However, in their scenario, the PR is the OP. So a reference directly
to the pull is sane. You could use the “topic” message-id for post 1,
provided you didn’t also use the “topic/1” id as well. But there seems
little point - it is extra effort to special case post 1 - I’d just use
“topic/1” myself.

To add some complication. As I understand it, an admin can move a post
or topic. Doesn’t that break the “generate the message-id” scheme,
particularly if they move just a post? I’m somewhat of the opinion that
every post should have a _message_id field, filled in from the
incoming message (from email) or generated (posting via Discourse). Then
it is persistent and stable and robust against any shuffling of posts or
changes of algorithm.

Finally, there’s a small security consideration: you should ignore the
inbound email message-id (and potentially bounce the message) if it
claims the message-id of an existing post. Since as an author, I can put
anything I like in that header I’d go with just dropping the
message-id - accept the post, but don’t let it lie about being some
other post - give your copy the Discourse-generated id and then proceed
as normal.
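
A minimal sketch of that rule, assuming the forum keeps a set of the message-ids already attached to posts. `choose_message_id` and its arguments are hypothetical names.

```python
def choose_message_id(inbound_id, known_ids, generated_id):
    """Pick the id to store with a new post.

    Use the inbound Message-ID unless it is missing or already belongs
    to an existing post; in that case fall back to the generated id.
    """
    if inbound_id and inbound_id not in known_ids:
        return inbound_id
    return generated_id

# Ids already stored against existing posts (hypothetical values).
known = {"<43585349859734@test.com>"}
```

An honest inbound id is kept as-is, while a message claiming the id of an existing post is still accepted but stored under the generated id.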

martin · July 25, 2022, 12:18am

Wow thanks again for this wonderfully in-depth response. It will probably take a little while for me to process this and turn into actionable items, so please bear with us (as well as this I have some high priority other internal projects I am currently working on). I think with this information we will be able to make our threading systems a lot more robust and to-spec. I may have more questions as I go through your post, thanks Cameron.

cameron-simpson · July 25, 2022, 2:59am

By Martin Brennan via Discourse Meta at 25Jul2022 00:28:

Wow thanks again for this wonderfully in-depth response. It will
probably take a little while for me to process this and turn into
actionable items, so please bear with us (as well as this I have some
high priority other internal projects I am currently working on). I
think with this information we will be able to make our threading
systems a lot more robust and to-spec. I may have more questions as I
go through your post, thanks Cameron.

Sure. Cheers, Cameron Simpson

cameron-simpson · July 25, 2022, 3:02am

BTW, I notice this followup post from you has these headers:

Message-ID: <topic/233499/1137586.d14eea2849d76c355ec214fb@meta.discourse.org>
In-Reply-To: <YttEVzlTh/ymDSPT@cskk.homeip.net>
References: <topic/233499@meta.discourse.org>
      <YttEVzlTh/ymDSPT@cskk.homeip.net>

i.e. it has preserved my original email message-id. So the In-Reply-To
is correct, and the References at least has my email message-id in it.

That wasn’t what we were observing over at discuss.python.org.

Cheers,
Cameron Simpson

martin · July 26, 2022, 12:17am

Ah that is an interesting observation, I hadn’t noticed the little arrow.

This is also super interesting. I believe (without examining the source) Thunderbird does do that, and likely the Gmail UI as well since it does the same thing.

We do seem to be doing this but I guess not consistently? Basically we need to make sure that:

TODO #1 - If a post has an associated IncomingEmail record, we always use that Message-ID when sending email.

TODO #2 - Do not use a References header when sending out emails related to the OP of the topic. @cameron-simpson one question though – if the OP was created via an inbound email, would we use that Message-ID in References for the OP or still exclude it?

This is interesting, I thought every recipient of the email had to have a unique Message-ID? In fact I believe this is why we went down the path of adding uniqueness to each recipient’s Message-ID, to avoid spam behaviours, looking back on our internal topic. Perhaps @supermathie, who is on our infra team and was doing a bunch of testing with email earlier in the year, could weigh in here too?

What you are saying is that it’s more that the post should be the thing determining a single Message-ID for all recipients. So perhaps we just generate one for each post that generates an email? Then we could also move the IncomingEmail.message_id to here as well. Tentatively, the change we would need to make is:

TODO #3 - Add an outbound_message_id to the Post table. Generate it once when an email is first sent in relation to the post. Use it for subsequent References and In-Reply-To headers. Set its value when a post is created from an IncomingEmail. Format should be topic/:topic_id/:post_id/:random_alphanumeric_string@host e.g. topic/233499/33545/gvy8475y7c45y87554c@meta.discourse.org

After this change my first example would become this:

martin creates the OP

cameron is sent an email with these headers:
- Message-ID: topic/233499/33545/gvy8475y7c45y87554c@meta.discourse.org
sam is sent an email with these headers:
- Message-ID: topic/233499/33545/gvy8475y7c45y87554c@meta.discourse.org

With the consideration also that the OP does not have special handling, it will no longer be in the format topic/:topic_id@hostname.

TODO #4 - Ensure that correct In-Reply-To and References headers are generated based on PostReply records and the new outbound_message_id column on the Post table

I think we have some consideration for this, I will double-check.

It definitely seems that way

Can you confirm the TODOs here sound reasonable Cameron? It really doesn’t seem like much now that I look at it. I also wonder, when I get to this work would you be open to joining a testing Discourse instance with me that will have the WIP changes deployed to it so we can email back and forth and test that things are working correctly? I will of course do testing of my own before I involve you.

If not, that’s fine too – I have Thunderbird and will be setting up mutt and I can test it all out there

sam · July 26, 2022, 2:24am

@cameron-simpson one thing I did want to clarify here is “message_id” scoping.

The thing that kicked off this whole dance was a strong suspicion by @supermathie that our non unique message_ids were causing issues.

Discourse generates unique emails per user for every email it sends. So for example say 2 users are watching this topic:

User 1 gets payload 1 with a distinct unsubscribe link directed at user 1
User 2 gets payload 2 with a distinct unsubscribe link directed at user 2

If in both cases our message id was, say, discourse_topic_100/23 (topic_id/post_number), then we would be telling MTAs out there that discourse_topic_100/23 can be 2 distinct payloads. The hypothesis is that they treat this as a spam signal:

Hey Discourse … you just sent two emails called discourse_topic_100/23, what is up?

Since Discourse is in control of all email transport and emails are not added to a BCC list or CC list like traditional mailing lists, we can afford to have clean per user unsubscribe links.

What are your thoughts here? What about the simple change of using e.g. discourse_topic_100/23/7333 (topic_id, post_number, user_id) as the unique identifier for mail? It is certainly a unique payload and we can easily refer back to it when generating mails for a user.

cameron-simpson · July 26, 2022, 2:46am

By Martin Brennan via Discourse Meta at 26Jul2022 00:27:

[quote=“Cameron Simpson, post:11, topic:233499,
username:cameron-simpson”]
Mutt probably ignores the References
because it is a mail reader, and References originates in USENET news.
Maybe Thunderbird is using the references or augmenting the in-reply-to
with references information.

You only need to consult one of In-Reply-To or References to do
threading; the former comes from email and the latter from USENET.
You’re supporting both (which is great!) so we need to make them
consistent.
[/quote]

This is also super interesting. I believe (without examining the source) Thunderbird does do that, and likely the Gmail UI as well since it does the same thing.

I think mutt will use both, but probably just In-Reply-To if present,
falling back to References. I’d need to check the source.

With References you do at least know the full chain to the OP; with
In-Reply-To you more or less need the antecedent messages around to
stitch things together. For mailing lists I usually keep the whole
thread locally until it’s done anyway, and I expect that is common.
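
As a rough illustration of that lookup order, here is a small Ruby sketch. It is not mutt's or Thunderbird's actual algorithm: `Msg`, `parent_id` and `thread_children` are invented for the example, and ids are bare strings without angle brackets.

```ruby
# Hypothetical minimal message record; in_reply_to and references are arrays of ids.
Msg = Struct.new(:id, :in_reply_to, :references)

# Choose a parent id: In-Reply-To if present, otherwise the last References entry.
def parent_id(msg)
  return msg.in_reply_to.first unless msg.in_reply_to.empty?
  msg.references.last
end

# Map each parent id to the ids of its direct replies.
# A message whose parent is not in the mailbox is filed under nil, i.e. as a root.
def thread_children(messages)
  known = messages.map(&:id)
  children = Hash.new { |hash, key| hash[key] = [] }
  messages.each do |m|
    pid = parent_id(m)
    pid = nil unless known.include?(pid)
    children[pid] << m.id
  end
  children
end
```

Feeding it headers that all point at a topic-level id which no message carries puts every message under `nil`, which is the flat display email users were seeing. Headers that point at real Message-IDs produce a proper parent-to-reply mapping.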

Cameron Simpson:

But if
you’re passing a message on (authored in email), the existing message-id
should be preserved.

Cameron Simpson:

i.e. it has preserved my original email message-id. So the In-Reply-To
is correct, and the References at least has my email message-id in it.

Cameron Simpson:

No. You really want to pass through IncomingEmail.message_id as the
Message-ID in the email to johndoe. It’s the same message.

We do seem to be doing this but I guess not consistently? Basically we need to make sure that:

TODO #1 - If a post has an associated IncomingEmail record, we always use that Message-ID when sending email.

Yes. This is why I was thinking it might be sanest to have an explicit
field for the message-id, and to fill it in once. Then use that from
then on always, regardless of any changes to the process in which the
message-id is manufactured in the code later.

Cameron Simpson:

There should not be a References header in the OP. It isn’t
needed for threading and effectively pretends there’s some “post 0”
which doesn’t exist. It means every OP (a) looks like a reply, which it
is not and (b) looks like the thing to which it is a reply is missing
from the reader’s mailbox.

Cameron Simpson:

drop the “topic” pseudo-message-id from the References - it misleads
email clients (including TB, even if it isn’t visually evident)

TODO #2 - Do not use a References header when sending out emails related to the OP of the topic.

Yes. The OP has no antecedent, so there’s no References or
In-Reply-To.

@cameron-simpson one question though – if the OP was created via an
inbound email, would we use that Message-ID in References for the
OP or still exclude it?

Still exclude. But use it as the persistent message-id for the OP.

So a message authored by email (OP or reply) gets its message-id from
the email. One authored on the web gets one when the user presses
Submit, generated by Discourse. From then on, that’s the message-id,
however created.

[quote=“Cameron Simpson, post:11, topic:233499,
username:cameron-simpson”]
You can just drop the sender_user_id and receiver_user_id from the
message-id field and get a single unique id which every receiver sees.

The uniqueness constraint is the post itself, not the individual
email-level “message” object.
[/quote]

Cameron Simpson:

your plan is good, provided you drop the sender and receiver from the
message-id - they’re unnecessary and in fact the receiver will cause
trouble (the sender is just redundant).

Cameron Simpson:

To add some complication. As I understand it, an admin can move a post
or topic. Doesn’t that break the “generate the message-id” scheme,
particularly if they move just a post? I’m somewhat of the opinion that
every post should have a _message_id field, filled in from the
incoming message (from email) or generated (posting via Discourse). Then
it is persistent and stable and robust against any shuffling of posts or
changes of algorithm.

This is interesting, I thought every recipient of the email had to have a unique Message-ID?

No. The message-id identifies the “message”. Not the individual copy. I
might post to the forum and CC someone directly. If that someone gets a
copy direct from me and also via the forum, they should have the same
message-id.

In fact I believe this is why we went down the path of adding
uniqueness to each recipient’s Message-ID, to avoid spam behaviours,
looking back on our internal topic. Perhaps @supermathie, who is on
our infra team and was doing a bunch of testing with email earlier in
the year, could weigh in here too?

Maybe. But on the face of it, threading is indeed broken. Certainly
sending the same message to many people should have the same message-id,
and generally, as a forwarder (email->discourse->email-recipients)
discourse should not be modifying the message-ids.

What you are saying is that it’s more that the post should be the thing determining a single Message-ID for all recipients. So perhaps we just generate one for each post that generates an email?

Every post should have one stable unique message-id for use in the email
side. If the post originated from an email, that original message-id
should be used. Otherwise (via the web interface) Discourse should be
generating a message-id and storing it with the post.

Then we could also move the IncomingEmail.message_id to here as well.

Sure. Having a distinct set of fields (message-id seems enough)
containing the email-side state should do it.

Tentatively, the change we would need to make is:

TODO #3 - Add an outbound_message_id to the Post table. Generate
it once when an email is first sent in relation to the post.

If you got the post from an email, you should be using that, not
generating a new one.

Use it for subsequent References and In-Reply-To headers. Set its
value when a post is created from an IncomingEmail.

Yes. To the message-id from the email.

Format should be
topic/:topic_id/:post_id/:random_alphanumeric_string@host e.g.
topic/233499/33545/gvy8475y7c45y87554c@meta.discourse.org

For ones you generate yourselves, this looks good to me.

After this change my first example would become this:

martin creates the OP

cameron is sent an email with these headers:

Message-ID: topic/233499/33545/gvy8475y7c45y87554c@meta.discourse.org

sam is sent an email with these headers:

Message-ID: topic/233499/33545/gvy8475y7c45y87554c@meta.discourse.org

Yes.

But note: the message-id only needs to be stable and unique. If the
topic/:topic_id/:post_id@host is stable and will never be regenerated,
that will do. But if you’re concerned about that (eg db restores or
migrations or imports bringing those same numbers) then the random
string will make it robust against collision.

Note that the message-id left part is dot-atom-text (defined in RFC
5322, section 3.2.3), which is alphas and digits and a limited set of
punctuation characters (which includes “/”).

Um, your headers. They should have:

Message-ID: <topic/233499/33545/gvy8475y7c45y87554c@meta.discourse.org>

Note the angle brackets. The message-id is formally the bit between the
angle brackets, and the angle brackets are mandatory. The syntax is in
RFC 5322, section 3.6.4.
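
A rough way to check both points (dot-atom-text on each side of the “@”, plus the mandatory angle brackets) is a regular expression. This is only a sketch in Ruby: the constant and method names are made up for illustration, and it deliberately ignores the obsolete forms and the bracketed domain-literal form that the RFC also allows on the right-hand side.

```ruby
# atext characters from RFC 5322, written for use inside a character class.
# Single quotes keep Ruby from treating "#$" as interpolation.
ATEXT = 'A-Za-z0-9!#$%&\'*+\-/=?^_`{|}~'

# dot-atom-text: one or more atext runs separated by single dots.
DOT_ATOM_TEXT = "[#{ATEXT}]+(?:\\.[#{ATEXT}]+)*"

# A whole Message-ID header value: "<" left "@" right ">".
MESSAGE_ID_HEADER = Regexp.new("\\A<#{DOT_ATOM_TEXT}@#{DOT_ATOM_TEXT}>\\z")

def valid_message_id_header?(value)
  MESSAGE_ID_HEADER.match?(value)
end

valid_message_id_header?("<topic/233499/33545/gvy8475y7c45y87554c@meta.discourse.org>") # accepted
valid_message_id_header?("topic/233499/33545/gvy8475y7c45y87554c@meta.discourse.org")   # rejected: no angle brackets
```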

Cameron Simpson:

But there seems
little point - it is extra effort to special case post 1 - I’d just use
“topic/1” myself.

With the consideration also that the OP does not have special handling, it will no longer be in the format topic/:topic_id@hostname.

Sounds good.

Cameron Simpson:

generating correct In-Reply-To, which is easily computed from the
immediate antecedent message(s) i.e. antecedent(s)-Message-ID

generating correct References, which is easily computed as
antecedent-References + antecedent-Message-ID

TODO #4 - Ensure that correct In-Reply-To and References headers are generated based on PostReply records and the new outbound_message_id column on the Post table

Thanks.
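
The two rules quoted above can be sketched in a few lines of Ruby. This assumes each post has at most one antecedent and already carries a stored, bare message-id. `ThreadedPost`, `references_for` and `threading_headers` are hypothetical names, not Discourse's PostReply model.

```ruby
# Hypothetical post record: parent is the post being replied to, or nil for the OP.
ThreadedPost = Struct.new(:message_id, :parent)

# antecedent-References + antecedent-Message-ID, i.e. the chain from the OP
# down to the immediate parent, oldest first.
def references_for(post)
  chain = []
  node = post.parent
  while node
    chain.unshift(node.message_id)
    node = node.parent
  end
  chain
end

# Header values for an outgoing email about this post.
def threading_headers(post)
  return {} if post.parent.nil? # the OP gets neither header

  {
    "In-Reply-To" => "<#{post.parent.message_id}>",
    "References" => references_for(post).map { |id| "<#{id}>" }.join(" "),
  }
end
```

Every id that appears in these headers is the Message-ID of a message that was actually sent, so a client holding the thread can always find the parent.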

Cameron Simpson:

Finally, there’s a small security consideration: you should ignore the
inbound email message-id (and potentially bounce the message) if it
claims the message-id of an existing post. Since as an author, I can put
anything I like in that header I’d go with just dropping the
message-id - accept the post, but don’t let it lie about being some
other post - give your copy the Discourse-generated id and then proceed
as normal.

I think we have some consideration for this, I will double-check.

+1

Cameron Simpson:

This effect cascades through your following examples. The actual process
should be simpler than you’ve made it.

It definitely seems that way

Can you confirm the TODOs here sound reasonable Cameron?

They seem correct to me.

It really doesn’t seem like much now that I look at it. I also wonder,
when I get to this work would you be open to joining a testing
Discourse instance with me that will have the WIP changes deployed to
it so we can email back and forth and test that things are working
correctly? I will of course do testing of my own before I involve you.

Certainly. Happy to help in whatever way.

If not, that’s fine too – I have Thunderbird and will be setting up
mutt and I can test it all out there

I can help you with mutt if you want it too.

cameron-simpson · July 26, 2022, 3:50am

I think you can still send distinct messages with the same message-id, even with small differences like this.

Ordinary mailing lists do this all the time to a greater or lesser degree. At the least some header mucking around always happens. But the message body is also sometimes modified. An egregious example is python-list, which discards non-text attachments. The message goes through with the same message-id though. And almost all lists put a rider at the bottom with, say, a link to the list admin page or an unsubscribe link. That will not have been on the message when it arrived.

And there have been long discussions on content signing which revolve around what should be covered by a signature.

So I’d be entirely ok with you adding your recipient-specific unsub link and preserving the original message-id. The benefits far far outweigh the loss of threading if you gave each message copy an individual message-id.

Again, consider the email user. I can reply to a discourse message and add a CC to an interested outside person. Maybe they get a copy from discourse, maybe not. But if they did, it should have the source message-id on it even with your additional rider. Otherwise they’ve got 2 copies of my message, but their mail system doesn’t know they’re copies of the one message. Badness ensues.

So in short: I do not think that your very minor additional unsub text warrants distinct message-ids. Keep just the one.
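
A toy Ruby illustration of the CC case, assuming a mail store that treats copies with the same Message-ID as one message. Real mail systems differ in how and whether they do this. `Copy` and `distinct_messages` are invented names.

```ruby
# One delivered copy of a message: its bare message-id and how it arrived.
Copy = Struct.new(:message_id, :route)

# Keep one copy per Message-ID, the way a store keyed on Message-ID could.
def distinct_messages(copies)
  copies.uniq(&:message_id)
end

# Same id on both routes: the recipient's system can see it is one message.
preserved = [
  Copy.new("abc@mail.example", :direct_cc),
  Copy.new("abc@mail.example", :via_forum),
]

# Forum copy given its own id: the two copies look unrelated.
rewritten = [
  Copy.new("abc@mail.example", :direct_cc),
  Copy.new("topic/1/2.r7@forum.example", :via_forum),
]

distinct_messages(preserved).length # one message
distinct_messages(rewritten).length # two apparently different messages
```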

supermathie · July 27, 2022, 1:56pm

Sorry, I’m just catching up now, here are some thoughts, some of which have already been addressed…

The difficulty here is that what is sent out from Discourse is a different message than the inbound. It has different metadata (for this purpose, To/From/Reply-to/Unsubscribe/etc.) and a different body (it’s customised per user (I think? Does this not happen in mailing list mode?)).

What exactly is the message? Treating 5322 as gospel:

A message consists of header fields, optionally followed by a message body.

The “Message-ID:” field provides a unique message identifier that refers to a particular version of a particular message.

[emphasis mine]

It’s that “particular version” that makes me think it would be inappropriate to re-send an incoming message with a different Message-ID. Though, if you change your point of view from Discourse as “Forum Software” to Discourse being “Mailing List Software” then it kind of makes sense to do so, so I get where you’re coming from. 5322 also says:

There are many instances when messages are “changed”, but those changes do
not constitute a new instantiation of that message, and therefore the message
would not get a new message identifier. For example, when messages are
introduced into the transport system, they are often prepended with
additional header fields such as trace fields (described in section 3.6.7)
and resent fields (described in section 3.6.6). The addition of such header
fields does not change the identity of the message and therefore the original
“Message-ID:” field is retained. In all cases, it is the meaning that the
sender of the message wishes to convey (i.e., whether this is the same
message or a different message) that determines whether or not the
“Message-ID:” field changes, not any particular syntactic difference that
appears (or does not appear) in the message.

I suppose it comes down to, does the sender of the message change when Discourse sends it out?

Maybe we should use Resent-Message-ID and friends?

It’s always been there, all the way back to 822. But as you say later, yes it’s been updated.

5322 also speaks directly to the way Discourse and Github use it:

The “In-Reply-To:” field may be used to identify the message (or messages) to
which the new message is a reply, while the “References:” field may be used to
identify a “thread” of conversation.

Possibly slightly improperly, likely due to the lack of a suitable “Thread Identifier” header. But this interpretation may not be what the RFC authors intended… it doesn’t address messages with a “References” but without “In-Reply-To”.

The tricky bit of this is that we aren’t sending out one email, we’re sending out N - one per recipient - so that their individual metadata (Unsubscribe, etc.) can be correct.

And yes, I did see strong indications during testing that spam determination would be tied to a Message-ID. If it was later seen again (same user or different user) it would be much more likely to be marked spam.

The benefits here, to be fair, are entirely around threading the emails correctly in certain mail clients at the expense of deliverability.

The current topic/#{topic_id}/#{post_id}.s#{sender_user_id}r#{receiver_user_id} at least makes it consistent for a user in their mailbox. The assumption

My biggest concern is the deliverability - it’s hard enough to get email delivered when there is zero visibility from the major providers.

But I do see a strong argument for making Discourse behave more like mailing list software in mailing list mode. @martin I believe we don’t customise the message body in mailing list mode? Do you think it makes sense to take a more strict approach around preserving and reusing Message-IDs in mailing list mode?

sam · July 27, 2022, 9:22pm

I don’t want to be in a situation where perfect is the enemy of good enough here.

github.com: discourse/discourse/blob/368251347547345e888a023a09e94cc999ced327/lib/email/message_id_service.rb#L57-L57

    "<topic/#{topic.id}.#{random_suffix}@#{host}>"

We use “random suffix” now in messages and this is unquestionably causing pain.

We have 3 options on the table:

Random message ids that can not be referred back to
Message ids stable per topic/post/user
Message ids stable per topic/post pair

We are currently in planet (1) which is wreaking havoc.

I worry that we can reach decision paralysis between (2) and (3).

Perhaps we simply start with (2) acknowledging that adding extra ccs to an email from Discourse may cause unexpected behavior, and at least stop the majority of the pain here?

supermathie · July 27, 2022, 10:27pm

ah! I thought we were already doing: topic/#{topic_id}/#{post_id}.s#{sender_user_id}r#{receiver_user_id}

I would be inclined to, in the interest of balancing concerns of email uniqueness & deliverability vs. those of mailing-list-mode, do (2) for mailinglist-mode disabled and (3) for mailinglist-mode enabled.
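
As a rough Ruby sketch of that split: the method name and the exact id layout are assumptions for illustration, not what Discourse ships. The point is only that the id is deterministic in both modes, and that the receiver is part of it only outside mailing list mode.

```ruby
# Option (2): stable per topic/post/user. Option (3): stable per topic/post.
def outbound_message_id(topic_id:, post_id:, receiver_user_id:, host:, mailing_list_mode:)
  left = "topic/#{topic_id}/#{post_id}"
  # Outside mailing list mode, add the receiver so each payload has its own id.
  left += ".r#{receiver_user_id}" unless mailing_list_mode
  "<#{left}@#{host}>"
end
```

Under (2), whatever goes into In-Reply-To and References for a given recipient has to be built with that same recipient's id, otherwise the headers point at a Message-ID that recipient never received.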

Similarly, with the References header, I would be inclined to have it absent for post #1 in a topic and, for every later post, have it reference the topic (so topic/#{topic_id}) and the post to which it’s replying, if any.
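
Spelled out as a Ruby sketch, with a hypothetical method name and ids passed in bare, this References proposal would look roughly like:

```ruby
# Returns the References header value, or nil when the header should be absent.
def references_header(topic_id:, post_number:, host:, reply_to_message_id: nil)
  return nil if post_number == 1 # post #1: no References at all

  ids = ["topic/#{topic_id}@#{host}"]                # the topic-level id
  ids << reply_to_message_id if reply_to_message_id  # the post replied to, if any
  ids.map { |id| "<#{id}>" }.join(" ")
end
```

Note that this keeps the topic-level id in References for replies, which is the part questioned earlier in the thread, since no message is ever sent with that Message-ID.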
