Threading for email-only topics seems broken


(Greg) #1

Hi all,

I’m currently preparing a migration from Google Groups to Discourse (yep, another one, love the importer by the way). Hitting a slight snag with threading on topics created by emails though. Let me see if I can explain… (IDs rewritten due to link limits)

Let’s say I have two users, A and B, and I’m watching the “list” via my inbox.

  • User A starts a new topic via email. His client sends an email which I see in our POP3 mailbox, with
    • ID <foo at gmail dot com>
    • No references / in-reply-to (as you’d expect)
  • Discourse mails me with the topic, we use Mailjet so I get an email with ID <bar at mailjet dot com>

So now I have 1 email in my inbox, so far so good.

  • User B replies by email, and in our POP3 box I see a mail with:
    • ID <quux at yahoo dot com>
    • In-reply-to / References: <bar at mailjet dot com>
  • Discourse mails this reply to me, and in my inbox I get:
    • ID <cat at mailjet dot com>
    • In-reply-to / References: <foo at gmail dot com>

This breaks threading, because the in-reply-to and references on the second email are to a mail I’ve never received (ID <foo at gmail dot com>). So, my client assumes they’re independent.
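
To make the failure concrete, here is a minimal sketch of how a mail client might decide that a message has no parent. The `Message` struct, the `orphaned` helper and the `.example` addresses are all hypothetical stand-ins for the rewritten IDs above; real clients use more elaborate threading rules, so treat this as an illustration only.

```ruby
# Hypothetical, simplified model of a message: just an ID and its References.
Message = Struct.new(:id, :references, keyword_init: true)

# What lands in the watcher's inbox: only the copies relayed by Discourse.
inbox = [
  Message.new(id: "<bar@mailjet.example>", references: []),
  Message.new(id: "<cat@mailjet.example>", references: ["<foo@gmail.example>"])
]

# Returns messages that reference something, but nothing the inbox has seen.
def orphaned(messages)
  known_ids = messages.map(&:id)
  messages.select do |m|
    m.references.any? && (m.references & known_ids).empty?
  end
end

orphans = orphaned(inbox)
puts orphans.map(&:id).inspect
# prints ["<cat@mailjet.example>"]
```

The reply is flagged because its only reference is the original Gmail ID, which never reached this inbox, so the client has nothing to attach it to.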

Is anyone else seeing this? I think I’ve fixed it with the patch below (which I’d be happy to make a PR for if wanted) but I’d like to know if I’m missing something, or if there’s a better fix we could do?

diff --git a/lib/email/sender.rb b/lib/email/sender.rb
index 08e141a..f325078 100644
--- a/lib/email/sender.rb
+++ b/lib/email/sender.rb
@@ -106,10 +107,12 @@ module Email
         # https://www.ietf.org/rfc/rfc2822.txt
         if post.post_number == 1
           @message.header['Message-ID']  = topic_message_id
+          @message.header['References']  = "<topic/#{topic_id}@#{host}>"
         else
           @message.header['Message-ID']  = post_message_id
           @message.header['In-Reply-To'] = referenced_post_message_ids[0] || topic_message_id
-          @message.header['References']  = [referenced_post_message_ids, topic_message_id].flatten.compact.uniq
+          @message.header['References']  = [referenced_post_message_ids, topic_message_id, "<topic/#{topic_id}@#{host}>"].flatten.compact.uniq
         end

         # https://www.ietf.org/rfc/rfc2919.txt

(yes, I know I should put that string in a variable or something, it’s just a testing hack)

Basically, filling in References even for the first post in a thread isn’t (I think) against RFC2822, so we add the topic ID at all times so there’s always a consistent ID. Seems to work in my limited tests. Is that sensible, or entirely insane? :slight_smile:


(Jeff Atwood) #2

This would be a regression; we had bugs opened against us for having a ‘references’ on the first post in the topic. Searching around here should reveal the topic.


(Greg) #3

Thanks for the reply @codinghorror. I’ve had a search around, I think this is the thread you’re referencing

So, OK, I accept that adding References / In-Reply-To to the first post is a hack (although it’s entirely RFC legal - consider what happens when you join a thread halfway through, your client has no history of the prior emails). I can probably drop that part of the patch.

However, I still need to address the larger point in some way - the headers contain IDs that cannot ever have existed in the mail client of any user (except the sender). I see that in the above thread @zogstrip added the patch that moved from using <topic/#id/#post_id> to using the original incoming mail header - but I must be misunderstanding how that’s supposed to work, because it appears to cause the issue outlined in the first post. I think just using line 97 instead of the if/then/else would solve my problem, but I’m sure there’s more to it.

What did I miss?


(Greg) #4

I’ve been digging into this a little more, and I think Mailjet is messing with the outgoing email for the start of the thread. I applied this patch:

diff --git a/lib/email/sender.rb b/lib/email/sender.rb
index 08e141a..be770dd 100644
--- a/lib/email/sender.rb
+++ b/lib/email/sender.rb
@@ -79,29 +79,34 @@ module Email
         topic = Topic.find_by(id: topic_id)
         first_post = topic.ordered_posts.first
 
-        topic_message_id = first_post.incoming_email&.message_id.present? ?
-          "<#{first_post.incoming_email.message_id}>" :
-          "<topic/#{topic_id}@#{host}>"
+        topic_message_id = "<topic/#{topic_id}@#{host}>"
 
-        post_message_id = post.incoming_email&.message_id.present? ?
-          "<#{post.incoming_email.message_id}>" :
-          "<topic/#{topic_id}/#{post_id}@#{host}>"
+        post_message_id = "<topic/#{topic_id}/#{post_id}@#{host}>"
 
         referenced_posts = Post.includes(:incoming_email)
           .where(id: PostReply.where(reply_id: post_id).select(:post_id))
           .order(id: :desc)
 
         referenced_post_message_ids = referenced_posts.map do |post|
-          if post.incoming_email&.message_id.present?
-            "<#{post.incoming_email.message_id}>"
-          else
           if post.post_number == 1
             "<topic/#{topic_id}@#{host}>"
           else
             "<topic/#{topic_id}/#{post.id}@#{host}>"
           end
         end
-        end
+
+       # Debugging
+       Rails.logger.info("Greg: " + topic_message_id)
+       Rails.logger.info("Greg: " + post_message_id)
+       Rails.logger.info("Greg: " + referenced_post_message_ids.join(','))
 
         # https://www.ietf.org/rfc/rfc2822.txt
         if post.post_number == 1

So, my debugging shows that the outgoing email does have <topic/7484@myinstance.org> as the message ID, but the one that lands in my inbox has ID uuid@mailjet.com. Not sure why that’s happening, but I’ll open a ticket with mailjet and check. If I can’t resolve it, then hacking the topic into References may be my only option.
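
For anyone wanting to run the same check, a rough sketch of the comparison is below. It assumes you can get at the raw headers of both the message Discourse logged and the one that arrived; the `message_id` helper and the sample header text are hypothetical, and folded (multi-line) headers are not handled.

```ruby
# Pulls the Message-ID value out of a raw header block.
# Simplified on purpose: assumes the header sits on a single line.
def message_id(raw_headers)
  raw_headers[/^Message-ID:\s*(<[^>]+>)/i, 1]
end

# Sample data only: "uuid" stands in for whatever ID the relay generated.
sent     = "Message-ID: <topic/7484@myinstance.org>\r\nSubject: test\r\n"
received = "Message-ID: <uuid@mailjet.com>\r\nSubject: test\r\n"

rewritten = message_id(sent) != message_id(received)
puts "Message-ID rewritten in transit: #{rewritten}"
```

If the two values differ, something between Discourse and the inbox replaced the ID, and no change to `lib/email/sender.rb` will make the References line up.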

Any ideas for other debugging I might need to do are gratefully received :wink:


(Greg) #5

Confirmed, here’s my reply from Mailjet:

I have checked with our Technical team level 2 about this and unfortunately, it’s impossible to change this rewriting message ID since this is how the system works when you send through us.

I’ve rebuilt the main app with Mailgun SMTP credentials and the headers are fine now. Sorry for the noise - but this might be a useful heads up for anyone else on a Mailjet account.