Discourse email messages are incorrectly threaded

cameron-simpson · 21 July 2022, 6:50 AM

Over on discuss.python.org we are discussing the email side of Discourse. The biggest complaint is the lack of threading. I have had a brief look at the headers and it appears that:

the Message-ID header is at least unique
the In-Reply-To and References headers do not cite the Message-IDs of other messages, let alone the id of the message being replied to
instead, they cite a fictitious message-id based on the topic number

This means that people using email see (a) completely flat, unthreaded discussions and (b) what looks like a missing root message, because the In-Reply-To and References headers cite a message-id that never actually appears on any message.

This is bad, and violates RFC 5322. It makes the email experience far worse than it could easily be.

For example, there is a thread there whose first message has these headers:

Message-ID: <topic/17208.dc83577b18fc3ecc438ed42a@discuss.python.org>
References: <topic/17208@discuss.python.org>

It is the first message. It should not have a References header, because no message anywhere has that id.

The second message has these headers:

Message-ID: <topic/17208/60568.898edf234f56cf6f3a661c1a@discuss.python.org>
In-Reply-To: <topic/17208@discuss.python.org>
References: <topic/17208@discuss.python.org>

Again, the Message-ID is fine, but the In-Reply-To and References are complete nonsense.

This should be easy to fix. The first message should have no In-Reply-To or References headers. The second message should carry the first message's Message-ID in its In-Reply-To and References headers.
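
The rule above can be sketched in a few lines. This is only an illustration, not Discourse code: the `Post` class and the `threading_headers` function are hypothetical names, and the sketch assumes each post already knows its own Message-ID and which post it replies to.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Post:
    """Hypothetical stand-in for a forum post; not Discourse's actual model."""
    message_id: str
    parent: Optional["Post"] = None  # the post being replied to, if any


def threading_headers(post: Post) -> dict:
    """Return the threading headers for the email sent out for `post`."""
    headers = {"Message-ID": post.message_id}
    if post.parent is not None:
        # A reply cites a Message-ID that a real email actually carries.
        headers["In-Reply-To"] = post.parent.message_id
        headers["References"] = post.parent.message_id
    # The first message gets neither In-Reply-To nor References.
    return headers


first = Post("<topic/17208.dc83577b18fc3ecc438ed42a@discuss.python.org>")
second = Post(
    "<topic/17208/60568.898edf234f56cf6f3a661c1a@discuss.python.org>",
    parent=first,
)
```

With this, the email for `first` has only a Message-ID, and the email for `second` cites the Message-ID of `first` rather than a topic placeholder.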

Please see section 3.6.4 of RFC 5322 for the details.

As it stands, email users see flat, unstructured discussions. With these fixes they could have a sensible, easy-to-follow threaded view.

ericsnowcurrently · 21 July 2022, 3:26 PM

In case anyone is interested, the archive of the discussion Cameron is referring to is at <https://mail.python.org/archives/list/python-dev@python.org/message/VHFLDK43DSSLHACT67X4QA3UZU73WYYJ/>.

RGJ · 21 July 2022, 3:59 PM

This looks like a regression; see the old topic and the fix.

cameron-simpson · 21 July 2022, 10:50 PM

I am looking at the diff between HEAD and that fix.

It looks to me as though the current code still always sets References, even when there is no antecedent: topic_canonical_reference_id is used as a fallback. I still think this is wrong, because no email message has that id.

In-Reply-To is slightly more correct, in that it is only set if post.post_number != 1, but it still falls back to topic_canonical_reference_id:

@message.header['In-Reply-To'] = referenced_post_message_ids[0] || topic_canonical_reference_id

To my eye this has two problems:

the fallback should be the Message-ID of post 1 if there are no referenced_post_message_ids, not topic_canonical_reference_id
something in the receipt-of-reply-emails code must be dropping the In-Reply-To header of reply messages, because those should have correctly populated the referenced_post_message_ids array ("list"? I'm new to Ruby)

martin · 22 July 2022, 12:25 AM

Cameron, thank you for opening this topic for discussion and providing so much detail in your posts. I am responsible for this "mess", via these two commits:

github.com/discourse/discourse

FIX: Add random suffix to outbound Message-ID for email (#15179)

committed 12:34AM - 06 Dec 21 UTC

martin-brennan

+304 -104

Currently the Message-IDs we send out for outbound email are not unique; for a post they look like topic/TOPIC_ID/POST_ID@HOST, and for a topic they look like topic/TOPIC_ID@HOST. This commit changes the outbound Message-IDs to also have a random suffix before the host, so the new format is topic/TOPIC_ID/POST_ID.RANDOM_SUFFIX@HOST or topic/TOPIC_ID.RANDOM_SUFFIX@HOST. This should help with email deliverability. This change is backwards-compatible: the old Message-ID format will still be recognized in the mail receiver flow, so people will still be able to reply using Message-IDs, In-Reply-To, and References headers that have already been sent. This commit also refactors Message-ID related logic to a central location, and adds judicious amounts of tests and documentation.
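
To make the two formats concrete, here is a rough sketch of a parser that accepts both the old form and the suffixed form described in the commit message. It is an illustration only: the regular expression and the function name `parse_discourse_message_id` are assumptions, not the code Discourse uses.

```python
import re
from typing import Optional, Tuple

# Old format: topic/TOPIC_ID@HOST or topic/TOPIC_ID/POST_ID@HOST
# New format: the same, with ".RANDOM_SUFFIX" before the "@".
MESSAGE_ID_RE = re.compile(
    r"<?topic/(?P<topic_id>\d+)"
    r"(?:/(?P<post_id>\d+))?"
    r"(?:\.(?P<suffix>\w+))?"
    r"@(?P<host>[^>]+)>?"
)


def parse_discourse_message_id(value: str) -> Optional[Tuple[int, Optional[int]]]:
    """Return (topic_id, post_id) for either format, or None if it does not match."""
    match = MESSAGE_ID_RE.fullmatch(value.strip())
    if match is None:
        return None
    post_id = match.group("post_id")
    return (int(match.group("topic_id")), int(post_id) if post_id else None)
```

A Message-ID that originated elsewhere, such as one generated by a mail client, simply fails to match and yields None.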

github.com/discourse/discourse

FIX: Canonical Message-ID was incorrect for some cases (#15701)

committed 12:36AM - 03 Feb 22 UTC

martin-brennan

+37 -7

When creating a direct message to a group with group SMTP set up, and adding another person to that message in the OP, we send an email to the second person in the OP via the group_smtp job. This in turn creates an IncomingEmail record to guard against IMAP double sync. The issue with this was that this IncomingEmail (which is essentially a placeholder/dummy one) was having its Message-ID used as the canonical References Message-ID for subsequent emails sent out to user_private_message recipients (such as members of the group), causing threading issues in the mail client. The canonical <topic/ID@HOST> format should be used instead for these cases. This commit fixes the issue by only using the IncomingEmail for the OP's Message-ID if the OP was created via our handle_mail email receiver pipeline. It does not make sense to use it in other cases.

We have been aware of some threading issues in email clients such as Thunderbird for a while, but there have not been many consumers of email threading from Discourse, so it was put off. Now that this has come up, we need to spend some time re-examining the issue and working on a solution.

Interestingly, we added that References header to the first email sent and to every subsequent one at the time because it makes threading work correctly in Gmail, but I agree that it is not ideal and is likely causing threading problems, along with the original Message-ID not being used in the In-Reply-To and References headers of subsequent emails.

Please bear with me while I go through the old discussions and the code and work on a fix. In the meantime, are you aware of other email clients in use that are having problems? For example, I know this is an issue in Thunderbird, but what about any others? Thanks.

cameron-simpson · 22 July 2022, 3:27 AM

I wrote a lengthy reply, but received:

We're sorry, but your email message to
["incoming+8349bd9eb1f2b582df4f32dbe85c3363@meta.discoursemail.com"]
(titled Re: [Discourse Meta] [bug] Discourse email messages are
incorrectly threaded) didn't work.

Reason:
Sorry, new users can only put 2 links in a post.
If you can correct the problem, please try again.

I'll post it in the forum once I can retrieve it and review it…

cameron-simpson · 22 July 2022, 3:34 AM

Cameron, thank you for opening this topic for discussion and providing
so much detail in your posts. I am responsible for this mess,
via these two commits:

3b13f1146b2a406238c50d6b45bc9aa721094f46

This looks fine. Does it save this id with the db record so that inbound replies can be tied to the antecedent forum message?

Also, do you want me to verify that the suffix is syntactically legal for RFC5322, in terms of the allowed characters?
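
Such a check can be sketched as follows. This is a simplified illustration: it covers only the plain `<dot-atom-text@dot-atom-text>` form of msg-id from RFC 5322 and deliberately ignores the no-fold-literal and obsolete forms. The name `is_plain_msg_id` is hypothetical.

```python
import re

# atext from RFC 5322 section 3.2.3: letters, digits and these printable characters.
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"
DOT_ATOM_TEXT = rf"{ATEXT}+(?:\.{ATEXT}+)*"
# Only the plain dot-atom form; no-fold-literal and obsolete forms are left out.
MSG_ID_RE = re.compile(rf"<{DOT_ATOM_TEXT}@{DOT_ATOM_TEXT}>")


def is_plain_msg_id(value: str) -> bool:
    """True if `value` has the form <dot-atom-text@dot-atom-text>."""
    return MSG_ID_RE.fullmatch(value) is not None
```

Since "/" and "." are both permitted on the left-hand side, an id such as <topic/17208.dc83577b18fc3ecc438ed42a@discuss.python.org> passes this check.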

82cb67e67b83c444f068fd6b3006d8396803454f

This second commit seems to address another problem we have seen: if a post comes from an email, the outbound message-id sent to email users is not the message-id of the source message from the author. This results in two different messages from the point of view of a mail client, and probably breaks replies made to the original as opposed to the forum-sent copy. For example:

To: the forum
CC: some participant

The participant will (well, may) receive a copy from the forum and a copy directly from the author, and to them these will be two distinct messages because they carry different message-ids.

I was going to file a second bug report about this once the in-reply-to and references headers issue, which is far more important, had been sorted out.

We have been aware of some threading issues in email clients such as Thunderbird for a while now, but there have not been many consumers of
email threading from Discourse so it was put off; now that this has come up, we need to spend some time re-examining the issue and working on a solution.

I and several others use mutt. I am happy to do whatever is needed to help
with debugging and code review. I was also an email sysadmin for many years in a previous life.

[quote=“Cameron Simpson, post:1, topic:233499,
username:cameron-simpson”]
It is the first message. It should not have a References header, because no message anywhere has that id.
[/quote]

Interestingly, we added that References header to the first email sent and to every subsequent one since then because it makes threading work correctly in Gmail,

I believe a correct References header (absent from the first post; in replies, like
In-Reply-To) should work as well. But Gmail has a rather loose relationship with email standards at times. I have a Gmail account; I can do some debugging there too. And in principle we could use this very discussion as a test bed, perhaps.

but I agree that it is not ideal and is likely causing threading problems,
along with the original Message-ID not being used in the In-Reply-To and
References headers of subsequent emails.

Please bear with me while I look through the old discussions and the code and work on a fix.

No worries.

In the meantime, are you aware of other email clients in
use that are having problems? For example, I know this is an issue in Thunderbird, but what about any others? Thanks.

Definitely mutt. At least with mutt it is very easy to see the headers
and to see the reply tree of the thread, which is often obscured in other clients.

Email threading is defined entirely by the Message-ID and In-Reply-To headers. The References header started with USENET for follow-ups, and (there) supported multiple message-ids; In-Reply-To supports only one id. It seems References is also now in RFC5322, and I'll check its semantics.

martin · 22 July 2022, 4:41 AM

I am just gathering my thoughts into a big post on this for later today; thanks for the additional information so far!

martin · 22 July 2022, 6:24 AM

Okay, this is kind of huge, please bear with me. First, thanks for another detailed reply and the offer of debugging / review, it is really helpful. I have actually been looking into this this morning and, surprisingly, the threading in a unified view works in Thunderbird for most cases, and I think the References header consistently pointing to the OP helps with that (for example, the topic Reference in this chain which is always present is <topic/53@discoursehosted.martin-brennan.com>).

The case where threading does not work as intended is:

a post is created inside Discourse and an email is sent to those watching the topic, then
someone else replies to that post and an email is sent to those watching the topic

In the case of the second email, it gets incorrect In-Reply-To and References headers because it generates one at this line (discourse/lib/email/sender.rb at 98bacbd2c6b9fe57167cd32af5eb4839b4a5d1f6 · discourse/discourse · GitHub) instead of using an existing one. It should use the Message-ID of the email that was sent first. In the screenshot, this is where messages following this pattern should be placed.

The answer is: it depends. If a post is created in Discourse from an inbound email, such as this one of yours, we use that post's original inbound Message-ID when someone replies to it for the In-Reply-To and References headers as per:

github.com/discourse/discourse

lib/email/sender.rb

98bacbd2c


      
          if referenced_post.incoming_email&.message_id.present?
            "<#{referenced_post.incoming_email.message_id}>"

Otherwise we are just using the topic OP reference and generating a new reference, which obviously is what is causing all the issues. In all cases we generate a new Message-ID every time an outbound email is sent, which seems correct and on par with other mail clients.

I think I see what you mean, does it go like this:

cameron sends email to Discourse from mutt which gets Message-ID: 74398756983476983@mail.com
Discourse creates a post and stores the Message-ID against the post with an IncomingEmail record
johndoe is watching the topic, so they get sent an email from Discourse with a Message-ID: topic/222/44@discourse.com and no reference to the original Message-ID: 74398756983476983@mail.com

Does that sound correct, that we should just "pass on" that Message-ID to those watching the topic instead of generating our own, since it is already unique? What then happens in johndoe's mail client if cameron also CC'd him on that original outbound message? This does sound like a separate issue, so it would be good to open another bug topic for it.

I will set up a mutt client locally to see what you are also seeing. I have never tested this functionality in a text-based client (only Gmail and Thunderbird), so I am keen to see how it looks anyway.

My line of thinking to address these issues this morning was to dispense with the randomly generated suffixes added when we send Message-ID headers in emails and instead change to a scheme where we use the user_id of both the sending and receiving user. The benefit of this is that there is no need to store the Message-ID anywhere (apart from when an inbound email creates a post), and so References and In-Reply-To headers will always be consistent. Let me give an example. Say we have these users:

martin - user_id 25
cameron - user_id 44
sam - user_id 78
bob - user_id 999

And then we have this topic, topic_id 233499, with posts starting from post_id 100 as the OP. The format would become topic/#{topic_id}/#{post_id}.s#{sender_user_id}r#{receiver_user_id}. The order of operations would look like this:

martin creates the OP

cameron is sent an email with these headers:
Message-ID: topic/233499.s25r44@meta.discourse.org
References: topic/233499@meta.discourse.org
sam is sent an email with these headers:
Message-ID: topic/233499.s25r78@meta.discourse.org
References: topic/233499@meta.discourse.org

cameron replies via email

discourse is sent an email with these headers from mutt:
Message-ID: 43585349859734@test.com
References: topic/233499@meta.discourse.org topic/233499.s25r44@meta.discourse.org
In-Reply-To: topic/233499.s25r44@meta.discourse.org

discourse (as cameron, from the above email) creates post 101

sam is sent an email from discourse with these headers:
Message-ID: topic/233499/101.s44r78@meta.discourse.org
References: 43585349859734@test.com topic/233499@meta.discourse.org
In-Reply-To: 43585349859734@test.com

sam replies via email to cameron

discourse is sent an email with these headers from Gmail:
Message-ID: 5346564746574@gmail.com
References: topic/233499/101.s44r78@meta.discourse.org topic/233499@meta.discourse.org
In-Reply-To: topic/233499/101.s44r78@meta.discourse.org

discourse (as sam, from the above email) creates post 102

cameron is sent an email from discourse with these headers:
Message-ID: topic/233499/102.s78r44@meta.discourse.org
References: 5346564746574@gmail.com topic/233499@meta.discourse.org
In-Reply-To: 5346564746574@gmail.com

bob creates post 103 in the topic, not as a reply to anyone (note that the references here include the Message-ID sent to each user for the OP email)

cameron is sent an email with these headers:
Message-ID: topic/233499/103.s999r44@meta.discourse.org
References: topic/233499@meta.discourse.org topic/233499.s25r44@meta.discourse.org
sam is sent an email with these headers:
Message-ID: topic/233499/103.s999r78@meta.discourse.org
References: topic/233499@meta.discourse.org topic/233499.s25r78@meta.discourse.org

cameron replies via email

discourse is sent an email with these headers from mutt:
Message-ID: 6759850728742572@test.com
References: topic/233499@meta.discourse.org topic/233499/103.s999r44@meta.discourse.org
In-Reply-To: topic/233499/103.s999r44@meta.discourse.org

cameron's inbox

martin - OP
Sent → To: Discourse, Re: OP
sam - reply to the second post
bob - reply in the topic, not to any particular post
Sent → To: Discourse, Re: bob's post

sam's inbox

martin - OP
cameron - second post
Sent → To: Discourse, Re: second post
bob - reply in the topic, not to any particular post

I think this is right; could you go over what I have written for these headers and check whether it is what you would expect from this scenario? The one thing I am a little unsure of is whether I have covered all the references, and of course I will test this on a live set of emails on a development branch before rolling it out. I have not tested anything in mutt yet either.

As a side note, I also looked into what GitHub do with their notification emails, and noticed they do a similar thing where they have an ever-present Reference (discourse/discourse/pull/252@github.com) that is used in all the emails related to that "topic", which in this case is a GitHub pull request:

References: <discourse/discourse/pull/252@github.com> <discourse/discourse/pull/252/issue_event/7042100517@github.com>
In-Reply-To: <discourse/discourse/pull/252/issue_event/7042100517@github.com>

cameron-simpson · 23 July 2022, 12:44 AM

By Martin Brennan via Discourse Meta at 22Jul2022 06:34:

Okay this is kind of huge, please bear with me. First, thanks for
another detailed reply and the offer of debugging / review, it is
really helpful. I’ve actually been looking into this this morning
and, surprisingly, the threading in a unified view works in Thunderbird
for most cases, and I think the References header consistently
pointing to the OP helps with that (for example the topic Reference
in this chain which is always present is
<topic/53@discoursehosted.martin-brennan.com>).

I’ve just reread RFC5322 section 3.6.4 closely. It has moved on from
earlier versions (822 and 2822), and has merged the email In-Reply-To
headers, USENET References headers and modern replies that cite more
than one previous message.

The short summary:

The Message-ID is a single persistent identifier for a message
The In-Reply-To contains all the message-ids of which this message
is a direct reply, so if I reply to a pair of messages it will have
those 2 message-ids
The References is a reply chain of antecedent message-ids from the
OP to the preceding message. So indeed it should always start with
the OP message-id.

So for a discussion like this, pretending that labels are message-ids:

OP
  -> reply1
    -> reply2 ---+
  -> reply3      |
    -> reply4    |
      -> reply5 <+

The reply5 would have:

message-id=reply5
in-reply-to=“reply2 reply4”
references=“OP reply3 reply4”

It is also legal to include “reply1 reply2” in the references (the
other chain to reply5) but the RFC explicitly recommends against that
because some clients expect the references to be a single linear chain
of replies, not some flattened digraph.

So my recommendation for constructing the references is to use the
references of the “primary” antecedent message with the primary
antecedent message’s message-id appended. That way you always get a
linear chain in the correct order.
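
As an illustration of that recommendation, the reply5 example above can be worked through in a few lines. The `PARENTS` table and both function names are made up for this sketch, and treating the first listed parent as the "primary" one is an assumption.

```python
from typing import Dict, List

# message-id -> direct parents; the first parent is treated as "primary".
PARENTS: Dict[str, List[str]] = {
    "OP": [],
    "reply1": ["OP"],
    "reply2": ["reply1"],
    "reply3": ["OP"],
    "reply4": ["reply3"],
    "reply5": ["reply4", "reply2"],
}


def references(message_id: str) -> List[str]:
    """Primary parent's References followed by the primary parent's own id."""
    parents = PARENTS[message_id]
    if not parents:
        return []  # the OP has no References header at all
    primary = parents[0]
    return references(primary) + [primary]


def in_reply_to(message_id: str) -> List[str]:
    """Every message this one directly replies to."""
    return list(PARENTS[message_id])
```

For reply5 this yields the linear chain OP, reply3, reply4 for References, while In-Reply-To holds both reply4 and reply2.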

[image: screenshot of the thread in Thunderbird]

Interestingly there seems to be some threading there.

But notice: the top post has a little “is a reply” arrow, even though it
is post 1. I expect that is because of the “topic” references entry,
which makes TB think there was an earlier message (which of course there
was not).

In mutt-land we see almost no threading at all:

23Jul2022 06:24 Olha via Discus - ┌>[Py] [Users] I need an advise  discuss-users 5.7K
22Jul2022 17:12 Paul Jurczak vi - ├>[Py] [Users] I need an advise  discuss-users 5.5K
22Jul2022 13:21 Rob via Discuss - ├>[Py] [Users] I need an advise  discuss-users 6.8K
22Jul2022 12:53 vasi-h via Disc - ├>[Py] [Users] I need an advise  discuss-users 5.5K
22Jul2022 11:38 Cameron Simpson - ├>[Py] [Users] I need an advise  discuss-users  14K
22Jul2022 10:27 Rob via Discuss - ├>[Py] [Users] I need an advise  discuss-users 6.6K
22Jul2022 06:14 vasi-h via Disc r ┴>[Py] [Users] I need an advise  discuss-users 6.5K

which is because every message’s In-Reply-To points directly at the
fictitious “topic” message-id. Mutt probably ignores the References
because it is a mail reader, and References originates in USENET news.
Maybe Thunderbird is using the references or augmenting the in-reply-to
with references information.

You only need to consult one of In-Reply-To or References to do
threading; the former comes from email and the latter from USENET.
You’re supporting both (which is great!) so we need to make them
consistent.
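
A toy threader makes the effect visible. This is a deliberately simplified sketch, not how mutt or Thunderbird actually work: `build_thread` is a hypothetical helper that groups messages under whatever parent id they cite and reports parents that are missing from the mailbox.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def build_thread(
    messages: List[Tuple[str, Optional[str]]]
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Group (message_id, cited_parent) pairs into parent -> children lists.

    Also returns the parents that are cited but not present in the mailbox.
    """
    present = {message_id for message_id, _ in messages}
    children: Dict[str, List[str]] = defaultdict(list)
    for message_id, parent in messages:
        if parent is not None:
            children[parent].append(message_id)
    missing = [parent for parent in children if parent not in present]
    return dict(children), missing


# Every message, including post 1, cites the same fictitious topic id.
flat = [("post1", "topic"), ("post2", "topic"), ("post3", "topic")]
# The same posts citing real antecedents instead.
threaded = [("post1", None), ("post2", "post1"), ("post3", "post2")]
```

With `flat`, every post hangs off a single parent that does not exist in the mailbox; with `threaded`, the posts form a chain and nothing is missing.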

(Aside: there’s also discussion about USENET mirroring, because several
python people consume the lists via a USENET interface. Again, a
separate topic.)

[…]

[quote=“Cameron Simpson, post:8, topic:233499,
username:cameron-simpson”]
This looks fine. Does it save this id with the db record so that inbound
replies can be tied to the antecedent forum message?
[/quote]

The answer is – it depends. If a post is created in Discourse from an inbound email, such as this one of yours, we use that post’s original inbound Message-ID when someone replies to it for the In-Reply-To and References headers as per:

discourse/lib/email/sender.rb at 98bacbd2c6b9fe57167cd32af5eb4839b4a5d1f6 · discourse/discourse · GitHub

Otherwise we are just using the topic OP reference and just generating a new reference, which obviously is what is causing all the issues. In all cases we generate a new Message-ID every time an outbound email is sent, which seems correct and on par with other mail clients.

Alas, not quite. If you’re the origin of the message (i.e. authored in
Discourse), generating the message-id is fine. If there’s no message-id
(illegal) generating one is standard practice (usually by MTAs). But if
you’re passing a message on (authored in email), the existing message-id
should be preserved.

To my mind you need to be doing 3 things:

1. having a stable message-id and not replacing the message-id from an
inbound message
2. generating correct In-Reply-To, which is easily computed from the
immediate antecedent message(s) i.e. antecedent(s)-Message-ID
3. generating correct References, which is easily computed as
antecedent-References + antecedent-Message-ID

For point 1, looking at the code you cite, you probably want the email
message id to be (Pythonish syntax, sorry):

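def message_id(post):
    # Prefer the inbound email's Message-ID; otherwise use a Discourse-generated one.
    incoming = post.incoming_email  # None when the post was authored in Discourse
    return (incoming and incoming.message_id) or discourse_message_id(post)

i.e. to be the post’s email message-id if it originated from email,
otherwise the Discourse message-id using something like the algorithm
you outline later in this message: anything (a) stable and (b)
syntactically valid.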

Then computing the In-Reply-To and References fields is simple
mechanical stuff as in points 2 and 3.

Cameron Simpson:

This second commit seems to address another problem we have seen: if a
post comes from an email, the outbound message-id sent to email users is
not the message-id of the source message from the author. This results
in two different messages from the point of view of a mail client, and
probably breaks replies made to the original as opposed to the
forum-sent copy.

I think I see what you mean, does it go like this:

cameron sends email to Discourse from mutt which gets Message-ID: 74398756983476983@mail.com

Discourse creates a post and stores the Message-ID against the post with an IncomingEmail record

Correct.

johndoe is watching the topic, so they get sent an email from Discourse with a Message-ID: topic/222/44@discourse.com and no reference to the original Message-ID: 74398756983476983@mail.com

No. You really want to pass through IncomingEmail.message_id as the
Message-ID in the email to johndoe. It’s the same message.

Does that sound correct, that we should just “pass on” that Message-ID to those watching the topic instead of generating our own since it’s already unique? What then happens in johndoe’s mail client if
cameron also CC’d him on that original outbound message? This does sound like a separate issue so it would be good to open another bug topic for it.

By passing it on, the original message (cameron->cc:johndoe) and the
Discourse forwarded message (cameron->Discourse->johndoe) have the same
message-id and the same message contents. The receiving mail system
stores both. The mail reader sees both, and either presents both or
keeps just one (this is a policy decision of the mail reader - keeping
just one is common). Because they’re the same message, in general it
does not matter which is kept.
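
The "keep just one" policy amounts to de-duplicating on Message-ID. A minimal sketch, assuming deliveries are represented as (message_id, route) pairs; the function name and the representation are hypothetical and not taken from any particular mail reader:

```python
from typing import Dict, Iterable, List, Tuple


def dedupe_by_message_id(deliveries: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep the first delivery seen for each Message-ID.

    Each delivery is a (message_id, route) pair.
    """
    kept: Dict[str, Tuple[str, str]] = {}
    for message_id, route in deliveries:
        # setdefault leaves an existing entry untouched, so the first copy wins.
        kept.setdefault(message_id, (message_id, route))
    return list(kept.values())
```

If the forum copy carried a different Message-ID from the direct copy, nothing would collapse and the recipient would see two apparently unrelated messages.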

The same applies if we ignore Discourse and consider an ordinary mailing
list: someone who is also CC’d receives one copy of the message via the
list and another via direct email. They’re the same message, with the
same message-id.

Cameron Simpson:

I and several others use mutt.

I will set up a mutt client locally to see what you are also seeing, I have never tested this functionality in a text-based client (only Gmail and Thunderbird) so I am keen to see how it looks anyway.

Happy to help with settings. For threaded view you need to set the
sorting to threaded. Mutt is very configurable.

My line of thinking to address these issues this morning was to dispense
with the randomly generated suffixes generated when we send
Message-ID headers in emails and instead change to a scheme where we
use the user_id of both the sending and receiving user. The benefit
of this is that there is no need to store the Message-ID anywhere
(apart from when an inbound email creates a post) and so References
and In-Reply-To headers will always be consistent.

Yes, that is much better. Noting that the inbound email message-id
should override the Discourse derived message-id for the outbound email.

(Most mail systems use random strings because there’s no surrounding
context such as the discourse topic message structure - messages are
considered alone; but the only real requirement is persistent
uniqueness.)

Let me give an example. Say we have these users:

martin - user_id 25

cameron - user_id 44

sam - user_id 78

bob - user_id 999

And then we have this topic, topic_id 233499, with posts starting from post_id 100 as the OP. The format would become topic/#{topic_id}/#{post_id}.s#{sender_user_id}r#{receiver_user_id}.

The order of operations would look like this:

martin creates the OP

cameron is sent an email with these headers:

Message-ID: topic/233499.s25r44@meta.discourse.org

References: topic/233499@meta.discourse.org

sam is sent an email with these headers:

Message-ID: topic/233499.s25r78@meta.discourse.org

References: topic/233499@meta.discourse.org

There should not be a References header in the OP. It isn’t
needed for threading and effectively pretends there’s some “post 0”
which doesn’t exist. It means every OP (a) looks like a reply, which it
is not and (b) looks like the thing to which it is a reply is missing
from the reader’s mailbox.

This also makes different message-ids for each outbound copy of the OP.
That’s bad. They need to be the same. Suppose sam CCs cameron
directly in a reply: the In-Reply-To will cite a message-id cameron
has never received.

You can just drop the sender_user_id and receiver_user_id from the
message-id field and get a single unique id which every receiver sees.

The uniqueness constraint is the post itself, not the individual
email-level “message” object.
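
The difference can be shown with two small formatting helpers. Both functions are illustrative only; the per-recipient format follows the scheme proposed earlier in the thread, and the per-post format is simply that scheme with the sender and receiver parts dropped.

```python
HOST = "meta.discourse.org"


def per_recipient_id(topic_id: int, post_id: int, sender_id: int, receiver_id: int) -> str:
    """The proposed scheme: a different Message-ID for every recipient."""
    return f"<topic/{topic_id}/{post_id}.s{sender_id}r{receiver_id}@{HOST}>"


def per_post_id(topic_id: int, post_id: int) -> str:
    """One Message-ID per post, shared by every recipient."""
    return f"<topic/{topic_id}/{post_id}@{HOST}>"


# martin (25) posts; cameron (44) and sam (78) each receive a copy.
cameron_copy = per_recipient_id(233499, 100, 25, 44)
sam_copy = per_recipient_id(233499, 100, 25, 78)

# sam replies and CCs cameron directly: In-Reply-To cites sam's copy,
# which is not an id cameron's mailbox has ever seen.
reply_cites_unknown_id = sam_copy != cameron_copy

# With a per-post id, every recipient holds the id that the reply cites.
shared = per_post_id(233499, 100)
```

Here `reply_cites_unknown_id` is true under the per-recipient scheme, which is exactly the breakage described above, while `shared` is the same string in every mailbox.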

Re the References, the OP should not have one. TB and everything else
will be fine. If they’re threading using References instead of
In-Reply-To, the References in the reply messages are enough.

Here’s the start of a mailing list discussion thread in Mutt:

16Jul2022 01:09 Rob Boehne      - │├>[Python-Dev] Re: [SPAM] Re: Swit python-dev 9.2K
16Jul2022 01:33 Peter Wang      - │├>                                 python-dev 3.0K
16Jul2022 00:24 Skip Montanaro  - ├>[Python-Dev] Re: Switching to Dis python-dev 4.2K
16Jul2022 04:49 Erlend Egeberg  - ├>[Python-Dev] Re: Switching to Dis python-dev  10K
16Jul2022 04:20 Mariatta        - ├>[Python-Dev] Re: Switching to Dis python-dev  10K
15Jul2022 21:18 Petr Viktorin   - [Python-Dev] Switching to Discourse python-dev 4.2K

Ignore that I sort my email newest-on-top. See that there’s no arrow on
the initial post (at the bottom). That message has no References and
no In-Reply-To. All the others have In-Reply-To (and possibly
References, but this is an email mailing list so not necessarily; as I
mentioned before they’re complementary.)

If I repeat my Discourse example from earlier:

23Jul2022 06:24 Olha via Discus - ┌>[Py] [Users] I need an advise  discuss-users 5.7K
22Jul2022 17:12 Paul Jurczak vi - ├>[Py] [Users] I need an advise  discuss-users 5.5K
22Jul2022 13:21 Rob via Discuss - ├>[Py] [Users] I need an advise  discuss-users 6.8K
22Jul2022 12:53 vasi-h via Disc - ├>[Py] [Users] I need an advise  discuss-users 5.5K
22Jul2022 11:38 Cameron Simpson - ├>[Py] [Users] I need an advise  discuss-users  14K
22Jul2022 10:27 Rob via Discuss - ├>[Py] [Users] I need an advise  discuss-users 6.6K
22Jul2022 06:14 vasi-h via Disc r ┴>[Py] [Users] I need an advise  discuss-users 6.5K

See they all have a leading arrow? That is because the mail client
believes they are all replies to a common (and missing) root message,
which is because of the “topic” message-id in the References header.
Whereas post 1 is actually the bottom message displayed above.

Summary:

your plan is good, provided you drop the sender and receiver from the
message-id - they’re unnecessary and in fact the receiver will cause
trouble (the sender is just redundant).
drop the “topic” pseudo-message-id from the References - it misleads
email clients (including TB, even if it isn’t visually evident)

cameron replies via email

discourse is sent an email with these headers from mutt:

Message-ID: 43585349859734@test.com

References: topic/233499@meta.discourse.org topic/233499.s25r44@meta.discourse.org

In-Reply-To: topic/233499.s25r44@meta.discourse.org

Yes, again with the caveat that there should not be a “topic” reference.
As expected, there is a reference to the OP message-id. Though it should
be the same message-id that sam sees for the OP.

discourse (as cameron, from the above email) creates post 101

sam is sent an email from discourse with these headers:

Message-ID: topic/233499/101.s44r78@meta.discourse.org

References: 43585349859734@test.com topic/233499@meta.discourse.org

In-Reply-To: 43585349859734@test.com

And here it goes wrong. The Message-ID should be
43585349859734@test.com from the .incoming_email.message_id field.
(Well, in my mind this is post.message_id(), which returns
post.incoming_email.message_id for an email generated post and your
Discourse generated one otherwise).

Consider: I compose and send my reply with message-id
43585349859734@test.com. For continuity reasons, I keep a copy of that
in my local folder, where it shows as a reply to the OP. Ideally
Discourse also sends me a copy of my own post (this is a policy setting
on many mailing lists), so I get Discourse’s version also. That should
have the same message-id, because it is the same message, just via a
different route.

Discourse’s message is not “in reply to” my message. It is my
message, just forwarded.

This effect cascades through your following examples. The actual process
should be simpler than you’ve made it.

Think of it this way. If I reply to a post from email, it effectively is
like me emailing sam (and the others) via Discourse. Discourse
forwards my message to the email-receiving subscribers, and
“incidentally” keeps a copy on the forum.

As a side note, I also looked into what GitHub do with their
notification emails, and noticed they do a similar thing where they
have an ever-present Reference
(discourse/discourse/pull/252@github.com) that is used in all the
emails related to that “topic” which in this case is a GitHub pull
request:
References: <discourse/discourse/pull/252@github.com> <discourse/discourse/pull/252/issue_event/7042100517@github.com>
In-Reply-To: <discourse/discourse/pull/252/issue_event/7042100517@github.com>

Hoo, github. What a disaster their issue emails are.

However, in their scenario, the PR is the OP. So a reference directly
to the pull is sane. You could use the “topic” message-id for post 1,
provided you didn’t also use the “topic/1” id as well. But there seems
little point - it is extra effort to special case post 1 - I’d just use
“topic/1” myself.

To add some complication. As I understand it, an admin can move a post
or topic. Doesn’t that break the “generate the message-id” scheme,
particularly if they move just a post? I’m somewhat of the opinion that
every post should have a _message_id field, filled in from the
incoming message (from email) or generated (posting via Discourse). Then
it is persistent and stable and robust against any shuffling of posts or
changes of algorithm.

Finally, there’s a small security consideration: you should ignore the
inbound email message-id (and potentially bounce the message) if it
claims the message-id of an existing post. Since as an author, I can put
anything I like in that header I’d go with just dropping the
message-id - accept the post, but don’t let it lie about being some
other post - give your copy the Discourse-generated id and then proceed
as normal.
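
A minimal sketch of the two ideas above (a persistent per-post message-id, and refusing an inbound id that claims to be an existing post). The `MessageIdRegistry` class, its in-memory set and the `forum.example` host are all hypothetical stand-ins, not Discourse's actual schema or code:

```ruby
require "securerandom"
require "set"

# Hypothetical in-memory stand-in for a posts table with a message-id column.
class MessageIdRegistry
  def initialize(host)
    @host = host
    @known_ids = Set.new
  end

  # Returns the Message-ID (angle brackets included) to store on a new post.
  # inbound_id is the Message-ID header of the incoming email, or nil for a
  # post written in the web interface.
  def assign(topic_id, post_id, inbound_id = nil)
    id = inbound_id
    # An author can put anything in that header, so an email is never allowed
    # to reuse the id of an existing post; a generated id is used instead.
    id = generate(topic_id, post_id) if id.nil? || @known_ids.include?(id)
    @known_ids << id
    id
  end

  private

  # Assumed format: topic and post numbers plus a random suffix.
  def generate(topic_id, post_id)
    "<topic/#{topic_id}/#{post_id}.#{SecureRandom.hex(12)}@#{@host}>"
  end
end
```

Once `assign` has returned, the id would be stored with the post and never recomputed, so moving a post or changing the generation scheme later would not alter it.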

martin · 25 July 2022, 12:18 AM

Thanks so much again for this in-depth, excellent response. It will take some time to process this and turn it into actionable items, so please bear with us (also, I have a few other high-priority internal projects I am working on at the moment). I think with this information we will be able to make our threading systems more robust and spec-compliant. I may have further questions as I go through your post, thank you Cameron.

cameron-simpson · 25 July 2022, 2:59 AM

By Martin Brennan via Discourse Meta at 25Jul2022 00:28:

Thanks so much again for this in-depth, excellent response. It will take some time to process this and turn it into actionable items, so please bear with us (also, I have a few other high-priority internal projects I am working on at the moment). I think with this information we will be able to make our threading systems more robust and spec-compliant. I may have further questions as I go through your post, thank you Cameron.

Sure. Cheers, Cameron Simpson

cameron-simpson · 25 July 2022, 3:02 AM

By the way, I notice that this follow-up post from you has these headers:

Message-ID: <topic/233499/1137586.d14eea2849d76c355ec214fb@meta.discourse.org>
In-Reply-To: <YttEVzlTh/ymDSPT@cskk.homeip.net>
References: <topic/233499@meta.discourse.org>
      <YttEVzlTh/ymDSPT@cskk.homeip.net>

i.e. it has preserved my original email message-id. So the In-Reply-To is correct, and the References at least has my email message-id in it.

That was not what we were seeing on discuss.python.org.

Cheers,
Cameron Simpson

martin · 26 July 2022, 12:17 AM

Ah, that is an interesting observation, I had not noticed the little arrow.

This is also super interesting. I believe (without examining the source) Thunderbird does do that, and likely the Gmail UI as well since it does the same thing.

We do seem to be doing this but I guess not consistently? Basically we need to make sure that:

TODO #1 - If a post has an associated IncomingEmail record, we always use that Message-ID when sending email.

TODO #2 - Do not use a References header when sending out emails related to the OP of the topic. @cameron-simpson one question though: if the OP was created via an inbound email, would we use that Message-ID in References for the OP or still exclude it?

This is interesting, I thought every recipient of the email had to have a unique Message-ID? In fact I believe this is why we went down the path of adding uniqueness to each recipient’s Message-ID, to avoid spam behaviours, looking back on our internal topic. Perhaps @supermathie, who is on our infra team and was doing a bunch of testing with email earlier in the year, could weigh in here too?

What you are saying is that it’s more that the post should be the thing determining a single Message-ID for all recipients. So perhaps we just generate one for each post that generates an email? Then we could also move the IncomingEmail.message_id to here as well. Tentatively, the change we would need to make is:

TODO #3 - Add an outbound_message_id to the Post table. Generate it once when an email is first sent in relation to the post. Use it for subsequent References and In-Reply-To headers. Set its value when a post is created from an IncomingEmail. Format should be topic/:topic_id/:post_id/:random_alphanumeric_string@host e.g. topic/233499/33545/gvy8475y7c45y87554c@meta.discourse.org

After this change my first example would become this:

martin creates the OP

cameron is sent an email with these headers:
Message-ID: topic/233499/33545/gvy8475y7c45y87554c@meta.discourse.org
sam is sent an email with these headers:
Message-ID: topic/233499/33545/gvy8475y7c45y87554c@meta.discourse.org

With the consideration also that the OP does not have special handling, it will no longer be in the format topic/:topic_id@hostname.

TODO #4 - Ensure that correct In-Reply-To and References headers are generated based on PostReply records and the new outbound_message_id column on the Post table

I think we have some consideration for this, I will double-check.

It definitely seems that way :sweat_smile:

Can you confirm the TODOs here sound reasonable, Cameron? It really doesn’t seem like much now that I look at it. I also wonder, when I get to this work, would you be open to joining a testing Discourse instance with me that will have the WIP changes deployed to it, so we can email back and forth and test that things are working correctly? I will of course do testing of my own before I involve you.

If not, that’s fine too: I have Thunderbird and will be setting up mutt, and I can test it all out there.

sam · 26 July 2022, 2:24 AM

@cameron-simpson I wanted to clarify one thing here, which is the scope of “message_id”.
The thing that started this whole dance was a strong suspicion from @supermathie that our non-unique message-ids were causing problems.
Discourse generates a unique email per user for every email it sends. So, for example, say two users are watching this topic:

User 1 gets payload 1 with a distinct unsubscribe link targeted at user 1
User 2 gets payload 2 with a distinct unsubscribe link targeted at user 2
If our message-id in both cases is discourse_topic_100/23 (e.g. topic_id/post_number), then we are telling the MTAs out there that discourse_topic_100/23 can be two different payloads, and the hypothesis is that they treat this as a spam signal.

Hey Discourse … you just sent two emails as discourse_topic_100/23, what’s up?

Because Discourse controls all email transport, and emails are not added to a BCC or CC list as with traditional mailing lists, we can afford clean per-user unsubscribe links.
What are your thoughts here? What about the simple change of using discourse_topic_100/23/7333, for example (topic_id, post_number, user_id), as the unique id for the email? It is certainly a unique payload, and we can easily refer to it when generating emails for the user.

cameron-simpson · 26 July 2022, 2:46 AM

By Martin Brennan via Discourse Meta at 26Jul2022 00:27:

[quote=“Cameron Simpson, post:11, topic:233499,
username:cameron-simpson”]
Mutt probably ignores the References
because it is a mail reader, and References originates in USENET news.
Maybe Thunderbird is using the references or augmenting the in-reply-to
with references information.

You only need to consult one of In-Reply-To or References to do
threading; the former comes from email and the latter from USENET.
You’re supporting both (which is great!) so we need to make them
consistent.
[/quote]

This is also super interesting. I believe (without examining the source) Thunderbird does do that, and likely the Gmail UI as well since it does the same thing.

I think mutt will use both, but probably just In-Reply-To if present,
falling back to References. I’d need to check the source.

With References you do at least know the full chain to the OP; with
In-Reply-To you more or less need the antecedent messages around to
stitch things together. For mailing lists I usually keep the whole
thread locally until it’s done anyway, and I expect that is common.

Cameron Simpson:

But if
you’re passing a message on (authored in email), the existing message-id
should be preserved.

Cameron Simpson:

i.e. it has preserved my original email message-id. So the In-Reply-To
is correct, and the References at least has my email message-id in it.

Cameron Simpson:

No. You really want to pass through IncomingEmail.message_id as the
Message-ID in the email to johndoe. It’s the same message.

We do seem to be doing this but I guess not consistently? Basically we need to make sure that:

TODO #1 - If a post has an associated IncomingEmail record, we always use that Message-ID when sending email.

Yes. This is why I was thinking it might be sanest to have an explicit
field for the message-id, and to fill it in once. Then use that from
then on always, regardless of any changes to the process in which the
message-id is manufactured in the code later.

Cameron Simpson:

There should not be a References header in the OP. It isn’t
needed for threading and effectively pretends there’s some “post 0”
which doesn’t exist. It means every OP (a) looks like a reply, which it
is not and (b) looks like the thing to which it is a reply is missing
from the reader’s mailbox.

Cameron Simpson:

drop the “topic” pseudo-message-id from the References - it misleads
email clients (including TB, even if it isn’t visually evident)

TODO #2 - Do not use a References when sending out emails related to the OP of the topic .

Yes. The OP has no antecedent, so there’s no References or
In-Reply-To.

@cameron-simpson one question though – if the OP was created via an
inbound email, would we use that Message-ID in References for the
OP or still exclude it?

Still exclude. But use it as the persistent message-id for the OP.

So a message authored by email (OP or reply) gets its message-id from
the email. One authored on the web gets one when the user presses
Submit, generated by Discourse. From then on, that’s the message-id,
however created.

[quote=“Cameron Simpson, post:11, topic:233499,
username:cameron-simpson”]
You can just drop the sender_user_id and receiver_user_id from the
message-id field and get a single unique id which every receiver sees.

The uniqueness constraint is the post itself, not the individual
email-level “message” object.
[/quote]

Cameron Simpson:

your plan is good, provided you drop the sender and receiver from the
message-id - they’re unnecessary and in fact the receiver will cause
trouble (the sender is just redundant).

Cameron Simpson:

To add some complication. As I understand it, an admin can move a post
or topic. Doesn’t that break the “generate the message-id” scheme,
particularly if they move just a post? I’m somewhat of the opinion that
every post should have a _message_id field, filled in from the
incoming message (from email) or generated (posting via Discourse). Then
it is persistent and stable and robust against any shuffling of posts or
changes of algorithm.

This is interesting, I thought every recipient of the email had to have a unique Message-ID?

No. The message-id identifies the “message”. Not the individual copy. I
might post to the forum and CC someone directly. If that someone gets a
copy direct from me and also via the forum, they should have the same
message-id.

In fact I believe this is why we went down the path of adding
uniqueness to each recipient’s Message-ID, to avoid spam behaviours,
looking back on our internal topic. Perhaps @supermathie , who is on
our infra team and was doing a bunch of testing with email earlier in
the year, could weigh in here too?

Maybe. But on the face of it, threading is indeed broken. Certainly
sending the same message to many people should have the same message-id,
and generally, as a forwarder (email->discourse->email-recipients)
Discourse should not be modifying the message-ids.

What you are saying is that it’s more that the post should be the thing determining a single Message-ID for all recipients. So perhaps we just generate one for each post that generates an email?

Every post should have one stable unique message-id for use in the email
side. If the post originated from an email, that original message-id
should be used. Otherwise (via the web interface) Discourse should be
generating a message-id and storing it with the post.

Then we could also move the IncomingEmail.message_id to here as well.

Sure. Having a distinct set of fields (message-id seems enough)
containing the email-side state should do it.

Tentatively, the change we would need to make is:

TODO #3 - **Add an outbound_message_id to the Post table. Generate
it once when an email is first sent in relation to the post.

If you got the post from an email, you should be using that, not
generating a new one.

Use it for subsequent References and In-Reply-To headers. Set its
value when a post is created from an IncomingEmail.

Yes. To the message-id from the email.

Format should be
topic/:topic_id/:post_id/:random_alphanumeric_string@host e.g.
topic/233499/33545/gvy8475y7c45y87554c@meta.discourse.org**

For ones you generate yourselves, this looks good to me.

After this change my first example would become this:

martin creates the OP

cameron is sent an email with these headers:

Message-ID: topic/233499/33545/gvy8475y7c45y87554c@meta.discourse.org

sam is sent an email with these headers:

Message-ID: topic/233499/33545/gvy8475y7c45y87554c@meta.discourse.org

Yes.

But note: the message-id only needs to be stable and unique. If the
topic/:topic_id/:post_id@host is stable and will never be regenerated,
that will do. But if you’re concerned about that (eg db restores or
migrations or imports bringing those same numbers) then the random
string will make it robust against collision.

Note that the message-id left part is dot-atom-text, defined here:

which is alphas and digits and a limited set of punctuation characters
(which includes “/”).

Um, your headers. They should have:

Message-ID: <topic/233499/33545/gvy8475y7c45y87554c@meta.discourse.org>

Note the angle brackets. The message-id is formally the bit between the
angle brackets, and the angle brackets are mandatory. Syntax here:
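
As a rough illustration of the two points above (the restricted character set and the mandatory angle brackets), here is a simplified check. It only accepts the dot-atom-text form on both sides of the “@” and ignores the obsolete and literal forms that RFC 5322 also permits; the method name `valid_message_id?` is made up for this sketch:

```ruby
# Simplified Message-ID syntax check: "<" dot-atom-text "@" dot-atom-text ">".
# Deliberately stricter than RFC 5322 (no obsolete forms, no domain literals).
def valid_message_id?(value)
  # atext from RFC 5322: letters, digits and a fixed set of punctuation,
  # which includes "/" but not "@", "<", ">" or whitespace.
  atext = 'A-Za-z0-9!#$%&\'*+\-/=?^_`{|}~'
  dot_atom_text = "[#{atext}]+(?:\\.[#{atext}]+)*"
  /\A<#{dot_atom_text}@#{dot_atom_text}>\z/.match?(value)
end
```

With this check, `<topic/233499/33545/gvy8475y7c45y87554c@meta.discourse.org>` passes, while the same string without the angle brackets does not.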

Cameron Simpson:

But there seems
little point - it is extra effort to special case post 1 - I’d just use
“topic/1” myself.

With the consideration also that the OP does not have special handling, it will no longer be in the format topic/:topic_id@hostname.

Sounds good.

Cameron Simpson:

generating correct In-Reply-To, which is easily computed from the
immediate antecedent message(s) i.e. antecedent(s)-Message-ID

generating correct References, which is easily computed as
antecedent-References + antecedent-Message-ID

TODO #4 - Ensure that correct In-Reply-To and References headers are generated based on PostReply records and the new outbound_message_id column on the Post table

Thanks.
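
The rule quoted above can be sketched in a few lines. This is only an illustration under stated assumptions: `PostRecord` is a hypothetical struct holding a post's stored message-id and the References list it was sent with, not the real Post or PostReply models:

```ruby
# Hypothetical minimal post record; field names are illustrative only.
PostRecord = Struct.new(:message_id, :references, keyword_init: true)

# Builds the threading headers for a new post from the post it replies to.
# parent is nil for the first post of a topic.
def threading_headers(parent)
  return {} if parent.nil? # the OP has no antecedent, so no headers at all

  {
    "In-Reply-To" => parent.message_id,
    # antecedent-References + antecedent-Message-ID
    "References" => (parent.references + [parent.message_id]).join(" ")
  }
end

op = PostRecord.new(message_id: "<topic/1/1.abc@forum.example>", references: [])
reply = PostRecord.new(message_id: "<YttEVzlTh/ymDSPT@cskk.homeip.net>",
                       references: [op.message_id])

first_headers  = threading_headers(nil)   # headers for the OP itself
second_headers = threading_headers(op)    # headers for a reply to the OP
third_headers  = threading_headers(reply) # headers for a reply to that reply
```

Because each post's References is derived from its parent's, the chain back to the OP comes out automatically, and it makes no difference whether a given message-id was generated by the forum or taken from an incoming email.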

Cameron Simpson:

Finally, there’s a small security consideration: you should ignore the
inbound email message-id (and potentially bounce the message) if it
claims the message-id of an existing post. Since as an author, I can put
anything I like in that header I’d go with just dropping the
message-id - accept the post, but don’t let it lie about being some
other post - give your copy the Discourse-generated id and then proceed
as normal.

I think we have some consideration for this, I will double-check.

+1

Cameron Simpson:

This effect cascades through your following examples. The actual process
should be simpler than you’ve made it.

It definitely seems that way

Can you confirm the TODOs here sound reasonable Cameron?

They seem correct to me.

It really doesn’t seem like much now that I look at it. I also wonder,
when I get to this work would you be open to joining a testing
Discourse instance with me that will have the WIP changes deployed to
it so we can email back and forth and test that things are working
correctly? I will of course do testing of my own before I involve you.

Certainly. Happy to help in whatever way.

If not, that’s fine too – I have Thunderbird and will be setting up
mutt and I can test it all out there

I can help you with mutt if you want it too.

cameron-simpson · 26 July 2022, 3:50 AM

I think you can still send distinct messages with the same message-id, even with minor differences like this.

Ordinary mailing lists do this all the time, to varying degrees. At the least, some header munging always happens. But sometimes the message body gets modified too. A glaring example is python-list, which discards nontext attachments. The message goes through with the same message-id. And almost every list adds a footer with, for example, a link to the list admin page or an unsubscribe link. That would not be in the message as it arrived.

And there have been lengthy discussions about content signing, which turn on what should be covered by a signature.

So I would be entirely fine with adding the recipient-specific unsubscribe link while preserving the original message-id. The benefits far outweigh the loss of threading you get if you give every copy of the message an individual message-id.

Again, think of the email user. I can reply to a discussion message and CC an interested outside person. Maybe they get a copy from the discussion, maybe not. But if they do, it should carry the source message-id even with your extra footer added. Otherwise they will have two copies of my message, but their mail system does not know that they are copies of the same message. Badness ensues.

So in short: I do not think your small extra unsubscribe text warrants distinct message-ids. Keep just the one.
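
To make the CC scenario concrete, here is a toy model of a mail reader that treats copies sharing a Message-ID as one message. The hashes and the `unique_messages` helper are invented for this sketch; real mail clients differ in how (and whether) they collapse duplicates:

```ruby
# Hypothetical mail-reader behaviour: copies that share a Message-ID are
# treated as one message, whatever route they arrived by.
def unique_messages(mails)
  mails.uniq { |mail| mail[:message_id] }
end

# The same reply, received once directly (CC) and once via the forum.
direct    = { message_id: "<43585349859734@test.com>", via: "direct CC" }
via_forum = { message_id: "<43585349859734@test.com>", via: "forum" }
# The forum copy again, but with a rewritten per-recipient id.
rewritten = { message_id: "<topic/1/2.s5r9@forum.example>", via: "forum" }

preserved_count = unique_messages([direct, via_forum]).size
rewritten_count = unique_messages([direct, rewritten]).size
```

With the original id preserved, the two copies collapse into one message; with a rewritten id, the reader is left with two apparently unrelated messages.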

supermathie · 27 July 2022, 1:56 PM

Sorry, I am only catching up now; here are some thoughts, some of which have already been addressed…

The difficulty here is that what is sent out of Discourse is a different message from the one that came in. It has different metadata (to that end: To/From/Reply-To/Unsubscribe/etc.) and a different body (customised per user (I think? does this not happen in mailing list mode?)).

What exactly is a message? Looking at 5322 as gospel:

A message consists of header fields, optionally followed by a message body.

The “Message-ID:” field provides a unique message identifier that refers to a particular version of a particular message.

[emphasis mine]

That “particular version” is what makes me think it would be inappropriate to re-send an incoming message with a different message-id. That said, if you shift your perspective from Discourse as “forum software” to Discourse as “mailing list software”, it does kind of make sense, so I see your point. 5322 also says:

There are many instances when messages are “changed”, but those changes do not constitute a new instantiation of that message, and therefore the message would not get a new message identifier. For example, when messages are introduced into the transport system, they are often prepended with additional header fields such as trace fields (described in section 3.6.7) and resent fields (described in section 3.6.6). The addition of such header fields does not change the identity of the message and therefore the original “Message-ID:” field is retained. In all cases, it is the meaning that the sender of the message wishes to convey (i.e., whether this is the same message or a different message) that determines whether or not the “Message-ID:” field changes, not any particular syntactic difference that appears (or does not appear) in the message.

I suppose it comes down to whether the sender of the message changes when Discourse sends it out?

Perhaps we should be using Resent-Message-ID and friends?

It has always been there, since 822. But as you say later, yes, it has been updated.

5322 also speaks directly to the way Discourse and GitHub use it:

The “In-Reply-To:” field may be used to identify the message (or messages) to which the new message is a reply, while the “References:” field may be used to identify a “thread” of conversation.

Perhaps slightly incorrectly, most likely for lack of a proper “thread id”. But that interpretation may not be what the RFC authors intended… it does not address messages that have “References” but no “In-Reply-To”.

The tricky part of this is that we are not sending one email, we are sending N, one per recipient, so that their individual metadata (unsubscribe, etc.) can be correct.

And yes, I saw strong indications during testing that spam determination would be tied to the message-id. If it was seen again later (same user or a different user) it was much more likely to be flagged as spam.

The benefits here are, frankly, entirely about getting emails to thread correctly in some mail clients, at the expense of deliverability.

The current topic/#{topic_id}/#{post_id}.s#{sender_user_id}r#{receiver_user_id} at least makes it consistent for a user within their own inbox. The assumption

My biggest concern is deliverability: it is hard enough to get email delivered when there is no visibility from the major providers.

But I do see a strong argument for making Discourse behave more like mailing list software when in mailing list mode. @martin I believe we do not customise the message body in mailing list mode? Do you think it makes sense to take a stricter approach to preserving and reusing message-ids in mailing list mode?

sam · 27 July 2022, 9:22 PM

I do not want to be in a position where perfect is the enemy of good enough.

github.com/discourse/discourse
lib/email/message_id_service.rb
368251347

          "<topic/#{topic.id}.#{random_suffix}@#{host}>"

We are using a “random suffix” in messages now, and that is undoubtedly causing pain.

We have 3 options on the table:

(1) Random message-ids that cannot be referenced
(2) Stable message-ids per topic/post/user
(3) Stable message-ids per topic/post pair

We are currently on planet (1), which is causing havoc.

I am worried we will land in decision paralysis between (2) and (3).

Perhaps we simply start with (2), acknowledging that adding extra CCs to an email from Discourse may cause unexpected behaviour, and at least stop the majority of the pain here?

supermathie · 27 July 2022, 10:27 PM

Ah! I thought we were already doing: topic/#{topic_id}/#{post_id}.s#{sender_user_id}r#{receiver_user_id}

In the interest of balancing email uniqueness and deliverability concerns against mailing list mode concerns, I would lean towards doing (2) with mailing list mode disabled and (3) with mailing list mode enabled.

Similarly, with the References header, I would lean towards leaving it absent for post #1 in a topic, and otherwise referencing it (so topic/#{topic_id}) and the post being replied to, if any.
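
A small sketch of that split, assuming option (2) means a stable id per topic/post/user and option (3) a stable id per topic/post pair. The method name, keyword arguments, `.r` suffix and `forum.example` host are all hypothetical, not the format Discourse actually adopted:

```ruby
# Hypothetical helper; the argument names and id formats are assumptions.
def outbound_message_id(topic_id:, post_id:, receiver_id:, host:, mailing_list_mode:)
  local = "topic/#{topic_id}/#{post_id}"
  # Option (2): add the receiver so each user's copy has its own stable id.
  # Option (3): in mailing list mode every recipient sees the same id.
  local += ".r#{receiver_id}" unless mailing_list_mode
  "<#{local}@#{host}>"
end

common = { topic_id: 100, post_id: 23, host: "forum.example" }
list_a = outbound_message_id(**common, receiver_id: 7333, mailing_list_mode: true)
list_b = outbound_message_id(**common, receiver_id: 7334, mailing_list_mode: true)
solo_a = outbound_message_id(**common, receiver_id: 7333, mailing_list_mode: false)
solo_b = outbound_message_id(**common, receiver_id: 7334, mailing_list_mode: false)
```

Either way the id is a pure function of stable values, so it can be recomputed when building In-Reply-To and References for later posts, which the random suffix of option (1) does not allow.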
