Missing space characters in activity summary email text

I’ve just sent myself some preview summaries, since I don’t normally get them.

I’m seeing some random—but consistent—missing space characters in headings and text within the email. These spaces are not missing in the forum content, but the same ones are consistently dropped in multiple generated preview summaries, as viewed in various email clients.

I’ve tried deleting and re-adding the original spaces to no effect.

Excerpts:

[Image: digest_missing_spaces]

I’ve reviewed the summaries I receive from some other Discourse forums, and I don’t see this anywhere else.

Has anyone else seen this, or have an idea what’s happening?

Is it possible this could be a font / display problem? Have you checked the underlying raw content?

Hm. I’m not sure how to diagnose a font/display problem. The emails render the same in multiple mail clients and browsers across Windows and Linux.

I’ve appended .json to the forum post URLs, and there’s nothing off about the “topic_slug” or “cooked” content…

Is there something else I could be checking for in the raw content?
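
One thing that can be checked in the raw content is whether a look-alike character is standing in for an ordinary space. A minimal Python sketch, assuming the post text has been copied into a string; the function name `unusual_whitespace` is made up for illustration:

```python
import unicodedata


def unusual_whitespace(text: str) -> list[tuple[int, str]]:
    """Report characters that look like (or hide as) spaces but are not U+0020."""
    hits = []
    for index, ch in enumerate(text):
        # Zs = space separators (e.g. NO-BREAK SPACE);
        # Cf = invisible format characters (e.g. ZERO WIDTH SPACE).
        if ch != " " and unicodedata.category(ch) in ("Zs", "Cf"):
            hits.append((index, unicodedata.name(ch, "UNNAMED")))
    return hits


print(unusual_whitespace("Rendering\u00a0of web pages"))
```

An empty list means every space-like character in the text is a plain ASCII space, which would point away from the post content as the cause.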

You’ll need to check the raw email rather than the post.
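
For anyone following along, a raw email saved as a file can be split into its decoded plain-text and HTML parts with Python's standard `email` package. This is only a sketch; the function name `extract_parts` and the sample message are made up for illustration and are not real Discourse output:

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def extract_parts(raw_bytes: bytes) -> dict[str, str]:
    """Return the decoded text/plain and text/html bodies of a raw email."""
    msg = message_from_bytes(raw_bytes, policy=policy.default)
    parts = {}
    for kind in ("plain", "html"):
        body = msg.get_body(preferencelist=(kind,))
        if body is not None:
            # get_content() undoes the transfer encoding and the charset.
            parts[kind] = body.get_content()
    return parts


# Illustrative stand-in for a saved summary email.
sample = EmailMessage()
sample["Subject"] = "Summary"
sample.set_content("Rendering of web pages")
sample.add_alternative("<strong>starts to overwhelm</strong>", subtype="html")

parts = extract_parts(bytes(sample))
print(parts["plain"])
print(parts["html"])
```

Looking at both parts separately matters here because the text and HTML versions of one message are encoded independently and may not show the same damage.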

OK, I looked at the raw email, and where the HTML version is missing spaces, the text version has the correct spaces. However, the text version is missing other space characters. There’s no rhyme or reason to it.

Maybe it could be a character-encoding glitch in the course of copy/pasting the affected topics from a legacy platform…? EDIT: Nope. It’s continuing with current posts, and also with other emails—not just the summary.

More recent Discourse summaries featuring current posts are not having the same issue, so I won’t worry much unless I see it continue. EDIT: It’s continuing.

(Side note: just to watch for this stuff, I now wish I could force a comprehensive summary to be sent to my admin account daily, regardless of my being constantly logged in.)

Could you forward one of those emails to me as an attachment?

EDIT: done

OK, here’s what I see in the raw text email:

[Mis/Disinformation starts to overwhelm civilization][2]

 The dark side of generative AI is that it enables the production of misinformation (due to confabulation) and disinformation (i.e. deliberate production of fake news to achieve an end) at industrial scale.   Renderingof web pages in the style of authoritative sources is straighforward, and progress in deep fakes will make video story complements easier.  Asidefrom Vinge’s clouds of fake information to hide information (Rainbows End), which I don’t believe posed a solution, have any SF authors thought about this and how it might be tackled?

note:

  • to overwhelm :white_check_mark:
  • a paperback :white_check_mark:
  • Renderingof :x:
  • Asidefrom :x:

and in the HTML version:

<a href="https://forum.tasat.org/t/mis-disinformation-starts-to-overwhelm-civilization/66" style="text-decoration: none; font-weight: bold; color: #006699;; font-weight:400;line-height:1.3;margin:0;padding:0;text-decoration:none">
<strong>Mis/Disinformation starts tooverwhelm civilization</strong>
…
Rendering of web pages in the style of authoritative sources is straighforward, and progress in deep fakes will make video story complements easier.  Aside from

note:

  • tooverwhelm :x:
  • apaperback :x:
  • Rendering of :white_check_mark:
  • Aside from :white_check_mark:

In the rawest (i.e. encoded) form, these errors are still there:

[Mis/Disinformation starts to overwhelm civilization][2]
            
 The dark = 
side of generative AI is that it enables the production of misinformation (=
due to confabulation) and disinformation (i.e. deliberate production of fak=
e news to achieve an end) at industrial scale.   Renderingof web pages in t=
he style of authoritative sources is straighforward, and progress in deep f=
akes will make video story complements easier.  Asidefrom Vinge=E2=80=99s c=
louds of fake information to hide information (Rainbows End), which I don=
=E2=80=99t believe posed a solution, have any SF authors thought about this=
 and how it might be tackled?
Ken
                                  <a href=3D"https://foru=
m.tasat.org/t/mis-disinformation-starts-to-overwhelm-civilization/66" style=
=3D"text-decoration: none; font-weight: bold; color: #006699;; font-weight:=
400;line-height:1.3;margin:0;padding:0;text-decoration:none">
           =
                         <strong>Mis/Disinformation starts tooverwhelm civi=
lization</strong>

These errors are not in the raw/cooked post content:

000000d0: 5265 6e64 6572 696e 6720 6f66 2077 6562  Rendering of web
000000e0: 2070 6167 6573 2069 6e20 7468 6520 7374   pages in the st
000000f0: 796c 6520 6f66 2061 7574 686f 7269 7461  yle of authorita
00000100: 7469 7665 2073 6f75 7263 6573 2069 7320  tive sources is 
00000110: 7374 7261 6967 6866 6f72 7761 7264 2c20  straighforward, 
00000120: 616e 6420 7072 6f67 7265 7373 2069 6e20  and progress in 
00000130: 6465 6570 2066 616b 6573 2077 696c 6c20  deep fakes will 
00000140: 6d61 6b65 2076 6964 656f 2073 746f 7279  make video story
00000150: 2063 6f6d 706c 656d 656e 7473 2065 6173   complements eas
00000160: 6965 722e 2020 4173 6964 6520 6672 6f6d  ier.  Aside from

Not that I didn’t believe you :smiley:

So… spaces are being occasionally removed from the email body, be it the text part or the HTML part. And not in the same places!
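
This kind of comparison can be automated. The Python sketch below decodes a quoted-printable fragment with the standard `quopri` module, then lists words in the received text that are two adjacent words of the original run together. The function names are made up for illustration, and the word-by-word walk is deliberately naive: it assumes the two texts differ only by dropped spaces.

```python
import quopri


def decode_qp(raw: str) -> str:
    """Decode quoted-printable text, joining soft line breaks ("=" at end of line)."""
    return quopri.decodestring(raw.encode("ascii")).decode("utf-8")


def find_fused_words(original: str, received: str) -> list[str]:
    """List received words that equal two consecutive original words joined."""
    orig_words = original.split()
    fused = []
    i = 0
    for word in received.split():
        if i < len(orig_words) and word == orig_words[i]:
            i += 1
        elif i + 1 < len(orig_words) and word == orig_words[i] + orig_words[i + 1]:
            fused.append(word)
            i += 2
        else:
            i += 1  # some other difference; skip one word and carry on
    return fused


original = "at industrial scale.   Rendering of web pages in the style of authoritative sources"
received_qp = "at industrial scale.   Renderingof web pages in t=\nhe style of authoritative sources"

print(find_fused_words(original, decode_qp(received_qp)))
```

Decoding first matters: the `=` at the end of each encoded line is only a soft line break, so a word split across two lines (like `t=` / `he`) is not itself an error.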

I posit that these errors could have been introduced at one of four places:

  • in Discourse, generating the email
  • transmitting the email to the email submission server
  • transmitting the email to an intermediate/end server
  • delivering to the user mailbox

It’s probably easiest to start at the beginning.

Can you have Discourse submit mail to a local MTA where you can inspect it in the queue prior to the MTA sending it off to your “actual” email delivery server?

Thanks for the analysis, Michael!

I’m not an advanced email admin – I’m running the typical recommended self-installation, with actual outbound email via MailerSend.net, and have carefully configured DKIM/DMARC etc. to a working state. From what I’m reading, incorporating a local MTA like sendmail or Postfix is an advanced move that’s discouraged in most cases… I’m a little apprehensive about mucking around and possibly disrupting a functioning pipeline. :grimacing:

Is there an easily revertible, troubleshooting kind of MTA implementation I might consider?

As noted in edits above, this problem continues with current user-posted content, not just admin cut-&-pasted content — and is now observed with summary, user_replied, and user_posted emails.

MailerSend support confirmed the space characters are missing when they receive the request from Discourse — so it would seem the error is with Discourse generation of the email…?

FWIW, space characters are not missing when previewing a generated summary — only when they’re received as emails.


At the same time, I’m having this issue with summary emails, reported by others starting in February:

These repeated posts are present in the generated summary previews.

EDIT 2024-04-26: the repeated summary issue has been identified. Pending a fix, I’ve mitigated the problem via settings changes, but it doesn’t seem to relate to this topic. Outgoing emails still have missing space characters.


I’ve done a command-line update & rebuild to see if that would flush out any kinks, but it had no effect.

If these things aren’t happening to everyone who’s up-to-date on the tests-passed branch, what might I look into in my setup?

If you can temporarily disable TLS between your server and MailerSend, you can inspect the actual traffic on the wire. That will show whether Discourse is sending the spaces or not, settling this question once and for all.

If you can’t, you could try MITMing the traffic, but that’s more complicated.

If neither of the above works, I would configure a local Postfix instance, not for direct delivery, but to relay its mail to your MailerSend account the same way Discourse does.

That way you can have Discourse send via either method and you can inspect the mail in the postfix queue before it goes out.

Thanks Michael – I’m new at “inspecting on the wire” but here’s what I’ve found.

MailerSend requires TLS and port 587. So:

  • I created an alternate app.yml to send to a free mailtrap.io account on port 2525
  • set DISCOURSE_SMTP_ENABLE_START_TLS = false
  • applied the change with:
      cd /var/discourse
      ./launcher destroy app
      ./launcher start app
  • set up Wireshark to monitor remote traffic via tcpdump

The email content packets in Wireshark and the unencrypted emails received at Mailtrap do not, so far, have any missing space characters. Specific test digests run back-to-back with each config have missing spaces with my original config and not with the mailtrap version. Could this indicate that the problem is introduced with the TLS encryption?

EDIT: It occurred to me that I didn’t fully utilize the Mailtrap testing setup. I’ve since run several encrypted preview summaries to Mailtrap — on port 587 with TLS enabled — and have not seen any dropped space characters. I’m now thinking that despite MailerSend telling me the issues were present in the received requests, it maaaay be happening on their end after all? Not sure what to have them look for, but I plan to run these findings by them.

(Just in case it helps: I had a quick look at my setup, and didn’t see a problem. So I’d wonder if you have some theme or plugin which affects your setup. What I did was to visit mail-tester.com to get a temporary destination, then use Admin->Emails->Preview Summary to send a summary to the temporary destination, and then click through on mail-tester to view the HTML and plain versions. It might be worth trying the same tactic to see if anything is any different for you.)

Thanks, Ed – to get to mail-tester, my emails would have to go through my MailerSend relay, which is what I was trying to take out of the chain. But your comment did prompt me to return to Mailtrap and run tests with TLS encryption, and I’ve edited my previous post.

I’m finding this likely as well.

For a solid test, I would next take one of the plaintext emails you captured and submit it by hand via your MailerSend account using openssl s_client.
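
If driving the SMTP dialogue by hand in `openssl s_client` proves fiddly, roughly the same test can be scripted with Python's standard `smtplib`, which passes a byte string through without re-encoding it. This is a sketch under assumptions: the host, port, credentials, addresses, and file name are placeholders to be replaced with the values for your own relay account, and `normalize_crlf` and `submit_raw` are names invented here.

```python
import re
import smtplib


def normalize_crlf(raw: bytes) -> bytes:
    """Convert bare LF line endings to the CRLF that SMTP expects."""
    return re.sub(rb"\r?\n", b"\r\n", raw)


def submit_raw(raw: bytes, sender: str, recipient: str,
               host: str, port: int, user: str, password: str) -> None:
    """Submit a captured message unchanged over a STARTTLS connection."""
    with smtplib.SMTP(host, port, timeout=30) as smtp:
        smtp.starttls()
        smtp.login(user, password)
        smtp.sendmail(sender, [recipient], normalize_crlf(raw))


# Hypothetical usage; every value below is a placeholder:
# with open("captured.eml", "rb") as fh:
#     submit_raw(fh.read(), "forum@example.com", "me@example.com",
#                "smtp.example.com", 587, "smtp-user", "smtp-password")
```

Because the payload is a known-good capture, any spaces missing on arrival would have been lost after submission rather than in Discourse.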