Discourse inserting unwanted <br>

When a user emails a message to our community from Outlook, unwanted line breaks are inserted.
For example, the user sends:

<p class=3D"MsoNormal"><span style=3D"font-size:16.0pt;font-family:&quot;Ti=
mes New Roman&quot;,serif;color:#002060">The Fair Defense Act requires atto=
rneys seeking appointments for juvenile delinquency cases to have averaged =
twelve (12) hours a year of continuing legal education
 courses or other training relating specifically to juvenile law.&nbsp; Thi=
s seminar will satisfy the required annual 12 hours of CLE if you do not al=
ready have them.<o:p></o:p></span></p>

and Discourse inserts a <br> between "education" and "courses".

Or the user sends:

<p class=3D"MsoNormal"><span style=3D"font-size:16.0pt;font-family:&quot;Ti=
mes New Roman&quot;,serif;color:#002060">If you
<i>are</i> a member of TCDLA, you can register here:&nbsp; </span><b><span =
style=3D"font-size:14.0pt;font-family:&quot;Times New Roman&quot;,serif;col=
or:red"><a href=3D"[URL]><span style=3D"color:red">Click here
 to register on TCDLA</span></a></span></b><span style=3D"font-size:16.0pt;=
font-family:&quot;Times New Roman&quot;,serif;color:#002060"><o:p></o:p></s=
pan></p>

and Discourse inserts a <br> between "you" and "<i>are</i>".

When I send the same message from Fastmail, I don’t get the same result. So I’m willing to blame this on Outlook, but before I do, is this a Discourse bug?

Your examples show the text on separate lines, so I’d expect it to insert a <br /> there?

Yes, it appears that Outlook is putting in a line break that an HTML browser doesn’t interpret as a line break (so that if you send or receive the email there is no line break there) but Discourse does.

It looks like more than that is happening. I assume the "3D"s are hex for the equals sign, which would be redundant and AFAIK likely to cause breakage. And the quotation marks around the font family are &quot; entities instead of literal quotes. Then there is an empty namespaced p element (<o:p></o:p>), which I’m guessing is there for default spacing instead of using CSS to create spacing, and so on.

Maybe the Outlook email client knows how to deal with that mess, but I don’t think it should be expected that anything else would be able to.

Do you know the exact version of Outlook (and platform) that is mangling our emails? When we send mail out, we are super careful to have valid HTML.

I read it as relating to inbound email: Discourse is interpreting the CRLF that Outlook adds to HTML emails as a <br />.

HTML treats a line break inside text as ordinary whitespace, so the text between tags renders as contiguous. It appears that some kind of plaintext-to-HTML conversion is happening before the HTML is rendered.
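
To make the distinction concrete, here is a minimal sketch in Python. It assumes the `=3D` sequences and the trailing `=` at line ends in the quoted source are quoted-printable transfer encoding, where `=3D` stands for `=` and a trailing `=` is a soft line break. The sample bytes are a shortened, hypothetical excerpt modelled on the first example, and the newline-to-`<br>` step is only a guess at the kind of conversion that would produce the symptom, not Discourse's actual code.

```python
import quopri
import re

# Shortened, hypothetical excerpt modelled on the first example in this thread.
# A trailing "=" is a quoted-printable soft line break; "=3D" encodes "=".
raw = (
    b'<p class=3D"MsoNormal">have averaged =\n'
    b'twelve (12) hours a year of continuing legal education\n'
    b' courses or other training.&nbsp; Thi=\n'
    b's seminar</p>'
)

# Step 1: undo the transfer encoding. Soft breaks vanish, hard newlines stay.
decoded = quopri.decodestring(raw).decode("utf-8")

# Step 2a: roughly what a browser does with the text: runs of whitespace,
# newlines included, render as a single space.
as_browser = re.sub(r"\s+", " ", decoded)

# Step 2b: an ASSUMED plaintext-style pass that turns each newline into <br>.
# This is a guess at the kind of step that would cause the symptom,
# not Discourse's actual code.
naive = decoded.replace("\n", "<br>\n")

print(decoded)
print(as_browser)
print(naive)
```

After decoding, the only newline left is the hard one between "education" and " courses". The whitespace-collapsing version reads as one sentence, while the naive version puts a `<br>` exactly where the unwanted break was reported.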

@zogstrip any ideas here?

@Mark_W_Bennett can you ask your user to check the following setting:

File > Options > Advanced
International Options > Preferred Encoding for Outgoing Messages

Is it set to Unicode (UTF-8)?

If not can you ask them to change it to UTF-8 and re-test? Outlook may need to be restarted after they change that setting.

@Mark_W_Bennett Have you tried enabling the “traditional markdown linebreaks” site setting? That might help.

@Stephen How is encoding related to the issue discussed here?

The user reports that it is set to UTF-8.

@zogstrip I have not, but I will.

It’s a long-standing problem with Outlook (best part of a decade): if the client isn’t set to UTF-8, it will add additional line breaks to a message. It plagued us back in the days of Lotus, which both ignored the 77-column wrap and misinterpreted those breaks within messages.
