Why does markdown do this? (whitespace trimming and other features)

Apologies for the vague title,

I’m in the process of modifying some of the Markdown functionality on Discourse to better suit the needs of my community, which is coming over from a MyBB forum.

Discourse is my first exposure to Markdown, and I want to take the approach of “don’t tear down a fence if you don’t know why it’s there.” However, I am having difficulty finding out why Markdown takes certain actions, and I would love some resources to help me understand (especially if there are security concerns I could be overlooking).

The response in this topic provided some good general references including some early discussion and a (now defunct) plugin:

And this post has been incredibly helpful on the development side of things:

But, again, I’m looking to understand why certain features exist, so I can best evaluate removing or tweaking them.

The current features in my crosshairs are:

  1. Collapsing consecutive line breaks unless <br> or other markup is used.
  2. Automatic creation of code blocks on lines with 4 or more leading spaces.
  3. Trimming leading whitespace on a new line.
  4. Turning any sequence of numbers into a numbered list in consecutive increasing order.

Any guidance is appreciated! Thanks!

4 Likes

I’m very interested in this topic, but I don’t have the knowledge to answer most of your questions.

However,

I think Discourse wants to keep posts clean and easy to read.
By removing extra line breaks and trimming spaces at the beginning of lines, it ensures posts look consistent.
This helps especially when users, sometimes even without realizing (I know some!), add random spaces or lines which could make their content harder to read.

It may look weird to force this, especially when you actually want a gap in your list, but my guess is that it’s to keep numbered lists as simple as possible, with no need to re-number every element when you add a new one to your post (or in a wiki post).

5 Likes

Actually, keeping everything really tight can make things harder to read. A paragraph right after an image is one example.

But… I’m using <br /> so no biggie at all.

Another but… I’m quite sure that the cleanup has nothing to do with a cleaner reading experience. It comes from the code itself. It’s similar to the very tired claim that Markdown is easier to read at the “code level”. The end user is never in that situation.

Well. This is pure meta now.

But I refuse to believe that removing lines and extra spaces actually makes the overall experience better for the end user.

2 Likes

I’m not sure if this will solve a problem for you, but Discourse has a traditional markdown linebreaks site setting that is disabled by default. When enabled, two trailing spaces are required to create a line break.
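As a rough illustration of the difference between the two modes, here is a minimal Python sketch. It is not Discourse's or markdown-it's actual code; the function name `render_paragraph` and its `traditional` flag are made up for this example, and it handles only a single paragraph of plain text.

```python
import html


def render_paragraph(text: str, traditional: bool) -> str:
    """Render one paragraph to HTML under either line-break mode.

    Hypothetical helper for illustration only:
    - traditional=True:  a newline becomes <br> only after two trailing spaces.
    - traditional=False: every newline inside the paragraph becomes <br>.
    """
    lines = text.split("\n")
    out = []
    for i, line in enumerate(lines):
        is_last = i == len(lines) - 1
        hard_break = line.endswith("  ")  # two trailing spaces
        # Surrounding whitespace is dropped from the rendered line.
        content = html.escape(line.strip())
        if is_last:
            out.append(content)
        elif not traditional or hard_break:
            out.append(content + "<br>\n")
        else:
            # Soft break: the newline stays in the HTML source,
            # where a browser displays it as a single space.
            out.append(content + "\n")
    return "<p>" + "".join(out) + "</p>"


print(render_paragraph("roses are red\nviolets are blue", traditional=False))
print(render_paragraph("roses are red\nviolets are blue", traditional=True))
print(render_paragraph("roses are red  \nviolets are blue", traditional=True))
```

Only the second call produces output without a `<br>`, which is why the same post can look like one run-on line under the traditional setting.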

Here’s an example with that setting enabled:

Here’s an example with the setting disabled (the default value of the setting):

It might be worth looking at this page and trying its 10-minute Markdown tutorial: Markdown Reference. The Discourse new user tutorial links to that page, but it likely gets overlooked by a lot of users who take the tutorial:

5 Likes

Definitely makes ascii art a challenge to display properly. :wink:

I really appreciate the insight. My community has a large forum game/storytelling sub-community where freedom in formatting tends to be worth more than consistency. If the core reason for these features is really to help maintain a consistent and simplified look, it is probably safe to ax them for my use case, or at least to give them a user-selectable toggle.

This is a very helpful tutorial, and its emphasis on “making beautiful text” does seem to follow the ideas of consistency mentioned earlier. Thanks for pointing it out.

5 Likes

In doing some extra reading about Markdown, I have found this site particularly enlightening.

This is, I think, the original description of Markdown, as cited by the CommonMark team, and it reveals that a core reason Markdown does what it does is its proximity to HTML.

For instance the first sentence there is:

Markdown is a text-to-HTML conversion tool for web writers.

And I think that some of these quirks of Markdown may be due less to their organizational properties and more to a desire to cleanly convert the text to the corresponding HTML code.

This would explain why whitespace is trimmed: HTML itself collapses runs of whitespace when it is rendered.
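To make the HTML side of this concrete, here is a small Python sketch that approximates what a browser does to text in normal flow (the CSS default `white-space: normal`): runs of spaces, tabs, and newlines are displayed as a single space. The function name `collapse_whitespace` is made up for the example, and it ignores edge cases such as non-breaking spaces.

```python
import re


def collapse_whitespace(text: str) -> str:
    """Approximate how a browser displays text under `white-space: normal`.

    Illustrative helper only: each run of spaces, tabs, and newlines is
    collapsed to one space, and leading/trailing whitespace is dropped.
    """
    return re.sub(r"[ \t\r\n]+", " ", text).strip()


stick_figure = "   o\n  /|\\\n  / \\"
print(collapse_whitespace(stick_figure))  # the figure is flattened onto one line
```

This is also why ASCII art only survives inside elements that preserve whitespace, such as `<pre>`, which is what a Markdown code block is rendered to.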

Also, the reason for turning any sequence of numbers into an increasing numbered list becomes a little clearer with:

https://daringfireball.net/projects/markdown/syntax#list

So Markdown doesn’t even bother looking at the numbers, because ordered lists in HTML don’t store a number on each item; the browser numbers the items itself. (The actual code in markdown-it does check the number on the first item, but only to start the numbering from that value.)

I think there may be more to it beyond this (such as formatting consistency across multiple devices), but some of the quirkier aspects do seem very HTML-inspired.
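Here is a minimal Python sketch of that behavior, assuming the simplest case of a flat list with one line per item. It is not markdown-it's code; `render_ordered_list` and the `ITEM` pattern are names invented for this example. Only the first item's number is used (as the `start` attribute), and every later number is discarded.

```python
import html
import re

# Hypothetical pattern for a flat list item: up to 3 leading spaces,
# a number, "." or ")", whitespace, then the item text.
ITEM = re.compile(r"^\s{0,3}(\d{1,9})[.)]\s+(.*)$")


def render_ordered_list(lines: list[str]) -> str:
    """Convert numbered lines to an HTML ordered list (illustrative sketch)."""
    items = []
    start = None
    for line in lines:
        match = ITEM.match(line)
        if match is None:
            continue  # sketch only: silently skip lines that are not list items
        if start is None:
            start = int(match.group(1))  # only the first number matters
        items.append(html.escape(match.group(2)))
    if not items:
        return ""
    # HTML numbers the items itself; a start attribute is needed only
    # when the list does not begin at 1.
    attr = "" if start == 1 else f' start="{start}"'
    body = "".join(f"<li>{item}</li>\n" for item in items)
    return f"<ol{attr}>\n{body}</ol>"


print(render_ordered_list(["3. apples", "7. pears", "7. plums"]))
```

A browser would display that list as 3, 4, 5, because the 7s never reach the HTML at all.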

2 Likes