CommonMark testing started here!

Regarding CJK, I found a corner case, though I believe it follows from the spec somehow. I was quite surprised when I ran into it.

测试**fun()**可以

测试fun可以

测试fun()可以
测试
fun
可以

I would rather expect:

测试fun()可以

测试fun可以

测试fun()可以
测试fun可以

The test code is:

测试**fun()**可以

测试**fun**可以

测试**fun()**可以
测试**fun**可以

We are behaving according to spec here:

https://johnmacfarlane.net/babelmark2/?text=>+测试**fun()**可以 >+ >+测试**fun**可以 >+ >+测试**fun()**可以 >+测试**fun**可以

Have you tried opening an issue about this on talk.commonmark.org?


:email, class_name: is not an emoji…

 :email, class_name:  is not an emoji... 

FYI, I fixed all the reported edge cases and deployed the fixes.


Sure, I posted a question there just now. For the record, it’s here: “`测试**foo**可以` works but `测试**foo()**可以` fails” (Spec, CommonMark Discussion).
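Why the two inputs differ can be sketched with the spec's delimiter-run rules: a `*` run can open emphasis only if it is left-flanking and close it only if it is right-flanking. The Python below is an illustrative sketch, not any parser's actual code. The helper names (`is_whitespace`, `is_punctuation`, `flanking`, `star_runs`) are made up for this example, and treating Unicode general categories P and S as punctuation is an assumption based on recent spec versions.

```python
import re
import unicodedata


def is_whitespace(ch: str) -> bool:
    # The start or end of the line (empty string here) counts as whitespace.
    return ch == "" or ch in "\t\n\f\r" or unicodedata.category(ch) == "Zs"


def is_punctuation(ch: str) -> bool:
    # Assumption: punctuation means Unicode general categories P* and S*.
    return ch != "" and unicodedata.category(ch)[0] in "PS"


def flanking(text: str, start: int, end: int) -> tuple[bool, bool]:
    """Return (left_flanking, right_flanking) for the run text[start:end]."""
    before = text[start - 1] if start > 0 else ""
    after = text[end] if end < len(text) else ""
    left = not is_whitespace(after) and (
        not is_punctuation(after) or is_whitespace(before) or is_punctuation(before)
    )
    right = not is_whitespace(before) and (
        not is_punctuation(before) or is_whitespace(after) or is_punctuation(after)
    )
    return left, right


def star_runs(text: str) -> list[tuple[bool, bool]]:
    """Flanking flags for every run of '*' in a single line of text."""
    return [flanking(text, m.start(), m.end()) for m in re.finditer(r"\*+", text)]


print(star_runs("测试**fun**可以"))    # [(True, True), (True, True)]
print(star_runs("测试**fun()**可以"))  # [(True, True), (True, False)]
```

In the second input, the closing `**` is preceded by `)` (punctuation) and followed by a CJK character that is neither whitespace nor punctuation, so it is not right-flanking and cannot close the emphasis.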


There seems to be a regression with HTML table support since the move to markdown.

The following used to work:

<table>
  <tr>
    <th>Markdown</th>
    <th>BBCode</th>
    <th>HTML</th>
  </tr>
  <tr>
    <td> <pre>
1. Item 1
1. Item 2
 1. Nested Item 1
 1. Nested Item 2
1. Item 3
</pre> </td>
    <td> <pre>
[ol]
  [li]Item 1[/li]
  [li]Item 2[/li]
    [ol]
      [li]Nested item 1[/li]
      [li]Nested item 2[/li]
    [/ol]
  [li]Item 3[/li]
[/ol]
</pre> </td>
    <td>
<pre>
<ol>
  <li>Item 1</li>
  <li>Item 2</li>
    <ol>
      <li>Nested item 1</li>
      <li>Nested item 2</li>
    </ol>
  <li>Item 3</li>
</ol>
</pre> </td>
  </tr>
</table>

However, the HTML is now rendered with very strange spacing (see below). The only solution I could find was to replace each < with &lt; and each > with &gt; so that the sample is not parsed as HTML.

Markdown BBCode HTML
1. Item 1
1. Item 2
 1. Nested Item 1
 1. Nested Item 2
1. Item 3
[ol]
  [li]Item 1[/li]
  [li]Item 2[/li]
    [ol]
      [li]Nested item 1[/li]
      [li]Nested item 2[/li]
    [/ol]
  [li]Item 3[/li]
[/ol]
  1. Item 1
  2. Item 2
    1. Nested item 1
    2. Nested item 2
  3. Item 3
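The workaround mentioned above (replacing `<` with `&lt;` and `>` with `&gt;`) does not have to be done by hand. A minimal sketch in Python, assuming the sample markup is available as a string; the variable name `sample` is only illustrative. The standard library's `html.escape` performs these replacements and also escapes `&`.

```python
import html

# Hypothetical sample: the markup that should be shown literally inside <pre>.
sample = "<ol>\n  <li>Item 1</li>\n</ol>"

# quote=False leaves " and ' alone; <, > and & are escaped.
escaped = html.escape(sample, quote=False)
print(escaped)
```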

An <ol> as a direct child of another <ol> isn’t valid HTML, so it can’t be expected to display correctly.

<ol>: The Ordered List element - HTML: Hypertext Markup Language | MDN

<ol>
  <li>first item</li>
  <li>second item  <!-- closing </li> tag not here! -->
    <ol>
      <li>second item first subitem</li>
      <li>second item second subitem</li>
      <li>second item third subitem</li>
    </ol>
  </li>            <!-- Here's the closing </li> tag -->
  <li>third item</li>
</ol>
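A quick way to spot this kind of nesting problem before posting is sketched below. It is a rough check built on Python's standard `html.parser`, not a replacement for a real validator; the class and function names (`NestedListChecker`, `find_misnested_lists`) are invented for this example, and it only tracks `<ol>`, `<ul>` and `<li>` tags.

```python
from html.parser import HTMLParser

LIST_TAGS = {"ol", "ul"}
TRACKED = LIST_TAGS | {"li"}


class NestedListChecker(HTMLParser):
    """Reports lists that are direct children of another list."""

    def __init__(self):
        super().__init__()
        self.stack = []     # currently open <ol>/<ul>/<li> tags
        self.problems = []  # human-readable findings

    def handle_starttag(self, tag, attrs):
        if tag not in TRACKED:
            return
        if tag in LIST_TAGS and self.stack and self.stack[-1] in LIST_TAGS:
            line, _ = self.getpos()
            self.problems.append(
                f"line {line}: <{tag}> is a direct child of <{self.stack[-1]}>"
            )
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in TRACKED and tag in self.stack:
            # Pop back to (and including) the matching open tag.
            while self.stack.pop() != tag:
                pass


def find_misnested_lists(markup: str) -> list[str]:
    checker = NestedListChecker()
    checker.feed(markup)
    checker.close()
    return checker.problems


bad = "<ol>\n<li>Item 1</li>\n<ol>\n<li>Nested item 1</li>\n</ol>\n<li>Item 2</li>\n</ol>"
good = "<ol>\n<li>Item 1\n<ol>\n<li>Nested item 1</li>\n</ol>\n</li>\n<li>Item 2</li>\n</ol>"

print(find_misnested_lists(bad))   # one finding, for the inner <ol>
print(find_misnested_lists(good))  # []
```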

Interesting… so the switch to CommonMark fixed the handling of bad HTML? I had the same issue with an unordered list. Do I have bad HTML here too?

Markdown BBCode HTML
* Item 1      - Item 1      + Item 1
* Item 2  or  - Item 2  or  + Item 2
* Item 3      - Item 3      + Item 3
[ul]
  [li]Item 1[/li]
  [li]Item 2[/li]
  [li]Item 3[/li]
[/ul]

  • Item 1
  • Item 2
  • Item 3
<table>
  <tr>
    <th>Markdown</th>
    <th>BBCode</th>
    <th>HTML</th>
  </tr>
  <tr>
    <td> <pre>* Item 1      - Item 1      + Item 1
* Item 2  or  - Item 2  or  + Item 2
* Item 3      - Item 3      + Item 3</pre></td>
    <td> <pre>
[ul]
  [li]Item 1[/li]
  [li]Item 2[/li]
  [li]Item 3[/li]
[/ul]
</pre></td>
<td> <pre>
<ul>
  <li>Item 1</li>
  <li>Item 2</li>
  <li>Item 3</li>
</ul>
</pre></td>
  </tr>
</table>

The <ul> example you posted doesn’t have a nested <ul>. If it did, the same rule would apply, and likewise for <dl>.

Discourse may have done, or may be doing, some correction of markup, but AFAIK it is often the browser that “fixes” it. Some browsers add a tag where they think one should be, others remove tags where they think one should not be, others ignore tags, and so on. Testing pages in multiple browsers can reveal “it looks good in browsers x and y, but bad in browser z” problems, but IMHO the W3C validator at https://validator.w3.org/ is one of the best ways, if not the best, to ensure consistency.


Is this rendering correctly:

<strike>some text </strike> some more text

It comes out like this:
some text some more text

I would expect it to render like this:
some text some more text

Why? That is how HTML renders it.

Okay, if that is so, I rest my case.

I have an HTML-related issue. My Discourse also serves as a commenting system and imports HTML via an RSS feed.


<div><div>
      <p>blabla</p>
        
      
    </div></div>

<hr>
<small>This is a companion discussion topic for the original entry at <a href="#">url</a></small>

The code above is rendered in the forum as shown in the preview below. The closing divs are interpreted as code instead of HTML. Is this no longer supported by the new Markdown engine, or is this a bug?

Preview:

blabla

</div></div>

This is a companion discussion topic for the original entry at url

SPACESPACESPACESPACEAnything

is a code block per the CommonMark spec, and it has been like this for many, many years.

However, it is not safe to assume that HTML content pulled in via an RSS feed will never use such a combination. So couldn’t Discourse strip redundant spaces from the HTML before posting the content into a topic?
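What such a stripping step might look like is sketched below. This is a hypothetical preprocessing function, not anything Discourse provides, and it assumes the fragment contains no `<pre>` or other whitespace-sensitive content. It removes leading indentation and whitespace-only lines, since a blank line ends the HTML block and a following line indented by four or more spaces then becomes an indented code block.

```python
def flatten_html_fragment(fragment: str) -> str:
    """Strip indentation and drop whitespace-only lines.

    Assumption: the fragment has no <pre> or similar content where
    whitespace is significant.
    """
    stripped = (line.strip() for line in fragment.splitlines())
    return "\n".join(line for line in stripped if line)


# The fragment from the feed, with a blank line followed by an indented </div>.
feed_html = "<div><div>\n      <p>blabla</p>\n        \n      \n    </div></div>"

print(flatten_html_fragment(feed_html))
# <div><div>
# <p>blabla</p>
# </div></div>
```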

It depends on whether the RSS feed is HTML or Markdown. We have a component that converts HTML to Markdown for incoming email, and it could possibly be leveraged here as an option. But I do not think this belongs in this discussion. Instead, open a new topic with detailed information about the problem and suggested solutions.


How would Discourse know if those extra spaces were there on purpose or because some HTML generator just loves spaces?

Can you fix the thing generating the HTML so that it does not put four spaces at the beginning of a line? That seems like the right place to fix it.