Coincidental Markdown in formatted content pasted to rich text editor rendered on publish

Priority/Severity:

Medium

Platform:

Operating System

  • Windows 11

Browser

  • Google Chrome 141.0.7390.123

Discourse

fb4bd7951aa6ae8c814df702807c12ccb77bd3fd

Description:

The “rich text editor” is intended to provide a “WYSIWYG” experience, where the content seen in the composer is rendered exactly as it will be in the published post.

Text copied from some sources may be stored in the clipboard in a formatted form (text/html type) in addition to plaintext (text/plain).

When text is pasted into the composer, if a formatted data type is present in the clipboard then this data is used instead of the plaintext type.
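
The type preference described above can be sketched as follows. This is only an illustration, not Discourse's actual paste handler: `ClipboardLike`, `PastePayload`, and `pickPastePayload` are hypothetical names, and `ClipboardLike` stands in for the `getData` method of the browser's `DataTransfer` object, which returns an empty string when no data of the requested type is present.

```typescript
// Minimal stand-in for the part of the browser's DataTransfer interface
// that this sketch needs (hypothetical name, for illustration only).
interface ClipboardLike {
  getData(format: string): string;
}

// Hypothetical result type: which clipboard type was chosen, and its data.
interface PastePayload {
  type: "text/html" | "text/plain";
  data: string;
}

// Prefer the formatted (text/html) data when the clipboard has it;
// otherwise fall back to the plaintext data.
function pickPastePayload(clipboard: ClipboardLike): PastePayload {
  const html = clipboard.getData("text/html");
  if (html !== "") {
    return { type: "text/html", data: html };
  }
  return { type: "text/plain", data: clipboard.getData("text/plain") };
}
```

In a browser, something like this would be called with `event.clipboardData` from a `paste` event listener, after checking that it is not `null`.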

Due to the minimal nature of Markdown syntax, it is common for text to contain content that coincidentally happens to resemble Markdown markup.

:bug: The published post is rendered differently from what is shown in the “rich text editor” when pasted formatted content contains certain coincidental Markdown markup.

Reproducible steps:

Coincidental list markup

  1. Create an HTML file with the following content:
    <html>
      <body>
        <br />- foo
      </body>
    </html>
    
  2. Open the file in your web browser.
  3. Copy the content of the web page.
  4. Open the post composer.
  5. Put the composer into the “rich text editor” mode.
  6. Paste the copied content into the composer.
    :slightly_smiling_face: The text was pasted verbatim, rather than being rendered as a list:

    - foo

  7. Publish the post.

:bug: Instead of matching what was seen in the composer, the content has been rendered as an unordered list:

  • foo

Coincidental code block markup

  1. Create an HTML file with the following content:
    <html>
      <body>
        <span style="white-space: pre">    foo</span>
      </body>
    </html>
    
  2. Open the file in your web browser.
  3. Copy the content of the web page.
  4. Open the post composer.
  5. Put the composer into the “rich text editor” mode.
  6. Paste the copied content into the composer.
    :slightly_smiling_face: The text was pasted verbatim, rather than being rendered as a code block:

        foo

  7. Publish the post.

:bug: Instead of matching what was seen in the composer, the content has been rendered as a code block:

foo

Coincidental code block markup w/ coincidental BBCode

  1. Create an HTML file with the following content:
    <html>
      <body>
        <span style="white-space: pre">    [foo]</span>
      </body>
    </html>
    
  2. Open the file in your web browser.
  3. Copy the content of the web page.
  4. Open the post composer.
  5. Put the composer into the “rich text editor” mode.
  6. Paste the copied content into the composer.
    :slightly_smiling_face: The text was pasted verbatim:

        [foo]

  7. Publish the post.

:bug: Instead of matching what was seen in the composer, the content has been rendered as a code block, with backslashes prepended to the brackets:

\[foo\]

Additional context:

I think it is correct for the rich text editor to ignore apparent Markdown markup present in pasted content that has the “text/html” type. Any intentional formatting in such content will be defined by HTML tags, so content that resembles Markdown is most likely coincidental rather than true markup. So the fault here is that the coincidental Markdown syntax is rendered on publish, not that it is left unrendered in the composer.
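
To illustrate the kind of escaping that would keep the published post consistent with the composer, here is a rough sketch covering only the three cases reported above. It is not Discourse's serializer and is far from exhaustive: `escapeCoincidentalMarkdown` is a hypothetical helper, and replacing leading indentation with non-breaking spaces is just one possible approach, chosen here on the assumption that the renderer follows CommonMark, where only spaces and tabs count as indentation.

```typescript
const NBSP = "\u00a0";

// Hypothetical helper: escape a single line of pasted text so that a
// CommonMark-style renderer displays it literally.
function escapeCoincidentalMarkdown(line: string): string {
  // Split the line into its leading spaces/tabs and the rest.
  const rest = line.replace(/^[ \t]+/, "");
  const indentLength = line.length - rest.length;

  // Four or more leading spaces would start an indented code block.
  // Non-breaking spaces keep the visible indentation without that effect.
  // Simplification: a tab is treated as four columns.
  const indent = line
    .slice(0, indentLength)
    .replace(/\t/g, "    ")
    .replace(/ /g, NBSP);

  const escaped = rest
    // Bullet list marker at the start of the line: "- foo" -> "\- foo"
    .replace(/^([-+*])(?=\s|$)/, "\\$1")
    // Ordered list marker at the start of the line: "1. foo" -> "1\. foo"
    .replace(/^(\d{1,9})([.)])(?=\s|$)/, "$1\\$2")
    // Square brackets anywhere in the line: "[foo]" -> "\[foo\]"
    .replace(/[\[\]]/g, "\\$&");

  return indent + escaped;
}
```

With the indentation neutralized, the line is parsed as a paragraph rather than a code block, so the backslash escapes are processed instead of being shown literally (in CommonMark, backslash escapes are not processed inside code blocks).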


I am able to reproduce the fault on try.discourse.org in “safe mode”.