Help us test HTML pasting

We’ve made a big update to HTML pasting.

It is still an experimental feature. To use it on your Discourse instance, you have to enable it manually via the site setting enable_rich_text_paste.

Available pasting options

Headings

h1 Heading 😎

h2 Heading

h3 Heading

h4 Heading

h5 Heading

h6 Heading

Emphasis

This is bold text

This is bold+italic text

This is italic text

This is italic text

Strikethrough

Blockquotes

Blockquotes can also be nested…

…by using additional greater-than signs right next to each other…

…or with spaces between arrows.

Lists

Unordered

  • Create a list by starting a line with +, -, or *

  • Sub-lists are made by indenting 2 spaces:

    • Marker character change forces new list start:

      • Ac tristique libero volutpat at

      • Facilisis in pretium nisl aliquet

      • Nulla volutpat aliquam velit

  • Very easy!

Ordered

  1. Lorem ipsum dolor sit amet

  2. Consectetur adipiscing elit

  3. Integer molestie lorem at massa

  4. You can use sequential numbers…

  5. …or keep all the numbers as 1.

Start numbering with offset:

  1. foo
  2. bar

Code

Inline code

Indented code

// Some comments
line 1 of code
line 2 of code
line 3 of code

Block code “fences”

Sample text here...

Syntax highlighting

var foo = function (bar) {
  return bar++;
};

console.log(foo(5));

Tables

| Option | Description |
| ------ | ----------- |
| data   | path to data files to supply the data that will be passed into templates. |
| engine | engine to be used for processing templates. Handlebars is the default. |
| ext    | extension to be used for dest files. |

Links

link text

Images

image

Allowed HTML tags
  • <ins>
  • <del>
  • <small>
  • <big>

Current Limitations

  • Style attributes (like style="font-weight: bold") are not converted to Markdown.

  • Tables must have at least two rows and two columns. The first row must not have fewer columns than any other row.

  • Complex tables with nested formatting have limited support.

  • Ordered lists (ol tags) copied from MS Word have limited support. They are treated the same as ul tags.

If the HTML to Markdown conversion fails for any reason, it automatically falls back to plain-text pasting.
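
The table limitation and the plain-text fallback described above can be pictured with a minimal sketch. This is not the actual Discourse code: isConvertibleTable and markdownOrPlainText are hypothetical names, and convert stands in for whatever HTML-to-Markdown function is in use.

```javascript
// Hypothetical helper: true when a table (an array of rows, each an array of
// cells) satisfies the limitations listed above.
function isConvertibleTable(rows) {
  if (rows.length < 2) return false; // at least two rows
  const firstRowCols = rows[0].length;
  if (firstRowCols < 2) return false; // at least two columns
  // The first row must not have fewer columns than any other row.
  return rows.every((row) => row.length <= firstRowCols);
}

// Hypothetical paste helper: try the converter, fall back to plain text.
// `convert` is an assumed stand-in for the real HTML-to-Markdown function.
function markdownOrPlainText(html, plainText, convert) {
  try {
    const markdown = convert(html);
    // Treat an empty result as a failed conversion.
    return markdown && markdown.trim() ? markdown : plainText;
  } catch (e) {
    // Any conversion error ends in a plain-text paste.
    return plainText;
  }
}
```

In a paste handler, the two arguments would typically come from the clipboard's text/html and text/plain entries.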

22 Likes

It is enabled here. I also made sure it is enabled at try.discourse.org as well so people can try it out there in the sandbox.


13 Likes

This feature is f*n delightful. Excellent work.

9 Likes

Would be awesome if you can help try breaking it, I am running out of edge cases real quick :upside_down_face:

8 Likes

I aim to please, Sam.

Inline formatting from Google Docs doesn’t work. I hope the great Vinoth can dispel their dark magic. You can copy any bold or italic text from my test document to test this.

Other than that, the only thing that didn’t seem to be working is merged table cells – but I don’t recall whether CommonMark even supports them (Pandoc did, but I couldn’t find a mention of this on the CommonMark site). So this is probably moot.

One thing that I’m not fond of in the current implementation is that it uses “loose” list items by default. I.e., this:

- List item 1

- List item 2

- List item 3

Instead of this:

- List item 1
- List item 2
- List item 3

I strongly prefer the latter over the former, which creates paragraphs in addition to list items. I have arguments in favor of my preference, but I realize that it is just that – a matter of taste/use cases/etc. If it’s not desirable to switch to tight list items by default, is it possible to offer an option to do so?
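
For illustration, turning a loose list into a tight one can be sketched as a post-processing step on the generated Markdown. tightenLists is a hypothetical helper, not part of the converter, and it only handles a single blank line between two list items.

```javascript
// Hypothetical helper: drop blank lines that sit directly between two list items.
function tightenLists(markdown) {
  // Matches "-", "*", "+" bullets and "1." / "1)" ordered markers.
  const isListItem = (line) => /^\s*([-*+]|\d+[.)])\s+/.test(line);
  const lines = markdown.split("\n");
  return lines
    .filter((line, i) => {
      if (line.trim() !== "") return true; // keep every non-blank line
      const prev = lines[i - 1];
      const next = lines[i + 1];
      const betweenItems =
        prev !== undefined && next !== undefined && isListItem(prev) && isListItem(next);
      return !betweenItems; // keep blank lines everywhere else
    })
    .join("\n");
}

console.log(tightenLists("- List item 1\n\n- List item 2\n\n- List item 3"));
```

Blank lines between ordinary paragraphs are left alone, so only the list spacing changes.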

Edit: This.

5 Likes

I agree lists should be tight by default @vinothkannans.

5 Likes

Okay I am going to tighten the list now.

This is because Google Docs uses style attributes instead of inline tags like <b>, <i>, etc. As I mentioned, we don’t support that currently, but it’s doable. We may add support for it once this feature is stable.
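
As a rough sketch of what such support could look like, the snippet below maps inline styles on simple, non-nested span elements to Markdown emphasis. The function names are hypothetical and the regex approach is only for illustration; a real implementation would more likely walk the parsed DOM.

```javascript
// Hypothetical helper: map an inline style string to Markdown emphasis markers.
function styleToMarkers(style) {
  const bold = /font-weight\s*:\s*(bold|[6-9]00)/i.test(style);
  const italic = /font-style\s*:\s*italic/i.test(style);
  return (bold ? "**" : "") + (italic ? "*" : "");
}

// Hypothetical helper: replace <span style="...">text</span> with Markdown.
// Assumes spans contain plain text only (no nested tags).
function convertStyledSpans(html) {
  return html.replace(
    /<span\s+style="([^"]*)"\s*>([^<]*)<\/span>/gi,
    (match, style, text) => {
      const markers = styleToMarkers(style);
      // The markers are all asterisks, so the same string closes the emphasis.
      return markers + text + markers;
    }
  );
}

console.log(convertStyledSpans('<span style="font-weight: bold">bold text</span>'));
```

Spans whose style carries no bold or italic information are reduced to their text content.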

Yes, it is tricky. We don’t support it currently. Let me check it against Markdown first.

5 Likes

I knew there was some sorcery going on there! I didn’t realize that that’s what you were referring to. I would love to see this implemented sometime as copy-pasting from Google Docs is a common use case for us.

The merged table cells were mostly a “what if”, since Sam asked for edge cases. It’s not something that’s ever come up in my use of Discourse. (I had trouble pasting a Google Doc table with merged cells in my CMS yesterday, which is what gave me the idea to try.)

4 Likes

I guess you will have more use cases for HTML pasting as the OP of this topic, so please feel free to report bugs, opinions, and edge-case problems.

Edit: @alehandrof, fixed. Lists are now tight by default.

4 Likes

this is fantabulous - so exciting.

One issue I came across today - I just updated to latest. Previously (last week? two weeks ago? don’t remember), copy/pasting formatted newsletters from google email into discourse posts worked as you’d expect, with all pictures and formatting. Now it just pastes in whatever text there was without any formatting. I tried several and got the same result. For example, the newsletter below which is just one big image with a bit of text at the bottom just pastes the text at the bottom with no image. Copy/pasting from medium.com worked, so I can confirm the feature is working otherwise.

I guess this is what happened. Is it working for other emails (other than newsletters)? I will check in mine.

It does indeed work for bold and italics and a single image attached in an email.

Yes. I am able to reproduce it. Thanks for your report. I am checking it.

3 Likes

This is not an improvement wrt Microsoft Word, which uses classes for things like lists instead of tags like <ul> and <ol>. Why? must ask Microsoft…

It would definitely be a big plus to get Word working… Word and Docs are the two most popular word processors in the world…

EDIT: Check out HTML EDITOR.in: Free Online WYSIWYG HTML and HTML5 Editor which seems to have it working. It converted from complex Word documents quite fine!

Ironically, this version: HTML EDITOR.in: Free Online WYSIWYG HTML Editor which uses TinyMCE chokes on the bulleted lists, indicating custom logic in the original converter.

3 Likes

Good catch! They are also using <![if !supportLists]> for ul and ol tags.

AFAIK both are working fine, except for inline formatting in Google Docs and ol/ul tags in Microsoft Word. At least we don’t have any regressions here. We already fall back to plain-text paste in problematic situations.

Both need some application-specific workarounds, which I will do after fixing all the bugs in the current version.

It is not converting at all. Since it supports HTML, it’s just displaying what it receives. If you look at the source, you will see.

4 Likes

Oh! And I thought it’d be a simple matter of copy-and-paste here… :sweat_smile:

Hello friends,
I’ve noticed some strange behaviour when pasting PDF content. It seems the Markdown formatting commands are missing, but I’ve no clue why; the pasted text is shown with headings.

Adobe Acrobat Pro: Copy

Discourse result: Paste

UPDATE:
Ahhh okay, I understood: It seems the dot (•) under the text is responsible.

This is a plain-text paste only, because Markdown lists use * instead of dots. I am not sure whether PDF files support HTML copy. I will check.
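
As a possible workaround on the plain-text side, leading bullet dots could be rewritten as Markdown list markers. This is a minimal sketch with a hypothetical helper name, assuming each bullet sits at the start of its own line.

```javascript
// Hypothetical helper: turn lines that start with a bullet dot into Markdown list items.
function bulletsToMarkdown(text) {
  return text
    .split("\n")
    // Keep any indentation, swap the bullet character for "* ".
    .map((line) => line.replace(/^(\s*)[•·]\s*/, "$1* "))
    .join("\n");
}

console.log(bulletsToMarkdown("• first point\n• second point"));
```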

1 Like

Unsure if this is within scope but links from Google Sheets aren’t converted. They look like this, as formulas.

=HYPERLINK("https://example.com","My example site")
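
If this were in scope, a formula like the one above could be turned into a Markdown link with a small pattern match. This is only a sketch: hyperlinkFormulaToMarkdown is a hypothetical name, and it assumes both arguments are plain quoted strings without escaped quotes.

```javascript
// Hypothetical helper: convert =HYPERLINK("url","label") into [label](url).
function hyperlinkFormulaToMarkdown(cell) {
  // Accept "," or ";" as the argument separator, since it can vary by locale.
  const pattern = /^=HYPERLINK\(\s*"([^"]*)"\s*[,;]\s*"([^"]*)"\s*\)$/i;
  const match = pattern.exec(cell.trim());
  if (!match) return cell; // leave anything else unchanged
  const [, url, label] = match;
  return `[${label}](${url})`;
}

console.log(hyperlinkFormulaToMarkdown('=HYPERLINK("https://example.com","My example site")'));
```
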
1 Like