HTML/RTF pasting

This is really really nice, an excellent improvement, @vinothkannans great work :heart_eyes:

Oh, wow, @vinothkannans – this is amazing work! :clap::heart_eyes::clap::heart_eyes::clap:

I got better results out of regular webpages than out of Google Docs, which would be a likely source of content for me. I don’t know what witchcraft they use over there.

Here is my test document: Testing Markdown paste - Google Docs

The following are converted:

  • headings
  • paragraphs
  • links
  • lists
  • images

The following are not converted:

  • inline formatting (italics, bold)
  • nested lists
  • tables

Feel free to mangle the document I linked to above to test other cases.

No. Way. This seemed like a pipe dream not very long ago. Having a few more hands on deck is pretty great!

Nice work, @vinothkannans!

Amazing. Nice job. I can’t wait to share the good news with my community. What a gift. :gift_heart:

There is a problem pasting from Word:

Document:

Pasted:

My Header

This is the table I’m talking about:

Header 1

Header 2

Bold

Italics

Yellow BG

Underline

RED

Here’s a list:

Potato

Potato

Potato

Plain Text in clipboard

My Header
This is the table I’m talking about:
Header 1	Header 2
Bold	⓭
Italics	Yellow BG
Underline	RED

Here’s a list:
1.	Potato
2.	Potato
3.	Potato

HTML in clipboard

<html xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:w="urn:schemas-microsoft-com:office:word"
xmlns:x="urn:schemas-microsoft-com:office:excel"
xmlns:m="http://schemas.microsoft.com/office/2004/12/omml"
xmlns="http://www.w3.org/TR/REC-html40">

<head>
...... many lines deleted .......
</head>

<body lang=EN-US style='tab-interval:.5in'>
<!--StartFragment-->

<h1>My Header<o:p></o:p></h1>

<p class=MsoNormal>This is the table I’m talking about:<o:p></o:p></p>

<table class=MsoNormalTable border=0 cellspacing=0 cellpadding=0 width=147
 style='width:110.0pt;border-collapse:collapse;mso-yfti-tbllook:1184;
 mso-padding-alt:0in 5.4pt 0in 5.4pt'>
 <tr style='mso-yfti-irow:0;mso-yfti-firstrow:yes;height:14.25pt'>
  <td width=73 nowrap valign=bottom style='width:55.0pt;border:solid black 1.0pt;
  mso-border-top-alt:1.0pt;mso-border-left-alt:.5pt;mso-border-bottom-alt:1.0pt;
  mso-border-right-alt:.5pt;mso-border-color-alt:black;mso-border-style-alt:
  solid;background:black;padding:0in 5.4pt 0in 5.4pt;height:14.25pt'>
  <p class=MsoNormal style='margin-bottom:0in;margin-bottom:.0001pt;line-height:
  normal'><b><span style='mso-ascii-font-family:Calibri;mso-fareast-font-family:
  "Times New Roman";mso-hansi-font-family:Calibri;mso-bidi-font-family:Calibri;
  color:white'>Header 1<o:p></o:p></span></b></p>
  </td>
  <td width=73 nowrap valign=bottom style='width:55.0pt;border:solid black 1.0pt;
  border-left:none;mso-border-left-alt:solid black .5pt;mso-border-top-alt:
  1.0pt;mso-border-left-alt:.5pt;mso-border-bottom-alt:1.0pt;mso-border-right-alt:
  .5pt;mso-border-color-alt:black;mso-border-style-alt:solid;background:black;
  padding:0in 5.4pt 0in 5.4pt;height:14.25pt'>
  <p class=MsoNormal style='margin-bottom:0in;margin-bottom:.0001pt;line-height:
  normal'><b><span style='mso-ascii-font-family:Calibri;mso-fareast-font-family:
  "Times New Roman";mso-hansi-font-family:Calibri;mso-bidi-font-family:Calibri;
  color:white'>Header 2<o:p></o:p></span></b></p>
  </td>
 </tr>
 <tr style='mso-yfti-irow:1;height:14.25pt'>
  <td width=73 nowrap valign=bottom style='width:55.0pt;border:solid black 1.0pt;
  border-top:none;mso-border-top-alt:solid black .5pt;mso-border-alt:solid black .5pt;
  background:#D9D9D9;padding:0in 5.4pt 0in 5.4pt;height:14.25pt'>
  <p class=MsoNormal style='margin-bottom:0in;margin-bottom:.0001pt;line-height:
  normal'><b><span style='mso-ascii-font-family:Calibri;mso-fareast-font-family:
  "Times New Roman";mso-hansi-font-family:Calibri;mso-bidi-font-family:Calibri;
  color:black'>Bold<o:p></o:p></span></b></p>
  </td>
  <td width=73 nowrap valign=bottom style='width:55.0pt;border-top:none;
  border-left:none;border-bottom:solid black 1.0pt;border-right:solid black 1.0pt;
  mso-border-top-alt:solid black .5pt;mso-border-left-alt:solid black .5pt;
  mso-border-alt:solid black .5pt;background:#D9D9D9;padding:0in 5.4pt 0in 5.4pt;
  height:14.25pt'>
  <p class=MsoNormal style='margin-bottom:0in;margin-bottom:.0001pt;line-height:
  normal'><span style='mso-ascii-font-family:Calibri;mso-fareast-font-family:
  "Times New Roman";mso-hansi-font-family:Calibri;mso-bidi-font-family:Calibri;
  color:black'>⓭<o:p></o:p></span></p>
  </td>
 </tr>
 <tr style='mso-yfti-irow:2;height:14.25pt'>
  <td width=73 nowrap valign=bottom style='width:55.0pt;border:solid black 1.0pt;
  border-top:none;mso-border-top-alt:solid black .5pt;mso-border-alt:solid black .5pt;
  padding:0in 5.4pt 0in 5.4pt;height:14.25pt'>
  <p class=MsoNormal style='margin-bottom:0in;margin-bottom:.0001pt;line-height:
  normal'><i><span style='mso-ascii-font-family:Calibri;mso-fareast-font-family:
  "Times New Roman";mso-hansi-font-family:Calibri;mso-bidi-font-family:Calibri;
  color:black'>Italics<o:p></o:p></span></i></p>
  </td>
  <td width=73 nowrap valign=bottom style='width:55.0pt;border-top:none;
  border-left:none;border-bottom:solid black 1.0pt;border-right:solid black 1.0pt;
  mso-border-top-alt:solid black .5pt;mso-border-left-alt:solid black .5pt;
  mso-border-alt:solid black .5pt;background:yellow;padding:0in 5.4pt 0in 5.4pt;
  height:14.25pt'>
  <p class=MsoNormal style='margin-bottom:0in;margin-bottom:.0001pt;line-height:
  normal'><span style='mso-ascii-font-family:Calibri;mso-fareast-font-family:
  "Times New Roman";mso-hansi-font-family:Calibri;mso-bidi-font-family:Calibri;
  color:black'>Yellow BG<o:p></o:p></span></p>
  </td>
 </tr>
 <tr style='mso-yfti-irow:3;mso-yfti-lastrow:yes;height:14.25pt'>
  <td width=73 nowrap valign=bottom style='width:55.0pt;border:solid black 1.0pt;
  border-top:none;mso-border-top-alt:solid black .5pt;mso-border-alt:solid black .5pt;
  mso-border-bottom-alt:solid black 1.0pt;background:#D9D9D9;padding:0in 5.4pt 0in 5.4pt;
  height:14.25pt'>
  <p class=MsoNormal style='margin-bottom:0in;margin-bottom:.0001pt;line-height:
  normal'><u><span style='mso-ascii-font-family:Calibri;mso-fareast-font-family:
  "Times New Roman";mso-hansi-font-family:Calibri;mso-bidi-font-family:Calibri;
  color:black'>Underline<o:p></o:p></span></u></p>
  </td>
  <td width=73 nowrap valign=bottom style='width:55.0pt;border-top:none;
  border-left:none;border-bottom:solid black 1.0pt;border-right:solid black 1.0pt;
  mso-border-top-alt:solid black .5pt;mso-border-left-alt:solid black .5pt;
  mso-border-alt:solid black .5pt;mso-border-bottom-alt:solid black 1.0pt;
  background:#D9D9D9;padding:0in 5.4pt 0in 5.4pt;height:14.25pt'>
  <p class=MsoNormal style='margin-bottom:0in;margin-bottom:.0001pt;line-height:
  normal'><span style='mso-ascii-font-family:Calibri;mso-fareast-font-family:
  "Times New Roman";mso-hansi-font-family:Calibri;mso-bidi-font-family:Calibri;
  color:red'>RED<o:p></o:p></span></p>
  </td>
 </tr>
</table>

<p class=MsoNormal><o:p>&nbsp;</o:p></p>

<p class=MsoNormal><b style='mso-bidi-font-weight:normal'><i style='mso-bidi-font-style:
normal'>Here’s a list:<o:p></o:p></i></b></p>

<p class=MsoListParagraphCxSpFirst style='text-indent:-.25in;mso-list:l0 level1 lfo1'><![if !supportLists]><span
style='mso-fareast-font-family:Calibri;mso-fareast-theme-font:minor-latin;
mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin'><span
style='mso-list:Ignore'>1.<span style='font:7.0pt "Times New Roman"'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
</span></span></span><![endif]>Potato<o:p></o:p></p>

<p class=MsoListParagraphCxSpMiddle style='text-indent:-.25in;mso-list:l0 level1 lfo1'><![if !supportLists]><span
style='mso-fareast-font-family:Calibri;mso-fareast-theme-font:minor-latin;
mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin'><span
style='mso-list:Ignore'>2.<span style='font:7.0pt "Times New Roman"'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
</span></span></span><![endif]>Potato<o:p></o:p></p>

<p class=MsoListParagraphCxSpLast style='text-indent:-.25in;mso-list:l0 level1 lfo1'><![if !supportLists]><span
style='mso-fareast-font-family:Calibri;mso-fareast-theme-font:minor-latin;
mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin'><span
style='mso-list:Ignore'>3.<span style='font:7.0pt "Times New Roman"'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
</span></span></span><![endif]>Potato<o:p></o:p></p>

<p class=MsoNormal><o:p>&nbsp;</o:p></p>

<!--EndFragment-->
</body>

</html>

So it doesn’t seem to be working because Microsoft Word doesn’t convert bullet/numbered lists to <ol> or <ul> tags.
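To illustrate what handling this could involve, here is a minimal sketch (not the actual Discourse code) that recognises Word's list paragraphs in the clipboard HTML above and turns them into Markdown list lines. It assumes the markup shape shown in this post: `<p class=MsoListParagraph…>` with the marker sitting before `<![endif]>`. The helper names `stripTags` and `wordListsToMarkdown` are made up for this example, and it only returns the list lines, ignoring everything else.

```javascript
// Hypothetical helpers, for illustration only.
function stripTags(html) {
  return html
    .replace(/<[^>]*>/g, "") // drop tags, including <o:p> and <![if ...]>
    .replace(/&nbsp;/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}

function wordListsToMarkdown(html) {
  // Word emits list items as <p class=MsoListParagraph...> instead of <li>.
  const paragraph = /<p\s+class=["']?MsoListParagraph[^>]*>([\s\S]*?)<\/p>/gi;
  const lines = [];
  let match;
  while ((match = paragraph.exec(html)) !== null) {
    // The marker ("1.", a bullet glyph, ...) sits before <![endif]>.
    const parts = match[1].split(/<!\[endif\]>/i);
    if (parts.length < 2) continue;
    const marker = stripTags(parts[0]);
    const text = stripTags(parts.slice(1).join(""));
    // Numeric markers become ordered items, anything else a bullet.
    const ordered = /^\d+[.)]$/.test(marker);
    lines.push(`${ordered ? marker.replace(")", ".") : "-"} ${text}`);
  }
  return lines.join("\n");
}
```

Fed the three `MsoListParagraph` paragraphs from the dump above, this should yield a plain numbered Markdown list. Nested lists (the `level1` part of `mso-list`) are not handled here.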

How is this in scope in any form? Colors, etc, are not Markdown features. If you want that, screenshot and paste an image.

Just threw them in for testing. I believe bold and italics are Markdown.

There is a plugin that will give font colors with [color] codes. Not official, I know, but would be nice.

Also, I’m just pointing out the fact that lists in Word don’t come through, which is definitely in scope. It is not a fault of the system but of Word (it doesn’t translate lists to the appropriate HTML tags), but it still needs handling just because it is Office, you know.

EDIT: The table embedded in the Word document is not converted correctly either, but this is already addressed:

Yeah, I can repro – a list in Word doesn’t come through as a list, neither bulleted nor numbered.

Paragraph

1.      This

2.      Is a

3.      Numbered list

Paragraph

-         This

-         Is a

-         Bulleted list

Paragraph

Would be good to preserve lists from Word pastes, I can agree with that @vinothkannans – the issue is the hidden tab character that Word is inserting here:

image

Okay, I will look into these issues :+1:

How did you get the plain text to paste?

When I paste, I get this:

This is a list

1.      
Potato

2.      
Potato

3.      
Potato

because it is using the HTML version instead of the plain-text version, and Word’s HTML version converts lists to <p> tags.

Pasting the plain-text version in the clipboard will get this:

This is a list

  1. Potato
  2. Potato
  3. Potato

which works fine even with the tab characters.
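As a rough sketch of that choice between clipboard flavours: the function below takes an object shaped like the browser's `event.clipboardData` (its `getData` method is the standard `DataTransfer` API) and falls back to the plain-text version when the HTML looks like it contains Word lists. The function name `choosePasteSource` and the `mso-list:` check are assumptions made for this example, not how the composer actually decides.

```javascript
// Hypothetical heuristic, for illustration only.
function choosePasteSource(clipboardData) {
  const html = clipboardData.getData("text/html");
  const plain = clipboardData.getData("text/plain");

  // No HTML flavour on the clipboard: nothing to convert.
  if (!html) {
    return { type: "text/plain", data: plain };
  }

  // Assumption: Word marks list paragraphs with "mso-list:" in inline
  // styles, and its plain-text flavour already reads as "1.<tab>Potato".
  if (/mso-list:/i.test(html) && plain) {
    return { type: "text/plain", data: plain };
  }

  return { type: "text/html", data: html };
}
```

In a browser this would be called from a `paste` event handler with `event.clipboardData`. The trade-off is that falling back to plain text also drops any bold, italics, or links in the same paste.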

Test 1 (select 3 paragraphs and delete 1 + 3 in reply):

This is a list 1. Potato 2. Potato 3. Potato

Test 2 (select only text in code):

This is a list 1. Potato 2. Potato 3. Potato

Can’t tell if it was always like this but pasting CSS straight from browser devtools seems to be affected.

What it looks like

.select-box-kit.is-expanded .select-kit-body, .select-kit.is-expanded .select-kit-body {

  1. display: -webkit-box;
  2. display: -ms-flexbox;
  3. display: flex;
  4. -webkit-box-orient: vertical;
  5. -webkit-box-direction: normal;
  6. -ms-flex-direction: column;
  7. flex-direction: column;
  8. left: 0;
  9. position: absolute;
  10. top: 0;

}

What it's supposed to look like

.select-box-kit.is-expanded .select-kit-body, .select-kit.is-expanded .select-kit-body {
display: -webkit-box;
display: -ms-flexbox;
display: flex;
-webkit-box-orient: vertical;
-webkit-box-direction: normal;
-ms-flex-direction: column;
flex-direction: column;
left: 0;
position: absolute;
top: 0;
}

Copying code from StackOverflow (which you should never do :upside_down_face: ) is affected.

Steps to reproduce:

1- Go to this answer
2- Copy the script part.
3- Paste in composer.

Result:

function makeUnselectable(node) { if (node.nodeType == 1) { node.setAttribute("unselectable", "on"); } var child = node.firstChild; while (child) { makeUnselectable(child); child = child.nextSibling; }
} makeUnselectable(document.getElementById("foo"));

Expected result (works if you use paste as plain text or ctrl + shift + v ):

function makeUnselectable(node) {
    if (node.nodeType == 1) {
        node.setAttribute("unselectable", "on");
    }
    var child = node.firstChild;
    while (child) {
        makeUnselectable(child);
        child = child.nextSibling;
    }
}

makeUnselectable(document.getElementById("foo"));

Chrome Version 63.0.3239.84 (Official Build) (64-bit) - latest
Win 7
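One way the expected result could be preserved, sketched under the assumption that the copied HTML wraps the script in `<pre><code>…</code></pre>`: convert those blocks to fenced code before any whitespace gets collapsed. The names `decodeEntities` and `preToFencedCode` are hypothetical and this is not the converter Discourse uses.

```javascript
// Built at runtime so this example does not contain a literal fence.
const FENCE = "`".repeat(3);

function decodeEntities(text) {
  return text
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&amp;/g, "&"); // last, so "&amp;lt;" becomes "&lt;", not "<"
}

function preToFencedCode(html) {
  return html.replace(
    /<pre[^>]*>\s*<code[^>]*>([\s\S]*?)<\/code>\s*<\/pre>/gi,
    (_, code) => {
      // Drop syntax-highlighting spans, then decode entities.
      // Line breaks and indentation inside the block are left untouched.
      const text = decodeEntities(code.replace(/<[^>]*>/g, ""));
      return `\n${FENCE}\n${text.replace(/\n+$/, "")}\n${FENCE}\n`;
    }
  );
}
```

Running this step first would keep the indentation of `makeUnselectable` intact, since the rest of the conversion only ever sees an already fenced block.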

We are going to put HTML and Excel table rich paste behind a feature flag for one more week at least while we refine it. Default off. It will remain enabled on meta though.

Helps us relieve some of the pressure. Then in a week or so we can decide if we want to include this in 1.9 or not.

I don’t agree with part of this – the Excel table paste should be 100% safe and should make the 1.9 release. There is just no way that Excel data could be interpreted as anything else.

I do tend to agree the HTML part is way too risky to take on at this time, though.

So let’s make the feature flag about the HTML paste, and push the Excel table paste through… since that’s what this was originally about before it got all scope-creeped up :wink:
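For reference, the reason the Excel case is so unambiguous is that the plain-text clipboard data is just rows of tab-separated cells. A minimal sketch of the conversion to a Markdown table, assuming that input shape (the function name `tsvToMarkdownTable` is made up here and this is not the shipped implementation):

```javascript
// Hypothetical converter, for illustration only.
function tsvToMarkdownTable(text) {
  const rows = text
    .split(/\r?\n/)
    .filter((line) => line.trim() !== "")
    .map((line) =>
      // Escape pipes so cell content cannot break the table syntax.
      line.split("\t").map((cell) => cell.trim().replace(/\|/g, "\\|"))
    );
  if (rows.length === 0) return "";

  // Pad short rows so every row has the same number of cells.
  const width = Math.max(...rows.map((row) => row.length));
  const pad = (row) => row.concat(Array(width - row.length).fill(""));
  const render = (row) => `| ${pad(row).join(" | ")} |`;
  const separator = `|${" --- |".repeat(width)}`;

  // The first row is treated as the header row.
  return [render(rows[0]), separator, ...rows.slice(1).map(render)].join("\n");
}
```

Treating the first row as the header is a design choice of this sketch. A real implementation would also need to decide when tab-separated text is a table at all, for example by requiring at least one tab on every line.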

I’d like to add another pasting feature request:
Automatically remove line breaks

Sometimes I have to copy and paste text from PDF docs. Maybe there is some way to detect the unwanted "- " sequences and repair the words :slight_smile: The various JS-based online tools didn’t work for me.
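A rough sketch of what such a repair could look like, assuming the pasted text has words split as "exam- ple" or across a line break, and that blank lines mark real paragraph breaks. The function name `repairPdfText` is hypothetical, and the heuristic will wrongly join deliberate constructions such as "pre- and post-".

```javascript
// Hypothetical clean-up heuristic, for illustration only.
function repairPdfText(text) {
  return text
    .replace(/\r\n/g, "\n")
    // Join a word split by a hyphen followed by a space or a line break,
    // but only when the continuation starts with a lowercase letter.
    .replace(/(\p{L})-(?:[ \t]+|[ \t]*\n[ \t]*)(\p{Ll})/gu, "$1$2")
    // Turn a single line break inside a paragraph into a space;
    // blank lines between paragraphs are kept.
    .replace(/([^\n])\n(?!\n)/g, "$1 ")
    .trim();
}
```

Hyphens with no whitespace after them, as in "well-known", are left alone.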

Speaking of feature creep, it would be nice if you could paste without magical formatting. Could shift-paste just paste like the good old days?

You can disable rich text pasting with the site setting enable_rich_text_paste. And yes, you can always use shift-paste to paste as plain text.

CTRL+SHIFT+V works for me.

On one hand, well, of course. On the other, this continues to be just awesome. I haven’t used this enough to have tried.