Pasting html strips links


(Ted Strauss) #1

I just copied a block of html from an article and pasted it into a fresh topic. About 6 links got stripped. I can see why stripping all html from pasted text is a good idea, I really do. But sometimes you need those links. I ended up using the page source to keep the links. What about adding an item on the toolbar to ‘paste html’, keeping all original html in the source?


(Michael Brown) #2

We support Markdown, not HTML.

Can you show us the page and the text you were trying to copy? If it’s an inline link such as:

then it’s just not going to work.

If you do this a lot, I’d suggest installing a browser extension such as Copy as Markdown which helps out with this task. For example:

Ars Technica has a three pages article on the trajectory of TV–starting

My clipboard watcher says that if I copy the text from the original site, what gets put into my clipboard is:

####text/plain:
Ars Technica has a three pages article on the trajectory of TV--starting

####text/html:

<meta http-equiv="content-type" content="text/html; charset=utf-8"><i style="outline: none; vertical-align: baseline; font-family: Arial, sans-serif; font-style: normal; font-size: 13px; padding: 0px 0px 0px 1em; margin: 0.5em; border-left-width: 3px; border-left-style: solid; border-left-color: rgb(221, 221, 221); display: block; color: rgb(54, 54, 54); font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: 19px; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255);">Ars Technica has a three pages article on<span class="Apple-converted-space"> </span><a href="http://arstechnica.com/gadgets/2013/06/the-future-of-tv-a-star-is-born/" style="outline: none; vertical-align: baseline; font-family: inherit; font-style: inherit; font-size: 13px; padding: 0px; margin: 0px; color: rgb(0, 47, 47); text-decoration: underline; cursor: pointer;">the trajectory of TV</a>--starting</i>

Ew. But parsable.
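To illustrate "parsable": a minimal sketch, assuming the `text/html` payload has already been read from the clipboard into a string. It uses Python's standard `html.parser` module; the names `LinkCollector` and `collect_links` are made up for this example and are not part of Discourse.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a href=...> in an HTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.links = []     # finished (href, text) pairs
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text chunks seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only text inside an open link is of interest here.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text)))
            self._href = None


def collect_links(html):
    """Return a list of (href, link text) tuples found in the fragment."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links
```

Fed the blob above, this should yield a single pair: the Ars Technica URL and the text "the trajectory of TV". All of the inline `style` noise is simply ignored.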


(Ted Strauss) #3

I’m definitely going to install the browser extension. Nice.

But since I’m launching a Discourse site for other people, the possibility of ingesting HTML and stripping out everything except links seems like a killer feature.

I realize this is not trivial to implement. Does anyone else think this is a good idea? Maybe there’s a strip-everything-but-links.js out there that already does this.


(Lee_Ars) #4

Hey, @supermathie, I wrote that “Trajectory of TV” article, BTW. Thanks for the linkage :wink:


(Michael Brown) #5

I do like the idea! I mean, we can paste images from the clipboard and that’s awesome - no reason we shouldn’t be able to at least strip HTML in a rudimentary way.


(Sam Saffron) #6

Well, it’s tricky: sometimes you want to keep the HTML.


(Michael Brown) #7

Ah, in this case the user is not pasting HTML. He’s pasting a chunk of text he copied off the webpage, available in the clipboard as both text/plain and text/html (DAMN! text/markdown isn’t available. That’d be SWEET!)

OK, so my understanding of the feature request is basically turning that big blob of HTML (from the clipboard) up there into Markdown. So there are a couple of ways of going about it:

###Stage 1
Strip everything out except links, converting <a href="X">Y</a> into [Y](X).
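
A rough sketch of what Stage 1 could look like, written in Python with the standard `html.parser` module purely for illustration (the real thing would presumably live in the browser). `LinksToMarkdown` and `html_to_markdown_links` are hypothetical names. The sketch assumes links are not nested and does not escape `]` or `)` inside labels or URLs.

```python
from html.parser import HTMLParser


class LinksToMarkdown(HTMLParser):
    """Drop all tags, keep the text, and rewrite <a href="X">Y</a> as [Y](X)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []       # pieces of the Markdown output
        self._href = None   # href of the link currently open, if any
        self._label = []    # text collected inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:  # anchors without an href are treated as plain text
                self._href = href
                self._label = []

    def handle_data(self, data):
        # Text goes into the link label while a link is open, else to the output.
        target = self._label if self._href is not None else self.out
        target.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            label = "".join(self._label)
            self.out.append(f"[{label}]({self._href})")
            self._href = None


def html_to_markdown_links(html):
    """Convert an HTML fragment to text in which only links survive as Markdown."""
    parser = LinksToMarkdown()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

For instance, `html_to_markdown_links('<a href="X">Y</a>')` returns `[Y](X)`, and every other tag and attribute (the `<meta>`, the `<span>`, the inline styles) falls away while the surrounding text is kept.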

###Stage 2
The above, plus also bring in basic markups such as <i> and <b>.
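
A sketch of Stage 2 under the same assumptions as before (Python's standard `html.parser`, hypothetical names `InlineToMarkdown` and `html_to_markdown_inline`). It maps `<i>`/`<em>` to `*` and `<b>`/`<strong>` to `**` and ignores inline styles entirely, which matters: the clipboard blob above wraps the whole sentence in an `<i>` styled with `font-style: normal`, and this naive version would render it as italic.

```python
from html.parser import HTMLParser

# Markdown delimiters for the inline tags carried over in this sketch.
EMPHASIS = {"i": "*", "em": "*", "b": "**", "strong": "**"}


class InlineToMarkdown(HTMLParser):
    """Keep links plus basic emphasis; drop every other tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []      # pieces of the Markdown output
        self._hrefs = []   # stack of hrefs (or None) for currently open <a> tags

    def handle_starttag(self, tag, attrs):
        if tag in EMPHASIS:
            self.out.append(EMPHASIS[tag])
        elif tag == "a":
            href = dict(attrs).get("href")
            self._hrefs.append(href)
            if href:
                self.out.append("[")

    def handle_data(self, data):
        self.out.append(data)

    def handle_endtag(self, tag):
        if tag in EMPHASIS:
            self.out.append(EMPHASIS[tag])
        elif tag == "a" and self._hrefs:
            href = self._hrefs.pop()
            if href:  # only anchors that opened with "[" get closed
                self.out.append(f"]({href})")


def html_to_markdown_inline(html):
    """Convert an HTML fragment to Markdown, keeping links, italics and bold."""
    parser = InlineToMarkdown()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

With this, `html_to_markdown_inline('<b>bold</b> and <i>italic</i>')` returns `**bold** and *italic*`. Whitespace just inside a tag (for example `<i> text</i>`) would produce delimiters that Markdown does not treat as emphasis, so a real implementation would need to trim or move that whitespace.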

###Stage 3
???


(Ted Strauss) #8

Stage 3 could be <h1> <h2> and such, but not really needed.
Having just stage 1 would be amazing.


(Kane York) #9

This is implemented. Check out the source to my post: https://meta.discourse.org/raw/7430/9

Ars Technica has a three pages article on the trajectory of TV--starting

(Jeff Atwood) #10