Get back the real "raw" data that created a post?

If you’re using Chrome on desktop, you can press Ctrl+Shift+V to paste as plain text instead of going through Notepad.

Also, in the admin site settings page, you can disable this fancy pasting behavior by unchecking the “enable rich text paste” option.

Thanks, but my goal is to keep the original formatting from Microsoft Word (in the raw data), not strip it down to plain text.

Currently, Discourse removes some of the formatting, such as font size and font type, while leaving other formatting in place (bold and italics seem to survive). As a comparison, the Gmail composer seems to keep all formatting.

To try to keep more formatting, I suppose an alternative would be to upload a Word document (not paste text into the composer, but upload the actual doc). The problem there, I think, is that Discourse does not show the contents of the upload inline (it just displays an upload link).

Then paste the raw HTML.

There are different goals involved. Gmail’s goal is to preserve all the formatting, whereas a forum’s goal is to keep a meaningful subset of it while preventing abuse (huge text, blinking text, OVERLY LARGE FONTS, annoying colours, etc.)

As a simple example, here is the HTML that an office suite (LibreOffice, in this case) generates for a short piece of text, as it appears in the clipboard under text/html:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
	<meta http-equiv="content-type" content="text/html; charset=utf-8"/>
	<title></title>
	<meta name="generator" content="LibreOffice 7.1.2.2 (Linux)"/>
	<style type="text/css">
		@page { size: 21.59cm 27.94cm; margin: 2cm }
		p { margin-bottom: 0.25cm; line-height: 115%; background: transparent }
	</style>
</head>
<body lang="en-CA" link="#000080" vlink="#800000" dir="ltr"><p style="margin-bottom: 0cm; line-height: 100%">
<b>Hello there</b><span style="font-weight: normal"> sunshine </span><i><span style="font-weight: normal">eh</span></i></p>
</body>
</html>

It looks like:

@page { size: 21.59cm 27.94cm; margin: 2cm } p { margin-bottom: 0.25cm; line-height: 115%; background: transparent }

Hello there sunshine eh

But when interpreted by to-markdown.js you get:

**Hello there** sunshine *eh*

Hello there sunshine eh
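To make the "meaningful subset" idea concrete, here is a minimal Python sketch of that kind of conversion. It is not how to-markdown.js actually works; the class and function names are made up for illustration, and it only assumes that bold and italic tags are kept while styles, scripts and the title are dropped.

```python
from html.parser import HTMLParser


class FormattingSubsetConverter(HTMLParser):
    """Keep bold/italic as Markdown markers and drop all other markup."""

    # Tags that survive, mapped to their Markdown marker.
    MARKERS = {"b": "**", "strong": "**", "i": "*", "em": "*"}
    # Tags whose text content is discarded entirely.
    SKIPPED = {"style", "script", "title"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        # Attributes such as style="font-size: ..." are ignored on purpose.
        if tag in self.SKIPPED:
            self.skip_depth += 1
        elif tag in self.MARKERS:
            self.parts.append(self.MARKERS[tag])

    def handle_endtag(self, tag):
        if tag in self.SKIPPED:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in self.MARKERS:
            self.parts.append(self.MARKERS[tag])

    def handle_data(self, data):
        # Text inside <style>, <script> or <title> never reaches the output.
        if not self.skip_depth:
            self.parts.append(data)


def html_to_markdown_subset(html):
    """Convert HTML to Markdown, keeping only bold and italic."""
    parser = FormattingSubsetConverter()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace left behind by the dropped markup.
    return " ".join("".join(parser.parts).split())


sample = (
    '<style type="text/css">p { line-height: 115% }</style>'
    '<p style="margin-bottom: 0cm"><b>Hello there</b>'
    '<span style="font-weight: normal"> sunshine </span>'
    '<i><span style="font-weight: normal">eh</span></i></p>'
)
print(html_to_markdown_subset(sample))  # **Hello there** sunshine *eh*
```

Everything that is not in the allow-list simply disappears, which is why font size and font type never make it into the raw post.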

You cannot, unless you put it there yourself as I did in this post. If you really need it there, hide it in a comment. If you want to later convert it yourself to markdown, use something like pandoc.
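If you go the comment route, the one thing to watch is that the pasted HTML must not close the comment early. A rough Python sketch, assuming it is acceptable to alter the hidden copy slightly (the function name is hypothetical, and the result is lossy wherever the original contained double hyphens, e.g. Word's own conditional comments):

```python
import re


def hide_in_comment(raw_html):
    """Wrap raw HTML in an HTML comment so it stays in the raw post only."""
    # Put a space after every hyphen that is followed by another hyphen,
    # so sequences like "-->" or "<!--" cannot appear inside the comment.
    safe = re.sub(r"-(?=-)", "- ", raw_html)
    return "<!--\n" + safe + "\n-->"
```

The wrapped text can then be pasted below the visible Markdown in the composer.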

I support converting to Markdown for the post that is displayed to others, for the sake of consistency. It’s the raw entry I’m curious about.

How do you get the raw HTML of a Word doc that you could paste in?

Save it as HTML from Word, or copy it to the clipboard and pull the text/html explicitly from the clipboard.
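On Linux, for instance, pulling the text/html flavour out of the clipboard could look roughly like the Python sketch below. It assumes an X11 session with the xclip utility installed and that the application put UTF-8 on the clipboard; the function names are made up for illustration, and other platforms need a different tool.

```python
import subprocess


def clipboard_html_command():
    """Build the xclip call that prints the clipboard's text/html target."""
    return ["xclip", "-selection", "clipboard", "-t", "text/html", "-o"]


def read_clipboard_html():
    """Return the clipboard's HTML as a string.

    Raises subprocess.CalledProcessError if xclip fails, for example when
    the clipboard holds no text/html target.
    """
    result = subprocess.run(
        clipboard_html_command(), capture_output=True, check=True
    )
    # Assumption: UTF-8 content; undecodable bytes are replaced, not fatal.
    return result.stdout.decode("utf-8", errors="replace")
```

Copy the text in the word processor first, then call `read_clipboard_html()` and paste or save the result.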

If all you care about is being able to reference the raw input at a later point, this component might work for you.

It adds a button to the post menu that shows the raw content on a per-post basis.
