I’m trying to migrate a Google Group to a Discourse forum and am having some issues with HTML being posted in emails. The mailing list is for a web framework, so folks posting HTML back and forth is pretty common. Since the emails we’re trying to import came from a system that didn’t support Markdown, these HTML segments don’t have code fences around them, and there are too many to edit manually.
As a compromise, I’d like to disable HTML entities in posts completely. While searching the forum for this I saw @codinghorror mention that @eviltrout added a mode where this was possible, but I couldn’t find the magic setting in my admin panel.
Does anyone know how to flip Discourse into this mode?
Without thinking about it too hard, I think that the thing to do might be to encode the HTML tags before the import. It sounds like it might be safe enough to just convert all of the <s and >s into &lt;s and &gt;s.
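Something like this rough Python sketch is what I have in mind for that pre-import pass. The function name `escape_angle_brackets` is made up for illustration and is not part of any Discourse import script. One assumption I'm adding: leading `>` characters are left alone so that email-style quoting still turns into blockquotes.

```python
import re

# Leading run of ">" quote markers at the start of a line (assumption: these
# are email/Markdown quote prefixes and should survive unescaped).
QUOTE_PREFIX = re.compile(r"^(?:[ ]{0,3}>[ ]?)+")


def escape_angle_brackets(raw: str) -> str:
    """Encode < and > so the Markdown engine shows them as literal text."""
    escaped_lines = []
    for line in raw.split("\n"):
        match = QUOTE_PREFIX.match(line)
        prefix = match.group(0) if match else ""
        body = line[len(prefix):]
        # Only the two angle brackets are touched; "&" is deliberately left as-is.
        body = body.replace("<", "&lt;").replace(">", "&gt;")
        escaped_lines.append(prefix + body)
    return "\n".join(escaped_lines)


if __name__ == "__main__":
    print(escape_angle_brackets("> use <div> here\nplain <b>x</b>"))
```

You would run each raw email body through this before handing it to the importer. It does not try to detect code that is already fenced, so it is only suitable for a source that never used Markdown.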
That’s true. Although I’ll admit I’m also concerned about folks interacting with our community via email. Including HTML without a fence should, ideally, only render uncolored HTML. Anything else will likely cause frustration.
Oh. That’s a bigger problem. I have a BS in computer science and a PhD in education. I have learned that programming people is lots harder than programming computers.
There’s no mode to disable HTML entities that I recall. There is a type of post you can mark as HTML (see the mbox importer for an example) and no cooking or processing will be done.
Got it. I don’t think that would be desirable either.
What I really need is something that, upon seeing raw HTML in a post, will escape the HTML before the Markdown processor converts Markdown to HTML. In poking around the code I noticed that there’s a tags whitelist of sorts, so I could see just disabling that as a possibility. Or, potentially, preprocessing the text before the markdown processor runs.
Is either of these doable in the form of a plugin?
Yeah, the new markdown engine has a rule for dealing with HTML tags, and it could be replaced in a plugin. But what you would have at the end of the process is not CommonMark; it is some sort of mister hydra.
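For comparison, the other option raised above (preprocessing the raw text before the markdown processor runs, rather than replacing the engine's HTML rule) could look roughly like the following. This is a standalone Python sketch of the idea only, not Discourse plugin code, and `escape_outside_fences` is a hypothetical name. It assumes that anything inside a fenced code block should pass through untouched, and it does not special-case inline code spans or `>` quote markers.

```python
import re

# Opening or closing fence: up to 3 spaces of indent, then 3+ backticks or tildes.
FENCE = re.compile(r"^[ ]{0,3}(`{3,}|~{3,})")


def escape_outside_fences(raw: str) -> str:
    """Escape < and > everywhere except inside fenced code blocks."""
    out = []
    open_fence = None  # marker that opened the current code block, if any
    for line in raw.split("\n"):
        match = FENCE.match(line)
        if open_fence is None:
            if match:
                # Entering a fenced block; keep the fence line verbatim.
                open_fence = match.group(1)
                out.append(line)
            else:
                out.append(line.replace("<", "&lt;").replace(">", "&gt;"))
        else:
            # Inside a fenced block: a closing fence uses the same character,
            # is at least as long as the opener, and has no info string.
            if (
                match
                and match.group(1)[0] == open_fence[0]
                and len(match.group(1)) >= len(open_fence)
                and line.strip() == match.group(1)
            ):
                open_fence = None
            out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    sample = "Try <b>this</b>:\n```html\n<p>kept</p>\n```\nand <i>that</i>"
    print(escape_outside_fences(sample))
```

The upside of doing it this way is that the markdown engine itself stays stock, so whatever comes out the other end is still ordinary CommonMark rendering of already-escaped text.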