Markdown rendering issue with image surrounded with HTML

angus · August 11, 2021, 2:19am

This issue has cropped up again

Just thinking out loud here, but I wonder if we can sidestep the tricky problem here (i.e. the conversion of HTML to markdown). To recap (just to help think this through):

Discourse supports the importation of HTML for the creation of post content (e.g. HTML from WP Discourse).
In some contexts the user expects the integrity of the original HTML to be retained exactly.
“Integrity” here has at least two aspects:
1. How the content is rendered, e.g. linebreaks
2. Where media is hosted, e.g. downloading images to local storage to avoid broken images, or potentially for security reasons
The conversion of HTML to markdown potentially creates issues for the first type of integrity; however, it is currently necessary to ensure the second type.

So perhaps one way to address this issue for certain imported posts would be for the imported HTML to be stored directly as the cooked post content, and for the pull_hotlinked_images job to support downloading images in such content without converting img tags to markdown.

Yes, put more simply, perhaps the code could support downloading hotlinked images without requiring a conversion of the img tag to markdown. For such posts you would substitute the downloaded image URL into the cooked content instead of the raw.
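
To make the idea a bit more concrete, here is a rough sketch in Python (not Discourse's actual Ruby code, and not how pull_hotlinked_images is implemented). It only swaps the src of hotlinked img tags in the cooked HTML and leaves every other character untouched, so linebreaks and the surrounding markup survive as imported. The names localize_images, local_host and download are hypothetical; download stands in for whatever would fetch the remote file and return its new local URL. A regex is used to keep the example short, where real code would more likely use an HTML parser.

```python
import re
from typing import Callable
from urllib.parse import urlparse

# Captures: (everything up to the opening quote of src), (the quote), (the URL).
# Simplification: only handles quoted src attributes on <img> tags.
IMG_SRC = re.compile(r'(<img\b[^>]*?\ssrc\s*=\s*)(["\'])(.*?)\2', re.IGNORECASE)


def localize_images(cooked: str, local_host: str,
                    download: Callable[[str], str]) -> str:
    """Point hotlinked <img> tags at local copies without touching other HTML.

    `download` is a hypothetical callback: it takes the remote URL and
    returns the URL of the locally stored copy.
    """
    def swap(match: re.Match) -> str:
        prefix, quote, src = match.groups()
        host = urlparse(src).netloc
        # Relative URLs and images already on our own host are left as they are.
        if not host or host == local_host:
            return match.group(0)
        return f"{prefix}{quote}{download(src)}{quote}"

    return IMG_SRC.sub(swap, cooked)


if __name__ == "__main__":
    cooked = '<p>Hello<br>\n<img class="x" src="https://example.com/a.png" alt="a"></p>'
    # Fake downloader for the demo: pretends the file was stored under /uploads/default/.
    fake_download = lambda url: "/uploads/default/" + url.rsplit("/", 1)[-1]
    print(localize_images(cooked, "forum.example.org", fake_download))
```

The point of the design is that the raw is never rewritten, so nothing has to round-trip through an HTML-to-markdown conversion.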
