Fix broken images for posts created by the WP Discourse and RSS plugins

Images published to Discourse through the WP Discourse and RSS Polling plugins can sometimes be broken. This can happen when the full post content is published to Discourse with the WP Discourse plugin and the WordPress Classic Editor is used for publishing the post. It can also happen with posts pulled to Discourse with the RSS Polling plugin when the Embedding setting “Truncate the embedded posts” is not enabled.

The problem happens when Discourse attempts to download images that have been added to the post. If downloading the remote image results in a markdown image tag wrapped in HTML tags, the image will be broken.

If posts are being published from WordPress, the issue should be solved by switching from using the Classic Editor to using the Block Editor for publishing the posts. If this is not possible, or if it’s not resolving the issue, a workaround for the problem is to prevent Discourse from downloading the remote images.

If you know the domains that the remote images are being published from, you can prevent Discourse from downloading these images by adding the domain(s) to the disabled image download domains site setting.

If you are unsure of all the domains that are being used, you can prevent Discourse from downloading all remote images by disabling the download remote images to local site setting. Note that disabling this setting could result in broken images on your site. If possible, it is better to only prevent downloading of remote images from specific domains that you control.
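If you are not sure which domains your images come from, one way to find out is to scan the HTML of a few posts before deciding what to add to the setting. The following is only a rough sketch using the Python standard library. The `ImageDomainCollector` class and `image_domains` function are hypothetical names made up for this example and are not part of Discourse or WP Discourse.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ImageDomainCollector(HTMLParser):
    """Collects the host name of every <img src="..."> in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.domains = set()

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />
        if tag != "img":
            return
        src = dict(attrs).get("src") or ""
        host = urlparse(src).hostname  # None for relative URLs
        if host:
            self.domains.add(host)


def image_domains(html):
    """Return the sorted list of remote image host names found in html."""
    collector = ImageDomainCollector()
    collector.feed(html)
    collector.close()
    return sorted(collector.domains)
```

Calling `image_domains(post_html)` on the HTML of a post returns the host names of its remote images. Images with relative URLs are skipped because they have no host name.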

3 Likes

May I ask for more details concerning this? All my sites use the Classic Editor, but very few use a plugin to render markdown as input (the plugin space dried up in markdown parsers, so folks reach for Jetpack most times).

Is this the case when a markdown parser is used on top of the Classic Editor? :thinking:

The issue happens when HTML in the following form gets posted to Discourse. It’s most likely to occur when a topic is posted to Discourse via the API:

<p><img src="remote-image-domain/..."/></p>

Any outer tags around the image tag will cause the issue, for example <figure><img src="remote-image-domain/..."/></figure>

When Discourse attempts to download the remote image, the following markdown would be generated for the first example:

<p>![](upload://6zqK52dO23i1JsYH2oyMU12U2ro.jpeg)</p>

This will create a broken image. It can be fixed manually by editing the Discourse post to:

<p>

![](upload://6zqK52dO23i1JsYH2oyMU12U2ro.jpeg)
</p>

but just preventing Discourse from downloading the remote image with the disabled image download domains site setting is an easier way to fix it.
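For anyone who needs to repair many existing posts, the manual edit above can be expressed as a text substitution on the raw post. This is a minimal sketch only. `unwrap_markdown_images` is a hypothetical helper written for this example, not something Discourse or WP Discourse provides, and it only handles the simple case where a markdown image sits directly between an opening and a closing tag.

```python
import re

# An opening tag, a markdown image, and a closing tag with nothing in between,
# e.g. <p>![](upload://abc.jpeg)</p>
WRAPPED_IMAGE = re.compile(
    r"(?P<open><[a-zA-Z][^<>]*>)"
    r"(?P<image>!\[[^\]]*\]\([^)\s]+\))"
    r"(?P<close></[a-zA-Z][^<>]*>)"
)


def unwrap_markdown_images(raw):
    """Put a blank line between the opening tag and the markdown image,
    and move the closing tag to its own line, as in the manual fix."""
    return WRAPPED_IMAGE.sub(r"\g<open>\n\n\g<image>\n\g<close>", raw)
```

Running the function a second time leaves the text unchanged, because the pattern no longer matches once the line breaks are in place.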

For posts published from WP Discourse with the Block Editor, the plugin attempts to fix the issue by processing the post with the following code before publishing it to Discourse:

It might be possible to implement a similar fix for the Classic Editor, but with the Classic Editor the WordPress parse_blocks function isn’t available, so the fix would be more complex. My hope is that the issue can eventually be taken care of with changes to the core Discourse code.

1 Like

Thanks so much Simon! I understand the issue, great explanation. :slight_smile:

1 Like

Hi Simon,
Thank you for making WP Discourse. :slight_smile:

I also had this problem with images. I use this to download images locally, and that broke the images as you explain above. After that, I converted the WordPress HTML to Markdown and pasted the converted text into Discourse. It works fine, but it’s a manual process.
Is it possible to integrate a converter so that this happens automatically when exporting from WordPress?

Thank you!

1 Like

If you are using the WordPress block editor for publishing posts, the conversion should happen automatically. If you are using the Classic Editor, you’ll need to manually fix the HTML on Discourse to prevent broken images.

Let me know if you are using the Block Editor, but are still having issues with broken images.

It could be possible to add similar functionality to posts published with the Classic Editor, but the code required to do it would be more complex than what’s being done with the Block Editor.

1 Like

I am using the Block Editor (Gutenberg), but there are some 3rd-party plugins installed for it. Maybe that causes the issue with broken images. I also use some 3rd-party gallery plugins on WordPress.

The gallery plugin could be the cause of the issue. Before setting the post content that gets published to Discourse, the WP Discourse plugin looks for any blocks in the post that have their blockName set to core/image or core/gallery. HTML for images in those blocks is rewritten into a form that can be parsed by Discourse.

It seems possible that image plugins used on your site may be using block names that are not being handled. What is the name of the gallery plugin you’re using?

I see… I am using this, but I just noticed that it is already unsupported. So I think I will convert the images back to the default gallery and try to update the Discourse topics. That should be the problem, sorry about that.

1 Like

I switched to the Block editor (it has to be done at some point since the classic editor support will end next year), but it didn’t fix the issue. The images were hosted on Facebook.

Are you able to check the image markup on the WordPress post by selecting the ‘Code editor’ from the sidebar? What I’m wondering is what kind of block (if any) the images are in.

The WordPress plugin is using block names to parse the images. If the image isn’t in a block that the plugin is currently handling, its markup won’t be cleaned up.
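To check this without reading the markup by eye, the content shown in the Code editor can be scanned for images that sit outside the handled blocks. The sketch below is only an illustration and rests on a few assumptions: it relies on the `<!-- wp:name -->` comment delimiters that the Block Editor writes into post content, it only looks at the innermost block around each image (whether the plugin also looks inside nested blocks is not something this sketch knows), and `unhandled_images` is a hypothetical helper, not plugin code.

```python
import re

# Block names mentioned in this topic as being handled by the plugin
HANDLED_BLOCKS = {"core/image", "core/gallery"}

# Matches block comment delimiters and <img> tags in serialized post content
TOKEN = re.compile(
    r"<!--\s+(?P<closer>/)?wp:(?P<name>[a-z][a-z0-9_-]*(?:/[a-z][a-z0-9_-]*)?)"
    r"(?:\s+\{.*?\})?\s+(?P<void>/)?-->"
    r"|(?P<img><img\b[^>]*>)",
    re.S,
)


def full_name(name):
    # Core blocks are serialized without a namespace: wp:image means core/image
    return name if "/" in name else "core/" + name


def unhandled_images(content):
    """Return (enclosing block name or None, img tag) for each image whose
    innermost enclosing block is not in HANDLED_BLOCKS."""
    stack, found = [], []
    for match in TOKEN.finditer(content):
        if match.group("img"):
            enclosing = stack[-1] if stack else None
            if enclosing not in HANDLED_BLOCKS:
                found.append((enclosing, match.group("img")))
        elif match.group("closer"):
            if stack:
                stack.pop()
        elif not match.group("void"):
            stack.append(full_name(match.group("name")))
    return found
```

An empty result suggests that every image is inside an image or gallery block. A block name of `None` means the image is not inside any block at all, which is what you would expect for HTML that was pasted in from another site.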

1 Like

The WP post was a copy-paste from Facebook, here’s a sample of the HTML code.
The images were image emojis:

<div dir="auto"><span class="pq6dq46d tbxw36s4 knj5qynh kvgmc6g5 ditlmg2l oygrvhab nvdbi5me sf5mxxl7 gl3lb2sf hhz5lgdu"><img src="https://static.xx.fbcdn.net/images/emoji.php/v9/t34/1/16/1f914.png" alt="🤔" width="16" height="16"></span>Comment ? Vous avez 1 mois pour nous envoyer vos plus beaux poèmes et/ou dessins sur le thème du monocycle, ce qu’il vous évoque, votre passion pour ce sport, etc.</div>

I don’t have the same sidebar as you in the block editor, so I displayed the block HTML content with this option:

If the issue happens because it’s not “regular” WP content but an HTML copy-paste, that’s not a problem. I’ll tell my users to avoid copy-pasting images, even emojis. :slight_smile:

1 Like

Yes, I think the issue here is that the HTML was copied into the WordPress post. The WP Discourse plugin should be able to handle images that are added through an image block. It’s not set up to fix the HTML for images that are added in any other way.

Ideally, Discourse would be able to handle HTML image tags that are wrapped in other HTML tags, but it’s a tricky problem. Possibly the WP Discourse plugin can be updated to handle images that are added outside of image blocks. My hope was that dealing with image blocks would cover most cases, but there seem to be a lot of exceptions to that.

3 Likes