"Onebox Assistant", crawl for those previews reliably!

No problem. Let’s walk through this again.

In a vanilla Discourse, to create a onebox preview, the Discourse server must be able to fetch the target page and read the Open Graph (“og”) tags in its meta information. This is essentially a crawl.
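To make the “og” part concrete, here is a minimal sketch in Python of what reading those tags from a page looks like. It is not how Discourse or the plugin actually does it (Discourse is written in Ruby), and the names `OpenGraphParser`, `extract_open_graph` and the sample HTML are made up for illustration. It only shows the idea: if the server cannot fetch the HTML, there is nothing to read these tags from.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collects <meta property="og:..." content="..."> pairs from a page.

    Hypothetical helper for illustration; not part of Discourse or the plugin.
    """

    def __init__(self):
        super().__init__()
        self.og = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        prop = attr_map.get("property") or ""
        content = attr_map.get("content")
        if prop.startswith("og:") and content is not None:
            # Keep the first value seen for each property.
            self.og.setdefault(prop, content)


def extract_open_graph(html: str) -> dict:
    """Return a dict of Open Graph properties found in the given HTML."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.og


# Made-up sample page standing in for whatever the crawl would have returned.
sample_html = """
<html><head>
<meta property="og:title" content="Example article">
<meta property="og:image" content="https://example.com/thumb.png">
<meta name="description" content="Not an Open Graph tag">
</head><body></body></html>
"""

og = extract_open_graph(sample_html)
print(og)
```

If the fetch is blocked, or the page comes back without `og:title` or `og:image`, a preview built this way has nothing to show, which is the failure being discussed here.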

To achieve this, the target web server and its gatekeepers must not block this crawl.

It appears that the page’s gatekeepers do not permit Meta to see that page.

This is why you would consider using this plugin in the first place.

This plugin uses an API to return the target page instead of crawling directly.

This takes advantage of the work the API provider does to significantly increase the chances of being allowed to see the content (e.g. launching the crawl from a farm of servers with high-reputation IP addresses, or possibly even making the request look as if it came from a desktop browser).

Unfortunately, in this case it appears that even embed.rocks is not allowed to crawl that page, so this plugin is not helpful here.

However, if you bring this issue to the embed.rocks support team’s attention, they may be able to work out a way to resolve this blockage.

In general, this plugin should be useful, as it should give a better onebox preview experience than vanilla Discourse, though of course you have to pay the API provider.

Note that other things might be going wrong here; for example, the target page might not have a good thumbnail.

However, you can use Facebook’s debug tool to explore the data, and I believe it looks good, which points to the issue being with embed.rocks at the moment.

On a side note, the rise of Generative AI may be making content platforms much more careful about who can see their content for fear of their content ending up in someone’s model for free.

I hope that’s clear.
