"Onebox Assistant", crawl for those previews reliably!

merefield · January 24, 2019, 12:16pm

What it does

Turns this kind of result:

(where your server has failed to bring back the page source, so it cannot extract the tags required to build the onebox)

Into this!:

It simply provides an alternative path for onebox to get its page source with which to look for meta-data when the target server refuses your connection.

It changes nothing about how onebox then processes the page source to find the meta-data and render the box.

It’s meant to allow you to enter the details and credentials of a third-party API that brings back the page, instead of making a normal HTTP call directly to the target page.

Why

I found my servers were being forbidden access to a number of commercial sites, so oneboxes would fail to render. It essentially leverages the trustworthiness of the 3rd party API, a bit like a mail service.

Why it’s cost effective

You can use a relatively cheap VPS but still get reliable one-boxing functionality, even if your IP or user agent is somehow ‘blacklisted’.

You don’t need it if

You are oneboxing all your target content ok with the vanilla install and all users are happy

Pre-requisites

You need an account with a suitable 3rd party API.

Settings

onebox assistant api base address:  https://api.embed.rocks/api/

The example above uses embed.rocks. Support for other APIs might be added in the future, but embed.rocks is relatively good value at the moment.

onebox assistant api base query:   ?url=

onebox assistant api options:   &skip=article,description,oembed,imextra&include=source

onebox assistant api page source field:   source

You will also need to enter your API key provided by embed.rocks
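As a rough illustration of how these settings fit together, here is a minimal Python sketch. It assumes the request URL is simply base address + base query + URL-encoded target + options (which matches the curl check under Support Information), and that the API replies with a JSON object holding the page source under the field named in the last setting. The function names are hypothetical and this is not the plugin's actual code.

```python
import json
from urllib.parse import quote

# Values copied from the settings listed above.
API_BASE_ADDRESS = "https://api.embed.rocks/api/"
API_BASE_QUERY = "?url="
API_OPTIONS = "&skip=article,description,oembed,imextra&include=source"
API_PAGE_SOURCE_FIELD = "source"


def build_api_url(target_url: str) -> str:
    """Hypothetical helper: compose the API request URL from the settings.

    Assumption: the parts are concatenated in this order, with the target
    address fully percent-encoded.
    """
    encoded = quote(target_url, safe="")
    return f"{API_BASE_ADDRESS}{API_BASE_QUERY}{encoded}{API_OPTIONS}"


def extract_page_source(response_body: str, field: str = API_PAGE_SOURCE_FIELD):
    """Hypothetical helper: pull the page source out of the API response.

    Assumption: the response body is a JSON object with the page source
    stored under a top-level key. Returns None when the key is absent.
    """
    payload = json.loads(response_body)
    return payload.get(field)
```

The API key is not part of the URL; in the curl check it travels in the x-api-key request header.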

See example below

This setting allows you to ignore the prefetch (to check if the direct crawl returns a result) and use the API from the get-go.

default OFF

I recommend setting this to TRUE.

This is more expensive of course but often yields better results as there are some cases where the pre-fetch gets redirected to the wrong page because you are not trusted.
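The trade-off can be pictured with a small Python sketch. The flag name always_use_api and both fetcher callables are hypothetical stand-ins, not the plugin's real identifiers; the sketch only assumes that the direct crawl is tried first unless the setting says otherwise.

```python
from typing import Callable, Optional

# A fetcher takes a URL and returns the page source, or None on failure.
Fetcher = Callable[[str], Optional[str]]


def get_page_source(
    url: str,
    direct_fetch: Fetcher,
    api_fetch: Fetcher,
    always_use_api: bool = False,  # hypothetical name for the setting above
) -> Optional[str]:
    """Return page source, preferring the direct crawl unless told otherwise."""
    if not always_use_api:
        # Prefetch: try the normal direct crawl first.
        source = direct_fetch(url)
        if source:
            return source
    # Either the prefetch was skipped or it came back empty: use the API.
    return api_fetch(url)
```

With the flag on, every onebox costs one API call, but a direct crawl that "succeeds" with the wrong (redirected) page can no longer slip through.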

Support Information

Remember, if you’ve previously attempted to onebox a link, Discourse core will cache the result.

You can add a random querystring on the end to overcome the cache: https://mylink.com/todaynews?random=random
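If you are testing a lot of links, a throwaway helper like the following Python sketch can append the random parameter for you. The function name and the default parameter name "random" are just illustrative choices; any unused query parameter should do.

```python
import secrets
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_cache_buster(url: str, param: str = "random") -> str:
    """Append a random query parameter so the URL differs from any cached one."""
    parts = urlsplit(url)
    # Keep any existing query parameters and add ours at the end.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, secrets.token_hex(4)))  # 8 random hex characters
    return urlunsplit(parts._replace(query=urlencode(query)))


print(add_cache_buster("https://mylink.com/todaynews"))
```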

You can also check the API is responding with, e.g.:

curl -X GET "https://api.embed.rocks/api/?url=https%3A%2F%2Fnews.bbc.co.uk&skip=article,description,oembed,imextra&include=source" -H "x-api-key: %%%your-api-key%%%"

You need to URL-encode the address you are calling (the url parameter value), for example with an online URL encoder (not vouched for!)
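If you would rather not paste addresses into a website, the encoding can be done locally. A minimal Python sketch (the helper name is made up for illustration): note that a trailing newline copied along with the address gets encoded as %0A, so the sketch strips whitespace first.

```python
from urllib.parse import quote


def encode_target_url(raw: str) -> str:
    """Percent-encode a target address for use as the url parameter value."""
    # strip() drops stray spaces and newlines that would otherwise be
    # encoded (a newline becomes %0A) and sent to the API as part of the URL.
    return quote(raw.strip(), safe="")


print(encode_target_url("https://news.bbc.co.uk\n"))
```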

Known Limitations

It’s only been tested with one provider so far: https://embed.rocks (with whom I have no affiliation). I’m happy to consider supporting more services if the work is sponsored.
The monkey patching is done at method level. This overrides more code than it needs to, which increases the risk of the plugin breaking after a core update. However, I don’t think there’s a way to minimise this further.

How to install plugins

See the guide here: Install plugins on a self-hosted site

This repo is: https://github.com/merefield/discourse-onebox-assistant

All feedback welcome. Please star it on GitHub if you find it useful.
