Blacklist vs whitelist onebox


(Eric Vantillard) #1

Continuing the discussion from Auto-discoverable oneboxer-able links based on whitelist configurable by admin:

I would like to enable onebox for every site that supports oEmbed or Open Graph, and be able to blacklist some sites in case of improper usage.
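To make the request concrete, here is a minimal sketch of what a blacklist policy could look like: every http(s) URL is eligible for a onebox unless its host, or a parent domain of its host, is on an admin-maintained list. The names `BLACKLISTED_HOSTS` and `is_onebox_allowed` are hypothetical and are not part of Discourse or the onebox gem.

```python
from urllib.parse import urlsplit

# Hypothetical admin-configured blacklist; not a real Discourse setting.
BLACKLISTED_HOSTS = frozenset({"spam.example", "tracker.example"})


def is_onebox_allowed(url, blacklist=BLACKLISTED_HOSTS):
    """Allow any http(s) URL unless its host or a parent domain is blacklisted."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    labels = parts.hostname.rstrip(".").split(".")
    # Check the host itself and every parent domain:
    # cdn.spam.example -> spam.example -> example
    candidates = {".".join(labels[i:]) for i in range(len(labels))}
    return candidates.isdisjoint(blacklist)
```

Matching on parent domains means a single blacklist entry also covers subdomains, which keeps the list short when one site misbehaves.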


(Jeff Atwood) #2

This currently isn’t possible. You’d need to make changes to the onebox gem.


(Kane York) #3

Any implementation of this will likely result in your Discourse instance making spurious calls to non-existent websites while someone is typing, which is a bit wasteful.

The client already makes a lot of calls to /onebox; maybe they should be debounced.
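For readers unfamiliar with the term, debouncing means waiting until input has been quiet for some interval before acting on it. The sketch below models trailing-edge debouncing as a pure function over timestamped events rather than with real timers, so it is only an illustration of the idea; the function name, the event format, and the 0.5 second wait are assumptions, not how the Discourse client is implemented.

```python
def debounce(events, wait):
    """Return the events that survive trailing-edge debouncing.

    events: list of (timestamp_seconds, url) tuples sorted by timestamp.
    An event fires (at timestamp + wait) only if no later event arrives
    within `wait` seconds of it.
    """
    fired = []
    for i, (t, url) in enumerate(events):
        is_last = i == len(events) - 1
        if is_last or events[i + 1][0] - t >= wait:
            fired.append((t + wait, url))
    return fired


# Three keystrokes in quick succession, then a separate paste later.
typed = [
    (0.0, "https://exa"),
    (0.1, "https://example.c"),
    (0.25, "https://example.com/post"),
    (2.0, "https://other.example/"),
]
lookups = debounce(typed, wait=0.5)
```

With these inputs only the completed URL and the later paste would reach /onebox; the two half-typed URLs never trigger a lookup.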


(Eric Vantillard) #4

And would it be possible to create a plugin to enable oEmbed discovery?

Like this plugin for WordPress: Enable oEmbed Discovery (WordPress Plugins)

See oEmbed.
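As background, oEmbed discovery works by having the page advertise its oEmbed endpoint in a `<link rel="alternate">` element whose type is `application/json+oembed` or `text/xml+oembed`. A minimal sketch of the consumer side, using only the Python standard library, might look like the following; the class and function names are made up for illustration and this is not how the onebox gem does it.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Link types defined by the oEmbed discovery mechanism.
OEMBED_TYPES = {"application/json+oembed", "text/xml+oembed"}


class OEmbedLinkFinder(HTMLParser):
    """Collects oEmbed endpoints advertised via <link rel="alternate">."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.endpoints = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = {name: (value or "") for name, value in attrs}
        rel = a.get("rel", "").lower().split()
        link_type = a.get("type", "").lower()
        if "alternate" in rel and link_type in OEMBED_TYPES and a.get("href"):
            # Resolve relative hrefs against the page URL.
            self.endpoints.append((link_type, urljoin(self.base_url, a["href"])))


def discover_oembed(html, base_url):
    """Return a list of (type, endpoint_url) pairs found in the page."""
    finder = OEmbedLinkFinder(base_url)
    finder.feed(html)
    finder.close()
    return finder.endpoints
```

A consumer would then fetch the discovered endpoint and render the response, which is the step where the whitelist or blacklist question comes in.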


(Sam Saffron) #5

I just hit this exact issue again today; this is something I would also like to build.

In particular for internal Discourse instances there is zero reason to blacklist any sites. In general, I am not really sure what the value is of the whitelist approach. Facebook and Twitter seem to be coping just fine with a blacklist approach.

I guess the big question is:

What kind of abuse does the whitelist-only onebox approach prevent? Honestly I am struggling really hard to think of any.

@eviltrout? @codinghorror?


(Jeff Atwood) #6

The idea was to be safe by default but relax it over time, as Open Graph and oEmbed become more common on the web. I would still prefer to see mini oneboxer implemented before we do that though.


(Sam Saffron) #7

That is what I am struggling with: what is “insecure” about opening the Open Graph floodgates? I just can’t think of anything really. We already download images from arbitrary sources in our pipeline, which is far more risky.

Not at all against mini-onebox, in fact this would tie in to mini-onebox work. Just struggling real hard to figure out what we are protecting against here.


(Robin Ward) #9

For the record I always thought it should be blacklist, not whitelist, but I also had no problem erring on the side of security. In early discussions I believe @codinghorror insisted it should be whitelist.


(Kane York) #10

Yeah, I think only oEmbed needs whitelisting, and even then most oEmbed responses are unusable due to <script>s.


(Eugen Rochko) #11

Please change this from whitelist to blacklist. This is hurting the decentralized web. With the rise of Mastodon as an alternative federated social media platform I believe this becomes more important than ever. You can’t whitelist every potential Mastodon URL out there - there are over 2000 different servers. All support oEmbed via discovery, since it’s an open standard specifically for this use case. But Discourse won’t work with any of them because of this choice.

Last time I checked, Discourse sanitizes oEmbed HTML and strips out any scripts, allowing only pure iframes and other safe elements. So I do not see security risk as the issue here.
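To illustrate the kind of sanitizing described above, here is a rough sketch that drops `<script>` and `<style>` together with their contents and keeps only an allowlist of tags and attributes. It is not Discourse's actual sanitizer, the allowed tags and attributes are assumptions chosen for the example, and it is nowhere near production-grade.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Assumed allowlists for this example only.
ALLOWED_TAGS = {"iframe", "blockquote", "p", "a"}
ALLOWED_ATTRS = {"src", "href", "width", "height", "title"}
URL_ATTRS = {"src", "href"}
DROP_WITH_CONTENT = {"script", "style"}


def _is_http_url(value):
    try:
        return urlsplit(value.strip()).scheme in ("http", "https")
    except ValueError:
        return False


class AllowlistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return
        kept = []
        for name, value in attrs:
            value = value or ""
            if name not in ALLOWED_ATTRS:
                continue  # drops on* handlers, style, srcdoc, ...
            if name in URL_ATTRS and not _is_http_url(value):
                continue  # drops javascript:, data:, relative URLs
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html):
    parser = AllowlistSanitizer()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

The point of the allowlist design is that anything not explicitly permitted is dropped, so an embed that depends on its own script simply renders without it.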

@codinghorror @eviltrout


(Jeff Atwood) #12

This was changed some time ago and is already how it works today. Did you try it yourself?


(Eugen Rochko) #13

Is it behind a setting? I just tried it on my hosted Discourse and it didn’t work.

Excuse me while I try it right here:

Hm, nope, doesn’t work - it displays the Open Graph tags instead. Before you ask - I did test the oEmbed discovery implementation on Mastodon’s side using other oEmbed tools.


(Jeff Atwood) #14

Looks correct to me. I see zero issues.


(Kane York) #15

They’re saying that the oEmbed format would be preferred. oEmbed is still on an (extremely strict) whitelist, because a lot of oEmbed output is broken, so Discourse preferring it by default would result in a lot of broken oneboxes.