Cache onebox images and/or serve them from the main domain

Due to European privacy regulations, it’s best to avoid requests to third-party domains initiated by a website you host. It’s hard to achieve compliance with Discourse, as the Onebox feature makes the browser fetch the thumbnail from the original website, making it a third-party request. See the oneboxed article below:

You can see in the devtools that the image is downloaded from a third party website. As the article also points out, it’s a GDPR issue even if the request has no cookies.
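The devtools check can also be scripted. Below is a minimal sketch, not anything Discourse ships: it scans an HTML fragment (for example the rendered HTML of a post) and lists image hosts that are not your own. The function name `third_party_image_hosts` and the host `forum.example.com` are assumptions for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ImageHostCollector(HTMLParser):
    """Collects the hostnames of all <img src=...> URLs in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src") or ""
        host = urlparse(src).hostname
        if host:  # relative URLs have no hostname, so they are first-party
            self.hosts.add(host.lower())


def third_party_image_hosts(html, first_party_hosts):
    """Return the sorted image hosts in `html` not listed in `first_party_hosts`."""
    collector = ImageHostCollector()
    collector.feed(html)
    collector.close()
    allowed = {h.lower() for h in first_party_hosts}
    return sorted(collector.hosts - allowed)


if __name__ == "__main__":
    # Hypothetical post HTML: one hotlinked thumbnail, one local upload.
    sample = (
        '<p><img src="https://news.example.org/thumb.png">'
        '<img src="/uploads/a.png"></p>'
    )
    print(third_party_image_hosts(sample, {"forum.example.com"}))
```

An empty result would mean every image in the fragment is served from your own hosts or from relative URLs.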

Currently I have onebox disabled for my instance of Discourse, but I’d love to bring it back in a way that respects privacy more strictly and doesn’t open the door to a fine.

The images could be served from the main domain, or from a custom domain like cdn.mydomain.com.

You can’t save third-party images on your own setup because of copyright, which is really hard with images (and that makes one feature of Discourse quite questionable, but that is another story). But oneboxing isn’t against GDPR and never has been. To be on the safe side, it is wise to state that third parties are involved via links and that following GDPR is their responsibility. But you don’t need to do that either.

2 Likes

Meanwhile Discourse has already silently grabbed the image used in the onebox here:

4 Likes

Oneboxing itself isn’t, but serving third-party resources clearly is; see the Google Fonts case above.

Maybe we could instead proxy them through the discourse server, so the third parties would only see the IP of the Discourse server, not the user’s?
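To make the proxy idea concrete, here is a rough sketch of how image URLs could be rewritten to point at a proxy on your own domain, with an HMAC signature so the proxy only fetches URLs the forum itself generated. This is an assumption about how such a proxy might work, not a description of Discourse; the `/proxy` path, the function names, and the secret are all hypothetical.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse


def _sign(url, secret):
    """Hex HMAC-SHA256 of the URL, keyed with a secret shared with the proxy."""
    return hmac.new(secret, url.encode("utf-8"), hashlib.sha256).hexdigest()


def proxied_image_url(original_url, proxy_base, secret):
    """Build a first-party URL that asks the proxy to fetch `original_url`."""
    query = urlencode({"url": original_url, "sig": _sign(original_url, secret)})
    return f"{proxy_base}?{query}"


def verify_proxy_request(query_string, secret):
    """Return the upstream URL if the signature is valid, otherwise None."""
    params = parse_qs(query_string)
    url = params.get("url", [""])[0]
    sig = params.get("sig", [""])[0]
    expected = _sign(url, secret)
    # Constant-time comparison; bytes avoid errors on non-ASCII input.
    if not hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8")):
        return None
    if urlparse(url).scheme not in ("http", "https"):
        return None
    return url


if __name__ == "__main__":
    secret = b"replace-with-a-real-secret"  # hypothetical shared secret
    link = proxied_image_url(
        "https://news.example.org/thumb.png",
        "https://cdn.mydomain.com/proxy",
        secret,
    )
    print(link)
    print(verify_proxy_request(urlparse(link).query, secret))
```

With something like this, the browser only talks to your own domain and the third party sees the server’s IP. The signature matters because an unsigned proxy would fetch any URL anyone hands it.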

Huh, that’s interesting! Is it some kind of plugin or a setting that I could enable on my instance?

1 Like

Oneboxing isn’t serving a third-party service. And GDPR allows using Google Fonts; you must tell users about it, and Google follows GDPR too.

But this isn’t a GDPR matter at all. Iframes are.

It’s not so much about “telling” the users as it is about having a legal basis, like Consent or Legitimate Interest. With consent, you have to offer a convenient way to refuse, and then the resource isn’t loaded. With Legitimate Interest, you have to conduct a Legitimate Interest Assessment, and I (and, I imagine, many other Discourse host administrators) would rather not do that and instead either disable oneboxing altogether or hide the user’s IP behind a proxy.

It is exactly about telling. You need consent when collecting, using and storing the personal data of a user. And even then you must tell what you do, why you do it and for how long.

Oneboxing isn’t part of that, but you must disclose it if there is a third party involved. And oneboxes are just fancy links.

3 Likes

What I meant is: disclosing the fact that third-party requests happen isn’t enough. Under GDPR, you have to provide:

  • the purpose of such data processing
  • the legal basis of such processing

We cannot use consent, because Discourse does not allow showing external images in oneboxes only to users who gave consent for that. And Legitimate Interest still requires engaging a lawyer to properly conduct the balancing test. It would be so much easier if we just didn’t have to do that because the software didn’t make the browser send requests to third parties (which we cannot even list in a privacy policy, or point to the privacy policies of, because it can be literally anyone on the Internet).

As seen by GDPR, they are not just links, because they are visited by the browser without the user’s action.

(BTW I’m not pulling this out of thin air. I work with GDPR compliance in my professional life and have practical experience with how complaints and fines work in the EU.)

Legal discussions aside, there are users who block third party resources specifically for the purpose of not revealing their IP and browsing habits to third parties. It would be nice if oneboxes worked for them :heart:

Legal considerations aside, I think it would be useful to do something in this area.
However, I tend to say that the solution is rather asking the user for a specific opt-in to display external media. A number of German websites I visit have adopted the practice of specifically asking users for this (example below).
I think on top of this, having static image previews served from the Discourse host directly would be nice, but tend to say the actual opt-in is the more important step.
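As a rough illustration of the opt-in idea (only a sketch, since as far as this thread goes Discourse has no such option), the rendering decision could look like the following. The function names, the `has_consent` flag, and the placeholder markup are all made up for the example.

```python
from html import escape
from urllib.parse import urlparse


def is_third_party(url, site_host):
    """True if `url` points at a host other than `site_host`."""
    host = urlparse(url).hostname
    # Relative URLs have no hostname and are treated as first-party.
    return host is not None and host.lower() != site_host.lower()


def render_image(url, site_host, has_consent):
    """Render an <img> tag, or a placeholder for external media without opt-in."""
    if has_consent or not is_third_party(url, site_host):
        return f'<img src="{escape(url, quote=True)}">'
    host = urlparse(url).hostname
    return (
        '<div class="external-media-placeholder">'
        f"External image from {escape(host)} is hidden until you opt in."
        "</div>"
    )


if __name__ == "__main__":
    # Hypothetical forum host and external thumbnail.
    print(render_image("https://news.example.org/t.png", "forum.example.com", False))
    print(render_image("https://news.example.org/t.png", "forum.example.com", True))
```

The point of the placeholder branch is that no request leaves the browser until the user has opted in, which is what the Golem.de example below does for embedded media.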

Example (Golem.de, German IT news website):

Article: Minecraft-Version des Minecraft-Filmtrailers erobert Herzen

Without external media opt-in:

With external media opt-in:

2 Likes

I am curious about this. I can understand the Google Fonts issue, as it requires actually fetching the font from Google’s server for each user.

However, in my experience with a “Onebox”, the link’s preview remains static unless you rebuild the HTML. What I mean by that is that the onebox keeps the same preview info even if the source changes the info on the page.

This has been handy when a site has drastically changed the page info. My onebox info preview remained unchanged without rebuilding the post via the wrench.

I had some old oneboxes remain intact even though the original linked site no longer even held the content linked.

Iframes, on the other hand, I believe update in the manner you identified, as @Jagster mentions: more of a live preview fetched per user on viewing, as happens with Google Fonts.

At least that is my understanding. I am guessing you would otherwise need to disable embeds as well?


I imagine a plugin could be made to add a feature like the one below.

I can see why my German friend here in Canada says it’s best to block the EU.

As said by @Firepup650 above, this is not the case at all, as Discourse automatically downloads and serves Oneboxed images, which is even a default setting.

You can even make this more strict by toggling block hotlinked media.

4 Likes

You don’t need consent for that.

GDPR doesn’t see it that way. And users must take an action to visit the site.

1 Like

The issue was Google’s, which didn’t follow GDPR. That was part of the case in which Google just got quite a nice fine from the EU. But it was never an issue for the sites that used Google Fonts, AdSense, etc. That’s why Google was on trial, not those who offered those services.

1 Like

I understand the frustration - many things the EU does are well-intended, but not necessarily well-implemented and often complicate things for developers.

In this specific case I do like what people implemented, because it gives a simple, but meaningful choice to the user.
If this can be done in a plug-in, all the better. I assumed it would need to be done in core because it runs across all oneboxes.

The reasons, real or imaginary, are starting to be off topic. But those regulations are pain points in the USA. And those regulations are needed because US giga companies didn’t respect users’ privacy at all, because there is no regulation in the USA. Or should I now say… because devs’ lives aren’t so complicated :rofl:

So, don’t thank the EU. Send thank you cards to Google, Amazon, Microsoft, X, Apple, etc. And at the same time, you can think about what is common among them.

BTW, I got a message that I should not let anyone under 13 years old into my forum. I will surely let them in, even from California, if they can read Finnish :stuck_out_tongue_winking_eye:

Over and out (why isn’t there a mic drop emoji…)

2 Likes

Not arguing with the need for regulation - all I’m saying is that the step from “we and our 999 trusted partners will steal your data” to a large popup that says “click here to agree that we and our 999 trusted partners will steal your data” sometimes feels like a pyrrhic victory, and I can understand how it makes some people scratch their heads :slightly_smiling_face:

But coming back to the main topic:
@kuba-orlik, if this works for you, then I would suggest rewording the feature request as “Provide option to require user consent before displaying external media” (I would vote for it, then).
If you feel the current heading suits your request better, that’s also fine. In that case, I would probably open another one myself.

Well, it goes to crazy extremes. Unless it has changed or was distorted, I recall reading that a company may need to honor a warranty on, say, hardware that the end user damaged by dismantling it and attempting a fix.

The bigger issue is that it has gone to extremes to protect people from sheer stupidity.

From my understanding, a plugin could indeed accomplish this, as it modifies elements of core on the server side, whereas a theme component modifies things on the client/browser side. Tampermonkey scripts are kind of like themes/components installed on the browser side. I used to use a Tampermonkey script on an old browser game called Fallen Sword to modify how it orders items in the market.


Interesting side note: I didn’t receive notifications for this topic.

I am really not following the feature request here given we already have

block hotlinked media

And

download_remote_images_to_local

So if you want this behavior … set it to true and false.

Is the feature request for us to build a first class HTTP proxy into the Discourse product?

3 Likes

I didn’t know about these two settings! They seem to do the job just fine for now. Will test them out further, thanks!

1 Like