Download images for oneboxes as well, if download images is set


(Jeff Atwood) #7

I assure you no average person cares or knows about https mixed content warnings, which are very subtle. Whereas the page bouncing around like an earthquake just hit is irritating and plainly visible to all users.


(Sam Saffron) #8

Well, we need to download the images to get the proper sizes, so this is part of the problem


(Jeff Atwood) #9

Not really, in the case of github the avatars are all fixed size anyway. Just specify a fixed size in the onebox.


(Régis Hanol) #10

That is now done :wink:

https://github.com/discourse/onebox/commit/aea7c578340f82eb9da8db70b211ec32e7b47276


(Sckott) #11

:+1: for this features, based on my related question Don't load http onebox images when using https


(mountain) #12

That is not wise. At all.

Part of the SSL ‘industry’ includes educating consumers about the look of the padlock. Some corporations and business entities will go the extra length to get the certificate that includes their name.

Google is also using HTTPS as a deciding factor for page ranking.

So says one Google employee via Hacker News:

That Google actually checks fully for SSL errors now is irrelevant. What is relevant is someday they may do just that and more to make sure SSL certs are not purchased just for pagerank. They are already checking for mixed context in some form.

With all of these factors, you are only setting yourself up for a massive headache when Discourse-backed sites are penalized because of OneBox use.

You cannot bank on every single person that connects to a Discourse instance to not notice the broken padlock. I have noticed it. I know it also relates to OneBoxed youtube videos because I tested here on Meta.

A broken padlock = zero integrity. For an online business that means everything. Especially if they purchased the SSL just to have the padlock in the first place.

I implore you to reconsider. If not for everyone else then at least to mitigate. One ounce of prevention is worth a pound of cure.


(Jeff Atwood) #13

The browsers can’t tell when things are built and downloaded via JavaScript DOM changes versus initial http page loads.

e.g. when I go from “page” to “page” here there is an incorrect yellow padlock mixed content warning on this page which is not correct. It reflects previous page state. If you hard refresh this “page”, it shows green padlock.

Basically the padlock state only reflects the “page” that was initially loaded… navigating to a different “page” (via JavaScript, not an actual page load), does not reset the state of the padlock indicator…


(mountain) #14

Even then that still does not change the fact we all have the ability to use our peripheral vision to notice the padlock is broken.

EDIT: Also, just because pageviews are generated for a user on the fly does not mean the static html will not have an error. Bots read html and that html will have the mixed content as well.

I did a security run to fact check your reply.

If this site detected it despite the dynamic per-user nature of the front end, then google will too.

Again, these errors will possibly penalize a site in some way. If not now, the future is uncertain.

My initial statement still stands.


(Mittineague) #15

I guess the solution here is to check if the Discourse forum is http or https and if not https don’t allow oneboxing http content. (with a “sorry” message would be best)


(mountain) #16

If local-caching is not available right now for this, I think you’re right. Or perhaps disable all media and allow text-only.

As for an admin-user who has SSL but insists on disabling local caching for all images (OneBox or otherwise), that is out of purview.

EDIT: If Jeff is saying all OneBox content is written on the fly per pageload, then what is actually shown for clients that do not have javascript set?


(Kane York) #17

I just noticed an easy solution for part of this. Imgur supports https, and it’s big enough that rewriting all i.imgur.com links to be https is probably worth the effort.


(mountain) #18

Yes, agreed. That is in relation to the idea of supporting any other OneBox sites that have https available. I know there’s also a few places that list all official https-carrying sites so browsers know to auto-ask for https first. Hooking into that may help if a user links media from a https site yet accidentally used http by mistake for whatever reason.


(Allen - Watchman Monitoring) #19

I love the idea of rewriting known URLs to https… but it seems to me that importing all the myriad of images, scripts, etc, which might be referenced in a onebox’d URL is a lot of stuff being imported.


(Erik Swanson) #20

Would it be reasonable/doable to add a knob that disables oneboxing in all cases except where it can be guaranteed to not cause a mixed-content warning?


(Gerhard Schlager) #21

The mixed content warnings were bugging me, so I looked at the code in order to fix this. Unfortunately there’s more than one reason why images within oneboxes aren’t downloaded. :frowning:

I’ll at least try to document my findings:

  1. As @zogstrip mentioned images inside oneboxes should be ignored. But they aren’t since the doc.css(".onebox-result img") doesn’t do anything. Is the onebox-result class still used? I saw only onebox and onebox-body when I included Github and whitelisted oneboxes.

  2. Discourse downloads all the images embedded in oneboxes, but many of the embedded images have no recognizable file extension.:

  • Gravatar used at Stackoverflow: https://www.gravatar.com/avatar/3dcae8378d46c244172a115c28ca49ce?s=128&d=identicon&r=PG
  • Avatar used at Github: https://avatars.githubusercontent.com/u/357?v=3
  • Discourse looks only at the file extension and since it doesn’t find a known image extension, it stores it as simple upload, not as image.
  1. If we are lucky and the image embedded in the onebox has a valid image file extension, we run into the last problem. Image URLs are only replaced within the post’s raw text, but the onebox images aren’t there. They are stored in cooked.

Links to Amazon cause a TLS mixed content warning
(Dan Dascalescu) #22

I have a similar problem in this post. It links to

which unfortunately redirects to the http:// version (they’re sorry for destroying the padlock on this Meta page…). The image is, though, available via https:

Would it be possible for the Onebox code to check if the image is also available via HTTPS, and if so, change the URL to HTTPS? No local downloading necessary.


(Ryan Mulligan) #23

Another possible solution would be to add a feature to GitHub - discourse/onebox: A gem for turning URLs into website previews to have a second whitelist like onebox/whitelisted_generic_onebox.rb at 34cd201b72b3c6fc8058267796424f245bda0f45 · discourse/onebox · GitHub that supports only sites that have HTTPS properly configured.

Would a PR for that feature be welcome?


(Pat David) #24

I notice that this still hasn’t been addressed (at least as of v1.7.0.beta7 +27).

Are there still any plans for local-caching assets that may be called by a onebox?

Right now the behavior is fine while browsing the topics until you stumble across a post that has mixed-content due to a onebox. The bigger problem is that once you’ve hit that post all subsequent browsing in that session will cease to show the green padlock.


(Ryan Mulligan) #25

That’s interesting that the tab maintains the mixed content warning!

I think there are actually plans to make this worse, because

:sparkles: New in Discourse 1.7 beta! All links will now be oneboxed and prettified, if you paste them on a line by themselves:

I asked if there were any plans to fix it:

any corresponding improvements to the mixed content SSL warnings caused by oneboxing?

and the response was

unfortunately we do not control all websites on the Internet :wink:


(Sam Saffron) #26

We should proxy all one box content locally where possible, I totally support a PR to improve this. Further more I support a PR for a site setting that disables unsafe embeds (it should be default on)