PdfOnebox (part of onebox) gives RuntimeError when pdf is hosted on poorly configured SSL


(Tim Diggins) #1

Given:

If you have a PDF hosted on a non-SSL (plain http) site, and
the SSL version of the site gives an SSL error (e.g. err_ssl_unrecognized_name_alert), and
you copy your link into a onebox-enabled site (e.g. Discourse, I think, but I was using thredded: see thredded/thredded#682; I’ve also validated that this occurs in the current master of onebox with rake server).

I would expect it to either retrieve the pdf correctly and onebox it, or
attempt to retrieve it by https and fail and then just present a non-oneboxed link, or
(worst case) just show a onebox failure

but instead:

it raises a RuntimeError (which the webapp using onebox may or may not catch).


So I think the conversion of http -> https is by design (always_https in onebox/pdf_onebox.rb at 6b5b53b26e7bb36b9e3e6cedce5db9172692c851, discourse/onebox on GitHub), to prevent mixed content errors.
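To make the http -> https conversion concrete, here is a minimal sketch of what such a scheme upgrade amounts to. It is not the actual onebox code; `force_https` is a hypothetical helper name used only for illustration.

```ruby
# Hypothetical helper (not part of onebox): upgrade an http:// URL to
# https:// and leave every other URL untouched.
def force_https(url)
  url.sub(%r{\Ahttp://}i, "https://")
end

puts force_https("http://example.com/docs/report.pdf")
```

The point is that the upgrade happens unconditionally: nothing checks whether the host actually serves a working https endpoint before the rewritten URL is fetched.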

Traceback: “traceback for oneboxing failure of https error” (on GitHub).

However, I then think onebox ought to deal with the https failure itself.
I’m not 100% sure how to proceed (I guess it’s possible to create a test case, it’s just a bit complicated).
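One way the second expected behaviour (try https, fail, present a plain link) could look is sketched below. This is an assumption about a possible fix, not existing onebox behaviour: `pdf_onebox_with_fallback` and the `fetch` callable are hypothetical, and `fetch` is assumed to raise `OpenSSL::SSL::SSLError` when the TLS handshake fails.

```ruby
require "erb"
require "openssl"

# Hypothetical fallback (not existing onebox behaviour).
# `fetch` is any callable that retrieves the https URL and returns onebox
# HTML; it is assumed to raise OpenSSL::SSL::SSLError on a failed handshake.
def pdf_onebox_with_fallback(original_url, fetch:)
  https_url = original_url.sub(%r{\Ahttp://}i, "https://")
  fetch.call(https_url)
rescue OpenSSL::SSL::SSLError
  # Fall back to a plain, non-oneboxed link to the URL the user posted.
  escaped = ERB::Util.html_escape(original_url)
  %(<a href="#{escaped}">#{escaped}</a>)
end
```

Passing the fetcher in as a callable keeps the sketch testable without a server that has broken https, which is the tricky part of writing a real test case.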


(Jeff Atwood) #2

This seems like such a rare case, and not our bug since the problem is with busted https?


(Tim Diggins) #3

Firstly, it’s not busted https to the end user, who expects to be able to post a link to an insecure but working http address. PdfOnebox just changes the link to https without, as it were, consulting anyone.

I think it might happen for any pdf link that is being served by a server that is not intended to serve on https but does respond on 443.

But the real problem is that it brings down the web page using onebox. I think (but haven’t tried) I could post a link which would cause a 500 on a topic (here on meta.discourse.org). (I have validated this with other web apps using onebox)

I will try to create a PR if I have time. (The only trickiness is simulating the broken https)


(Tim Diggins) #4

I’ve had a go at testing for it, but in my tests a 404 with a valid cert (e.g. https://meta.discourse.org/nonexistant.pdf) gives the same result as an SSL negotiation error (e.g. https://expired.badssl.com/some.pdf). However, with onebox mounted in an app, I’m getting a normal onebox for a 404 and a RuntimeError for the SSL negotiation error, whereas in onebox’s rake server both give the same error. Bit of a mystery.

Anyway, the answer for me may be to just defensively guard against any runtime errors that onebox lets percolate up (Protect against onebox errors by timdiggins, Pull Request #683, thredded/thredded on GitHub). I haven’t looked, but maybe that’s what Discourse does too (the bad URLs don’t break this Discourse’s preview, but I don’t want to put them on a line of their own just in case).
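A defensive guard of that kind might look roughly like this in the host app. It is only a sketch and not the code from the pull request: `safe_onebox` and the `renderer` callable are hypothetical names, and the rescue is deliberately broad (`StandardError` includes `RuntimeError`).

```ruby
# Hypothetical guard for the app that embeds onebox (not the PR's code).
# `renderer` is any callable that turns a URL into onebox HTML.
def safe_onebox(url, renderer:)
  renderer.call(url)
rescue StandardError => e
  # Log and give up on oneboxing; the caller can render a plain link instead.
  warn "onebox failed for #{url}: #{e.class}: #{e.message}"
  nil
end
```

In an app that uses the onebox gem, the renderer would presumably be something like `->(u) { Onebox.preview(u).to_s }`; a `nil` result then means “leave the link as it is” rather than returning a 500 for the whole page.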