Allow list of internal hosts for scanning

In the latest release of Discourse we added extra protection against SSRF attacks. This new code ensures that links are only crawled if they do not resolve to private network addresses, so if your DNS server replies with an internal address for a host, that host won't be crawled.
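Conceptually (this is a minimal sketch, not Discourse's actual implementation), the check resembles something like this: resolve the link's host to an address, then refuse to crawl anything that lands in a private or loopback range.

```ruby
require "ipaddr"

# Sketch of a private-network check; the range list and method name
# are illustrative, not Discourse's real code.
PRIVATE_RANGES = %w[
  127.0.0.0/8
  10.0.0.0/8
  172.16.0.0/12
  192.168.0.0/16
  169.254.0.0/16
  ::1/128
  fc00::/7
].map { |cidr| IPAddr.new(cidr) }

def private_address?(address)
  ip = IPAddr.new(address)
  # Only compare ranges of the same family as the resolved address.
  PRIVATE_RANGES.any? { |range| range.ipv4? == ip.ipv4? && range.include?(ip) }
end

private_address?("192.168.1.5")   # => true, would not be crawled
private_address?("93.184.216.34") # => false, crawlable
```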

However, this is not always ideal. For example, if you are running a couple of Discourse instances on a private network, they wouldn't be able to onebox each other or crawl each other's links to fetch topic titles and such.

To fix this, I’ve added a new site setting to whitelist internal hosts for link crawling and oneboxing.

Simply add your hosts to the allowed internal hosts site setting and they will be crawled even if they are internal.
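For instance, assuming the setting takes one bare hostname per entry (the host names below are made up), it would look like:

```text
# Admin → Site Settings → allowed internal hosts
# one bare hostname per entry
wiki.internal.example
discourse2.internal.example
```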

You should be absolutely sure these hosts are safe to crawl before you do this. We won't crawl on any ports except 443 and 80, but if you are running other web services on the same host, it's possible an attacker could craft a onebox or link-crawling request that hits those services and changes data.


Note that this is only possible if the HTTP onebox request triggers some kind of action on GET or HEAD. That is, if you have a page on your intranet like this …

… where simply visiting that URL (issuing an HTTP GET to it) would… delete all your data. Correct @eviltrout?
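As a hedged sketch of that anti-pattern (every name and route here is invented): an intranet app whose GET routes mutate data is exactly the kind of thing a crawler can damage by merely visiting a URL.

```ruby
# Toy route table simulating a legacy intranet app. A well-behaved app
# would only mutate state on POST/PUT; this one mutates on GET.
DATABASE = { "reports" => %w[q1 q2] }

ROUTES = {
  ["GET", "/reports"]          => -> { DATABASE["reports"] },  # safe: read-only
  ["GET", "/admin/delete_all"] => -> { DATABASE.clear },       # unsafe: GET mutates state
}

def handle(method, path)
  handler = ROUTES[[method, path]]
  handler ? handler.call : :not_found
end
```

A link crawler that innocently issues `handle("GET", "/admin/delete_all")` wipes the data, even though it never sent a POST.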


There is also a (theoretical) information disclosure risk: if a whitelisted host returns a page that will onebox, and the onebox contains sensitive data, that data will be leaked. (This is pretty unlikely, because most sensitive internal sites probably won't onebox at all, or at least not with sensitive data in the onebox itself.)

It’s exceedingly unlikely that an internal page would onebox, don’t you think?

Yeah, it is. But not impossible, and I refuse to not celebrate that Discourse has taken precautions against a possible issue just because it’s not very likely :slight_smile:
And, albeit unlikely, I think it's fair to warn sysadmins about the risk before they whitelist a host.

Yes, onebox will never POST or PUT data, so if your internal app is built properly and does CSRF protection and all that, you are good. It's more of a danger for those legacy PHP apps where GETs mutate data.

I hope those are rare, but I could also understand a company saying "hey, we know this 15-year-old app sucks, but it only runs on our internal network, so who cares."


This is a nice security precaution, but it would be great if the Rails production log had some more debug text about why oneboxes are failing: something like "host X is on a private network but not whitelisted" or "OpenGraph meta tags missing", and so on.

I've been scratching my head all day about why internal oneboxes didn't work, until I found this explanation of the whitelisting setting. It wasn't immediately obvious to me that the internal host whitelist requires just the hostnames, without any http:// prefix or URL path around them.
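In other words, entries should be bare hostnames. If you only have full URLs on hand, Ruby's standard library can pull out the part the setting wants (the URL below is invented):

```ruby
require "uri"

# Extract the bare hostname from a full internal URL; the whitelist
# wants only the host, not the scheme, port, or path.
URI.parse("http://wiki.internal.example:3000/some/page").host
# => "wiki.internal.example"
```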

At least I learned something new about HTTP teapots after searching for that generic 418 code in the production.log :laughing: