When I do, I get the following in Chrome’s dev tools console:
https://www.myforumdomain.com/onebox?url=http%3A%2F%2Fwww.amazon.com%2FThe-Advantage-Organizational-Everything-Business%2Fdp%2F1491510803d&refresh=false
Failed to load resource: the server responded with a status of 404 ()
When I paste it here, it works… see?
I see this other topic referencing a bug fixed before, but that was a while ago:
I’ve just tested this on a live Discourse instance, one that is running a little behind latest v1.5.0.beta12 +135 - I will update and test again when I get a chance… but…
Although I’ve white-listed the domain www.amazon.co.uk and restarted (not rebuilt) Discourse, I see the same results that @mcwumbly mentions in Chrome Dev Tools.
I’ve verified that I can connect to the URL:
root@forum:~# curl http://www.amazon.co.uk/Belkin-BSV103-SurgeCube-Protector-Charging/dp/B00P2GW7MG -o deleteme.html --verbose
* Hostname was NOT found in DNS cache
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying 176.32.108.186...
* Connected to www.amazon.co.uk (176.32.108.186) port 80 (#0)
> GET /Belkin-BSV103-SurgeCube-Protector-Charging/dp/B00P2GW7MG HTTP/1.1
> User-Agent: curl/7.35.0
> Host: www.amazon.co.uk
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Sun, 27 Mar 2016 18:07:47 GMT
* Server Server is not blacklisted
< Server: Server
< pragma: no-cache
< x-amz-id-1: 1QKR29Z3J28QM1BPTS6V
< p3p: policyref="https://www.amazon.co.uk/w3c/p3p.xml",CP="CAO DSP LAW CUR ADM IVAo IVDo CONo OTPo OUR DELi PUBi OTRi BUS PHY ONL UNI PUR FIN COM NAV INT DEM CNT STA HEA PRE LOC GOV OTC "
< cache-control: no-cache, no-transform
< x-frame-options: SAMEORIGIN
< expires: -1
< x-ua-compatible: IE=edge
< Vary: Accept-Encoding,User-Agent
< Content-Type: text/html; charset=UTF-8
< Transfer-Encoding: chunked
<
{ [data not shown]
100 439k 0 439k 0 0 496k 0 --:--:-- --:--:-- --:--:-- 496k
* Connection #0 to host www.amazon.co.uk left intact
root@forum:~#
From the DO box itself I can curl this URL fine: http://www.amazon.com/The-Advantage-Organizational-Everything-Business/dp/1491510803
If I curl from within the docker container, though, I get a 503:
* Connection #0 to host www.google.com left intact
root@host:/# curl http://www.amazon.com/The-Advantage-Organizational-Everything-Business/dp/1491510803 -v -o deleteme
* Hostname was NOT found in DNS cache
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying 54.239.17.6...
* Connected to www.amazon.com (54.239.17.6) port 80 (#0)
> GET /The-Advantage-Organizational-Everything-Business/dp/1491510803 HTTP/1.1
> User-Agent: curl/7.35.0
> Host: www.amazon.com
> Accept: */*
>
< HTTP/1.1 503 Service Unavailable
< Date: Sun, 27 Mar 2016 19:41:45 GMT
* Server Server is not blacklisted
< Server: Server
< Vary: Cookie,Accept-Encoding,User-Agent
< Last-Modified: Thu, 05 Nov 2015 22:42:39 GMT
< ETag: "562-523d2d81655c0"
< Accept-Ranges: bytes
< Content-Length: 1378
< Cneonction: close
< Content-Type: text/html
The response body contains this at the top:
<!--
To discuss automated access to Amazon data please contact api-services-support@amazon.com.
For information about migrating to our APIs refer to our Marketplace APIs at https://developer.amazonservices.com/ref=rm_5_sv, or our Product Advertising API at https://affiliate-program.amazon.com/gp/advertising/api/detail/main.html/ref=rm_5_ac for advertising use cases.
-->
Tracing the code, a more accurate request for testing is:
curl http://www.amazon.com/gp/aw/d/1491510803 -v -o deleteme -H "User-Agent: Mozilla/5.0 (iPhone; CPU iPhone OS 5_0_1 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9A405 Safari/7534.48.3"
For some reason, the attributes on the image tag keep changing. Note how the response alternates between [data-a-hires, data-hires-replacement] and [data-a-dynamic-image, data-midres-replacement].
Why? I’m not sure and I’ve spent too much time trying to figure out why. I did manage to reproduce the different image tag attributes in my browser by loading the same product URL normally and in incognito mode.
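For anyone following along, here is a minimal sketch of one way to cope with the alternating attributes: check each attribute name in turn and take the first one that is present. It is written in Python purely for illustration and is not the onebox code. The names `CANDIDATE_ATTRS`, `FirstImageAttrs` and `find_image_url` are hypothetical, the order of preference is a guess, and treating `data-a-dynamic-image` as a JSON object keyed by image URL is an assumption.

```python
import json
from html.parser import HTMLParser

# Attribute names reported in this thread; the preference order is an assumption.
CANDIDATE_ATTRS = (
    "data-a-hires",
    "data-hires-replacement",
    "data-midres-replacement",
    "data-a-dynamic-image",
)


class FirstImageAttrs(HTMLParser):
    """Remembers the attributes of the first <img> tag in the document."""

    def __init__(self):
        super().__init__()
        self.attrs = None

    def handle_starttag(self, tag, attrs):
        if tag == "img" and self.attrs is None:
            self.attrs = dict(attrs)


def find_image_url(html):
    """Return an image URL from the first usable candidate attribute, else None."""
    parser = FirstImageAttrs()
    parser.feed(html)
    attrs = parser.attrs or {}
    for name in CANDIDATE_ATTRS:
        value = attrs.get(name)
        if not value:
            continue
        if name == "data-a-dynamic-image":
            # Assumption: the value is a JSON object whose keys are image URLs.
            try:
                data = json.loads(value)
            except ValueError:
                continue
            if isinstance(data, dict) and data:
                return next(iter(data))
            continue
        return value
    return None
```

Whichever pair of attributes the page happens to serve, the caller gets back a single URL, or `None` when no candidate attribute is found.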
I’ll wait a day or two before merging my PR below to see if anyone might know why…
Oh, anyway, there was another bug fix here… We’re getting a 404 because our code blows up if it can’t retrieve an image. I fixed that for now, so even if the image is not found, we’ll still have a nice onebox without the image.
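To illustrate the idea behind that fix (not the actual patch, which lives in the onebox code), here is a rough Python sketch of a preview builder that treats the image as optional. The function name `render_onebox` and the markup it produces are hypothetical.

```python
from html import escape


def render_onebox(title, url, image_url=None):
    """Build a small HTML preview; a missing image is skipped instead of raising."""
    parts = ['<aside class="onebox">']
    if image_url:
        # Only emit the <img> tag when an image URL was actually found.
        parts.append('<img src="{}">'.format(escape(image_url, quote=True)))
    parts.append(
        '<a href="{}">{}</a>'.format(escape(url, quote=True), escape(title))
    )
    parts.append("</aside>")
    return "".join(parts)
```

Calling it with `image_url=None` still returns a usable preview with the title and link, which is the behaviour described above.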