Oneboxing blocked by robot check

I am seeing this on a site, and it just started. When Discourse tries to pull the information from the site, it is blocked. This used to work in prior versions.
I have included a link as an example.

Bloomberg - Are you a robot?

2 Likes

It looks like rate limiting by Bloomberg. There likely isn’t much you can do other than infer what the limits are and contrive to stay under them.
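For anyone fetching previews with their own script, staying under an inferred limit could look roughly like the sketch below. It only enforces a minimum gap between requests to the same host. The class name `HostThrottle` and the interval value are made up for illustration; Bloomberg does not publish a limit as far as this thread shows, so the number is a guess you would have to tune.

```python
import time
from urllib.parse import urlsplit


class HostThrottle:
    """Enforce a minimum gap between requests to the same host (hypothetical helper)."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds; a guessed value, not a published limit
        self._clock = clock               # injectable so the logic can be tested without waiting
        self._sleep = sleep
        self._last_request = {}           # host -> time of the most recent request

    def wait(self, url):
        """Block until the host of `url` may be contacted again; return seconds waited."""
        host = urlsplit(url).hostname or ""
        now = self._clock()
        last = self._last_request.get(host)
        delay = 0.0
        if last is not None:
            delay = max(0.0, self.min_interval - (now - last))
        if delay > 0:
            self._sleep(delay)
        # Record when the request is actually allowed to go out.
        self._last_request[host] = now + delay
        return delay
```

Call `throttle.wait(url)` right before each fetch. Requests to different hosts do not delay each other, since the bookkeeping is per hostname.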

2 Likes

What exactly are you trying to onebox here? The URL is quite strange.

2 Likes

A Bloomberg news article. If you click on the link, that's the article.

1 Like

Try "Onebox Assistant", which crawls for those previews reliably!

Works with Bloomberg links iirc.

What’s the original link? The one you’ve pasted above is not an article; I suspect it’s a destination you’ve been redirected to.

2 Likes

https://www.bloomberg.com/opinion/articles/2020-01-29/peer-review-is-science-s-wheel-of-misfortune

This is the link.

2 Likes

I see, here’s the link

http://www.bloomberg.com/opinion/articles/2020-01-29/peer-review-is-science-s-wheel-of-misfortune

Looks like they have pretty aggressive anti-scraping in place, since all we’re doing is checking for metadata headers…

Also, this is yet another example of where we shouldn’t be oneboxing at all, because we have neither an image nor a description. cc @techAPJ @sam … we really gotta backport that change to stable once it goes in next week.
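To illustrate the idea (this is not Discourse's actual onebox code, just a rough Python sketch), the check amounts to reading the page's `<meta>` tags and skipping the onebox when there is neither an image nor a description. The helper names `extract_meta` and `worth_oneboxing` are hypothetical, and the choice of `og:image`, `og:description`, and `description` as the relevant keys is an assumption.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect content values from <meta property=...> and <meta name=...> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        key = attrs.get("property") or attrs.get("name")
        content = attrs.get("content")
        if key and content:
            # Keep the first value seen for each key.
            self.meta.setdefault(key.lower(), content.strip())


def extract_meta(html):
    """Return a dict of meta key -> content for the given HTML string."""
    parser = MetaTagParser()
    parser.feed(html)
    return parser.meta


def worth_oneboxing(meta):
    """True only if there is an image or a description to show in the preview."""
    has_image = bool(meta.get("og:image"))
    has_description = bool(meta.get("og:description") or meta.get("description"))
    return has_image or has_description
```

A page that only carries a `<title>`, like the robot-check page in this topic, yields an empty dict and would be left as a plain link.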

2 Likes

I just tried the link, trimmed so it ends at the .html extension (minus all the trailing characters), using only Firefox, not Discourse Onebox. The extended error message is below. The first link (which has the error message below it) is enclosed in <> here. The second link is not enclosed in <> and shows the page title as displayed.
https://www.bloomberg.com/tosv2.html
Bloomberg - Are you a robot?


We’ve detected unusual activity from your computer network

To continue, please click the box below to let us know you’re not a robot.

Why did this happen?

Please make sure your browser supports JavaScript and cookies and that you are not blocking them from loading. For more information you can review our Terms of Service and Cookie Policy.

Need Help?

For inquiries related to this message please contact our support team and provide the reference ID below.

Block reference ID: 13215fd0-4285-11eb-8faf-b7e9262e99b2
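If you are scripting your own fetches, a rough way to notice that you landed on this interstitial instead of the article is to look at the final URL and the page title. The sketch below relies only on the two signals visible in this topic (the `tosv2.html` path and the "Are you a robot?" title); the function name is hypothetical, and Bloomberg could change either signal at any time.

```python
from urllib.parse import urlsplit


def looks_like_robot_check(final_url, page_title):
    """Heuristic: did the fetch end up on the robot-check page rather than the article?"""
    path = urlsplit(final_url).path.lower()
    title = (page_title or "").lower()
    # Both markers are taken from the page shown in this topic; they are assumptions,
    # not a documented interface.
    return path.endswith("/tosv2.html") or "are you a robot" in title
```

When this returns true, it is probably better to back off and retry later than to build a preview from the block page.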

1 Like