Recent Onebox issues

Our users have been complaining that Onebox has been broken for some sites recently, including the New York Times and the Washington Post. Did Onebox change recently? See the links below. The first one is a gift link.

https://www.nytimes.com/2024/07/10/magazine/food-documentaries-health.html?unlocked_article_code=1.6U0.pZE2.lK4MeFMWelpV&smid=url-share

https://www.nytimes.com/2024/07/10/magazine/food-documentaries-health.html

https://www.washingtonpost.com/wellness/2024/07/16/nonstick-pans-pfas-teflon-flu/?itid=hp_ts-1-sallys-mix_p001_f008

https://www.washingtonpost.com/wellness/2024/07/16/nonstick-pans-pfas-teflon-flu/

1 Like

Has Onebox ever worked with paywalled sites? :thinking:

1 Like

I’ve noticed a number of links posting as plain URLs recently. I figured it was the sites, but now I’m wondering if it’s more than that :man_shrugging:

1 Like

I recently added support for private GitHub oneboxes, but that shouldn’t affect other sites. Generally we cannot onebox paywalled / private sites.

These links don’t onebox for me either, but the error shows in the preview:

https://www.nytimes.com/2024/07/10/magazine/food-documentaries-health.html

https://www.washingtonpost.com/wellness/2024/07/16/nonstick-pans-pfas-teflon-flu/?itid=hp_ts-1-sallys-mix_p001_f008

3 Likes

I have recently noticed some odd behaviour on Stable when I post links from my other Discourse instance (tests-passed): seemingly at random, a link sometimes doesn’t onebox.

I haven’t tried posting links from my Stable forum on the tests-passed forum.

I have tried rebuilding the HTML, with no success getting the link to onebox.

IIRC there is another topic here (on Meta) where I posted a screenshot.

1 Like

To act on this, we need clear examples of links that you think should onebox but didn’t.

2 Likes

This New York Times link doesn’t onebox today:
https://www.nytimes.com/2022/03/10/dining/chowhound-closing.html#commentsContainer

But it was oneboxed in 2022 on our forum. See the first post.

Similarly, this Washington Post link doesn’t onebox today:
https://wapo.st/3J0aTO8

But it was oneboxed in March this year on our forum. See the 33rd post.

2 Likes

It looks like it’s behind a paywall now.

2 Likes

The New York Times and the Washington Post have always been paid publications, though I don’t know if they have done anything recently to change their paywall structure.

If I may make a suggestion: if the paywall is the issue, and one can see the article title and caption on the paywalled page, shouldn’t Onebox be able to capture that information?

3 Likes

Yes, I don’t really disagree with that. I looked at the source of the page and it feels like we have enough info to show something.
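To make "enough info in the page source" concrete, here is a minimal Python sketch of pulling a title and description out of a page's `<head>`. This is not how Onebox itself is implemented; the `MetaTagCollector` class and `preview_fields` function are names made up for illustration, and the assumption is that the paywalled page still ships Open Graph / Twitter Card meta tags or at least a `<title>`.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collects the <title> text and Open Graph / Twitter Card <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self._in_title = False
        self._title_parts = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Open Graph uses property="og:...", others use name="...".
            key = attrs.get("property") or attrs.get("name")
            content = attrs.get("content")
            wanted = key and (key == "description" or key.startswith(("og:", "twitter:")))
            if wanted and content:
                # Keep the first value seen for each key.
                self.meta.setdefault(key, content)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self._title_parts.append(data)

    @property
    def title(self):
        return "".join(self._title_parts).strip()


def preview_fields(html):
    """Return the fields a link preview could show, or None where missing."""
    parser = MetaTagCollector()
    parser.feed(html)
    parser.close()
    meta = parser.meta
    return {
        "title": meta.get("og:title") or meta.get("twitter:title") or parser.title or None,
        "description": (
            meta.get("og:description")
            or meta.get("twitter:description")
            or meta.get("description")
        ),
        "image": meta.get("og:image") or meta.get("twitter:image"),
    }
```

Running `preview_fields` on the HTML your server actually receives for a link (as opposed to what your browser shows) tells you whether a title and description are there to be used at all.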

2 Likes

The New York Times started its paywall in 2011, but it allowed a limited number of free reads without registration or a credit card, five if I recall correctly. At the same time it let Google browse the site. The much newer system blocks access totally, and after fighting with Google they shut down free reading entirely.

Could the oneboxing have broken at the same time?

2 Likes

How can I determine what that is for myself?

I have a site that does not onebox, can I add anything to it so it does?

1 Like

You should read this topic: Configuring and troubleshooting oneboxes

3 Likes

Here is a link to my Stable Discourse forum with links from my other Discourse instance running tests-passed.

Here are pics in that topic (xrtropolis) of a link that is not oneboxed, while later posts onebox fine.

1 Like

Is this something the team would look into adding for Onebox: including available details for paywalled sites?

95% sure the onebox already does that. If there’s enough information to display a onebox, it sure will, even if the content is ultimately paywalled.

What I think happens is that the onebox is getting denylisted by these paywalled websites because of the recent LLM crawlers/agents, so it doesn’t see the same HTML we might see when using a browser.

Though, happy to get proven wrong. If someone wants to have a quick look to see if they can improve it somehow, pr-welcome :wink:
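For anyone who wants to test the denylisting guess above, one rough way is to request the same URL with two different `User-Agent` headers and compare what comes back. The sketch below is only that, a sketch: both User-Agent strings are placeholders invented for illustration (not the header Discourse actually sends), `compare_user_agents` is a made-up helper, and the `og:title` check is a crude substring match.

```python
import urllib.error
import urllib.request

# Hypothetical User-Agent strings, for illustration only. Substitute a real
# browser string and whatever your forum actually sends.
USER_AGENTS = {
    "browser-like": "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "crawler-like": "ExampleForumOnebox/1.0",
}


def fetch(url, user_agent, timeout=10):
    """Return (status_code, body_text) for a GET with the given User-Agent.

    Network failures (DNS errors, timeouts) are not caught and will raise.
    """
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses still carry a status code and usually a body.
        return error.code, error.read().decode("utf-8", errors="replace")


def compare_user_agents(url, user_agents=USER_AGENTS, fetcher=fetch):
    """Fetch `url` once per User-Agent and summarise each response."""
    results = {}
    for label, agent in user_agents.items():
        status, body = fetcher(url, agent)
        results[label] = {
            "status": status,
            # Crude check: does the response mention an Open Graph title?
            "has_og_title": 'property="og:title"' in body,
            "length": len(body),
        }
    return results
```

If the browser-like request returns a full page with an `og:title` while the crawler-like one gets an error status or a much shorter body, that would be consistent with the site serving different HTML per client. It would not prove it, since sites can also key on IP address or request rate.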

1 Like
