Wikipedia oneboxing of articles containing unusual characters in the URL

If you link to Wikipedia by reference, then the link works.

Филиппов, Михаил Михайлович (учёный)

If I paste the address copied from the browser's address bar, I get this instead:

Филиппов — Википедия,%D0%9C%D0%B8%D1%85%D0%B0%D0%B8%D0%BB%D0%9C%D0%B8%D1%85%D0%B0%D0%B9%D0%BB%D0%BE%D0%B2%D0%B8%D1%87_(%D1%83%D1%87%D1%91%D0%BD%D1%8B%D0%B9)

Original address

https://ru.wikipedia.org/wiki/%D0%A4%D0%B8%D0%BB%D0%B8%D0%BF%D0%BF%D0%BE%D0%B2,_%D0%9C%D0%B8%D1%85%D0%B0%D0%B8%D0%BB_%D0%9C%D0%B8%D1%85%D0%B0%D0%B9%D0%BB%D0%BE%D0%B2%D0%B8%D1%87_(%D1%83%D1%87%D1%91%D0%BD%D1%8B%D0%B9)

The error may not be common, but over the last 2 days users have noticed that some Wikipedia articles (Russian ones) are processed in a similar way.

Perhaps because there is a comma in the link?

2 Likes

Same issue with ASCII-only titles containing commas:

I - Wikipedia,Robot(film)

https://en.wikipedia.org/wiki/I,_Robot_(film)

3 Likes

The autolinker avoids certain extreme edge cases by design. Last time I mentioned this to @Vitaly, the general recommendation was to wrap the URL in <.......> for terrible edge cases, which lets you work around this. That does not work with onebox, though.

The current workaround is to replace , with %2c:

https://ru.wikipedia.org/wiki/%D0%A4%D0%B8%D0%BB%D0%B8%D0%BF%D0%BF%D0%BE%D0%B2%2c_%D0%9C%D0%B8%D1%85%D0%B0%D0%B8%D0%BB_%D0%9C%D0%B8%D1%85%D0%B0%D0%B9%D0%BB%D0%BE%D0%B2%D0%B8%D1%87_(%D1%83%D1%87%D1%91%D0%BD%D1%8B%D0%B9)

2 Likes

It is also weird when the Wikipedia URL ends in an exclamation point:

https://en.wikipedia.org/wiki/Top_Secret!

You have to URL encode that as well…

https://en.wikipedia.org/wiki/Top_Secret%21
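For anyone who wants to apply this workaround automatically before pasting, here is a minimal sketch in Python. It only percent-encodes the two characters reported in this thread (comma and exclamation point) and only in the path part of the URL. The function name `escape_for_onebox` and the character list are my own assumptions, not anything Discourse provides, and other characters may need the same treatment.

```python
from urllib.parse import urlsplit, urlunsplit

# Characters reported in this thread, mapped to their percent-encoded forms.
# Assumption: this list is illustrative, not exhaustive.
ESCAPES = {",": "%2C", "!": "%21"}


def escape_for_onebox(url: str) -> str:
    """Percent-encode commas and exclamation points in the URL path only."""
    parts = urlsplit(url)
    path = parts.path
    for char, encoded in ESCAPES.items():
        # Existing %XX sequences are left alone; only the raw characters change.
        path = path.replace(char, encoded)
    return urlunsplit(parts._replace(path=path))


print(escape_for_onebox("https://en.wikipedia.org/wiki/I,_Robot_(film)"))
# https://en.wikipedia.org/wiki/I%2C_Robot_(film)
print(escape_for_onebox("https://en.wikipedia.org/wiki/Top_Secret!"))
# https://en.wikipedia.org/wiki/Top_Secret%21
```

The hex digits in a percent-encoding are case-insensitive, so `%2C` here is equivalent to the `%2c` used earlier in the thread.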

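1 Like
  • linkify-it uses heuristics, so it can never guarantee 100% confidence (even 99.99% is not 100%) => it needs some marker (currently <..>) to force link boundaries.
  • Your onebox also needs some marker to force it on/off.

So: you have two independent processing modes => you need 2 independent markers/flags to define them.

Right now you have a single marker for 2 modes. That is a logical conflict. In my project I solved it as follows:

  1. Allowed the link converter to be applied to autolinks (<...>, this marker).
  2. Added a "disable link expansion" checkbox to the editor options (per post).

It is not an ideal solution, but it is OK for me. Maybe you can invent a better way to add a second marker/flag for your case.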
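To make the "two modes, two markers" idea concrete, here is a rough sketch in Python. Everything in it is hypothetical: `LinkDecision`, `decide`, and the `disable_expansion` flag are names I made up for illustration, and this is not how linkify-it or Discourse is actually implemented. It only shows that the boundary marker and the expansion switch can be read independently of each other.

```python
from typing import NamedTuple


class LinkDecision(NamedTuple):
    url: str
    boundaries_forced: bool  # True when the <...> marker was used
    expand: bool             # True when the link may be expanded (oneboxed)


def decide(line: str, disable_expansion: bool = False) -> LinkDecision:
    """Treat a line holding a single URL using two independent switches.

    Marker 1: wrapping the URL in <...> forces the link boundaries.
    Marker 2: the per-post disable_expansion flag turns expansion off.
    """
    text = line.strip()
    forced = len(text) > 2 and text.startswith("<") and text.endswith(">")
    url = text[1:-1] if forced else text
    # Expansion depends only on the flag, never on the boundary marker.
    return LinkDecision(url, forced, not disable_expansion)


print(decide("<https://en.wikipedia.org/wiki/Top_Secret!>"))
print(decide("<https://en.wikipedia.org/wiki/Top_Secret!>", disable_expansion=True))
```

In the first call the boundaries are forced and the link may still be expanded; in the second the boundaries are forced but expansion is switched off for that post.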

3 Likes