Wikipedia oneboxing of articles containing unusual characters in the URL

If you link to a Wikipedia article with a named link (link text plus URL), the link works:

Филиппов, Михаил Михайлович (учёный)

But if I paste the address copied from the browser's address bar, the result is garbled:

Филиппов — Википедия,%D0%9C%D0%B8%D1%85%D0%B0%D0%B8%D0%BB%D0%9C%D0%B8%D1%85%D0%B0%D0%B9%D0%BB%D0%BE%D0%B2%D0%B8%D1%87_(%D1%83%D1%87%D1%91%D0%BD%D1%8B%D0%B9)

Original address:

https://ru.wikipedia.org/wiki/%D0%A4%D0%B8%D0%BB%D0%B8%D0%BF%D0%BF%D0%BE%D0%B2,_%D0%9C%D0%B8%D1%85%D0%B0%D0%B8%D0%BB_%D0%9C%D0%B8%D1%85%D0%B0%D0%B9%D0%BB%D0%BE%D0%B2%D0%B8%D1%87_(%D1%83%D1%87%D1%91%D0%BD%D1%8B%D0%B9)

The error may not be common, but over the last two days users have noticed that some Russian Wikipedia articles are processed the same way.

Could it be because there is a comma in the link?

2 Likes

Same issue with ASCII-only titles containing commas:

I - Wikipedia,Robot(film)

https://en.wikipedia.org/wiki/I,_Robot_(film)

3 Likes

The autolinker avoids certain extreme edge cases by design. The last time I mentioned this to @Vitaly, the general recommendation was to wrap the link in <.......> for terrible edge cases like this, which lets you work around the problem. That does not work with onebox, though.

The current workaround is to replace the comma with %2c:

https://ru.wikipedia.org/wiki/%D0%A4%D0%B8%D0%BB%D0%B8%D0%BF%D0%BF%D0%BE%D0%B2%2c_%D0%9C%D0%B8%D1%85%D0%B0%D0%B8%D0%BB_%D0%9C%D0%B8%D1%85%D0%B0%D0%B9%D0%BB%D0%BE%D0%B2%D0%B8%D1%87_(%D1%83%D1%87%D1%91%D0%BD%D1%8B%D0%B9)

2 Likes

It also behaves oddly when the Wikipedia URL ends in an exclamation point:

https://en.wikipedia.org/wiki/Top_Secret!

You have to URL-encode that as well:

https://en.wikipedia.org/wiki/Top_Secret%21
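
For anyone who wants to automate the two workarounds above, here is a minimal sketch in Python. The function name `escape_for_onebox` and the `PROBLEM_CHARS` table are made up for illustration and are not part of Discourse or linkify-it; the table only covers the two characters reported in this topic. It emits uppercase hex (%2C), which is equivalent to the %2c used above.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical table: only the characters reported in this topic.
PROBLEM_CHARS = {",": "%2C", "!": "%21"}


def escape_for_onebox(url: str) -> str:
    """Percent-encode the problem characters in the path of a URL.

    Existing percent-escapes (e.g. %D0%A4) are left untouched, because
    only the literal characters listed in PROBLEM_CHARS are replaced.
    """
    parts = urlsplit(url)
    path = parts.path
    for char, encoded in PROBLEM_CHARS.items():
        path = path.replace(char, encoded)
    # Rebuild the URL with the modified path; other components are unchanged.
    return urlunsplit(parts._replace(path=path))


print(escape_for_onebox("https://en.wikipedia.org/wiki/I,_Robot_(film)"))
print(escape_for_onebox("https://en.wikipedia.org/wiki/Top_Secret!"))
```

Only the path is rewritten, so a comma or exclamation point in a query string or fragment would be left alone; extend the sketch if that matters for your links.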

1 Like

  • Since linkify-it uses heuristics, it can never guarantee 100% confidence (even 99.99% != 100%), so it requires some marker (currently <..>) to force link borders.
  • Your onebox also requires some marker to force it on or off.

So you have two independent processing modes, and you need two independent markers or flags to control them.

Currently you have only a single marker for two modes. That is a logical collision. In my project I solved the problem this way:

  1. Allowed the link converter to be applied to autolinks (the <...> markup).
  2. Added a “disable links expand” checkbox to the editor options (per post).

Not ideal, but OK for me. Maybe you can come up with a better way to add a second marker or flag for your case.
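
To make the "two modes, two flags" idea concrete, here is a rough sketch in Python. It is not the code from my project or from Discourse; `LinkDecision`, `decide`, and `disable_link_expand` are hypothetical names. The point is only that the <...> marker and the expand switch are read independently, so neither has to double as the other.

```python
from dataclasses import dataclass


@dataclass
class LinkDecision:
    url: str
    forced_borders: bool  # the link was wrapped in <...>
    expand: bool          # the link should be expanded into a preview


def decide(line: str, disable_link_expand: bool = False) -> LinkDecision:
    """Classify a line that contains nothing but a link.

    Flag 1 (markup): <...> forces the link borders, bypassing heuristics.
    Flag 2 (per-post option): disable_link_expand turns expansion off.
    """
    text = line.strip()
    forced = len(text) > 2 and text.startswith("<") and text.endswith(">")
    url = text[1:-1] if forced else text
    return LinkDecision(url=url, forced_borders=forced,
                        expand=not disable_link_expand)


print(decide("<https://en.wikipedia.org/wiki/Top_Secret!>"))
print(decide("https://en.wikipedia.org/wiki/Top_Secret!",
             disable_link_expand=True))
```

With this split, a link with forced borders can still be expanded, and a heuristically detected link can still be left as a plain link.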

3 Likes