Some links are misinterpreted

Some links generated by Amazon are not interpreted correctly. Here is an example:
https://www.amazon.fr/Partition-intérieure-jazz-musiques-improvisées/dp/2907891030?_mk_fr_FR=%C3%85M%C3%85%C5%BD%C3%95%C3%91&dchild=1&keywords=partition+int%C3%A9rieure&qid=1625013895&sr=8-1&linkCode=ll1&tag=theoriemusicale-21&linkId=87d44c3dedd4b919e02195911a7a2b0d&language=fr_FR&ref=as_li_ss_tl

Here is a screen capture:

Here is the link:
https://www.amazon.fr/Partition-int%C3%A9rieure-jazz-musiques-improvis%C3%A9es/dp/2907891030?__mk_fr_FR=%C3%85M%C3%85%C5%BD%C3%95%C3%91&dchild=1&keywords=partition+int%C3%A9rieure&qid=1625013895&sr=8-1&linkCode=ll1&tag=theoriemusicale-21&linkId=87d44c3dedd4b919e02195911a7a2b0d&language=fr_FR&ref_=as_li_ss_tl

The problem seems to come from the double underscore.
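To make the difference between the two URLs above easier to spot, here is a minimal Python sketch that compares their query parameter names. The URLs are shortened versions of the ones in this post (the path is trimmed and the `x` value is a placeholder), and `query_keys` is a hypothetical helper name.

```python
from urllib.parse import urlsplit, parse_qsl

def query_keys(url):
    """Return the query parameter names of a URL, in order."""
    return [key for key, _ in parse_qsl(urlsplit(url).query, keep_blank_values=True)]

# Shortened stand-ins for the two links in the post (placeholder values).
rendered = "https://www.amazon.fr/dp/2907891030?_mk_fr_FR=x&dchild=1&ref=as_li_ss_tl"
intended = "https://www.amazon.fr/dp/2907891030?__mk_fr_FR=x&dchild=1&ref_=as_li_ss_tl"

# Pair up parameter names by position and keep only the ones that differ.
changed = [(a, b) for a, b in zip(query_keys(rendered), query_keys(intended)) if a != b]
print(changed)  # [('_mk_fr_FR', '__mk_fr_FR'), ('ref', 'ref_')]
```

One underscore is lost from `__mk_fr_FR` and one from `ref_`, which would be consistent with the underscores being consumed as markup rather than kept as part of the link.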

PS: Great book by the way :slight_smile:

@Vitaly, is this something we should report on the markdown-it repo?

That’s still a known issue: "linkify doesn't accept urls with underscores" (Issue #38 in markdown-it/markdown-it on GitHub). The workaround is to write the link as <link> (angle brackets around the link). Nothing new to report.
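As a rough illustration of the angle-bracket workaround, here is a small Python sketch that wraps a URL before it is pasted into a post. `as_autolink` is a hypothetical helper, not part of markdown-it or Discourse, and the whitespace and bracket check is an assumption based on the usual rule that an autolink cannot contain spaces or angle brackets.

```python
def as_autolink(url):
    """Wrap a URL in angle brackets so it is treated as an explicit autolink."""
    # Assumption: spaces and angle brackets would break the autolink, so reject them.
    if any(ch.isspace() or ch in "<>" for ch in url):
        raise ValueError("URL must not contain whitespace or angle brackets")
    return f"<{url}>"

print(as_autolink("https://example.com/?__mk_fr_FR=x&ref_=as_li_ss_tl"))
```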

Still no deadlines. The good news is that I finally rolled out my new forum software to production: https://rcopen.com/ (the reason so many resources were spent on creating markdown-it). So the chances of fixing this bug have changed from “far infinity” to “some future” :slight_smile:

@sam, help required. I need to know whether real-world links can end with _, ~, - or +.

For example, http://example.com/?sdf,wer- and so on.

That’s important to know before the rewrite of markdown-it's linkifier. Could you grep some huge post databases for such cases and let me know whether anything turns up? Such links were probably posted as <...> as a workaround, so the regexp pattern for the db scan would be something like /<http.+[+~_-]>/g (not tested).
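A scan along those lines could be sketched in Python as follows. This is only an illustration: the pattern is a slightly tighter variant of the untested one above (it excludes whitespace and angle brackets so that a single match cannot span two links), and `count_trailing_punct_links` and the sample posts are made up for the example.

```python
import re
from collections import Counter

# Angle-bracket link whose last character is one of + ~ _ -
# (the "-" is last in the character class, so it is a literal hyphen).
TRAILING_PUNCT_LINK = re.compile(r"<(https?://[^\s<>]*[+~_-])>")

def count_trailing_punct_links(posts):
    """Count angle-bracket links by their final character."""
    counts = Counter()
    for text in posts:
        for match in TRAILING_PUNCT_LINK.finditer(text):
            counts[match.group(1)[-1]] += 1
    return counts

# Hypothetical sample data standing in for a real post database.
sample_posts = [
    "See <http://example.com/?sdf,wer-> for details.",
    "Two links: <https://example.com/a_> and <https://example.com/b>.",
    "Bare link, not counted: http://example.com/c~",
]
print(dict(count_trailing_punct_links(sample_posts)))  # {'-': 1, '_': 1}
```

Only links wrapped in angle brackets are counted here, matching the assumption that affected links were posted with the workaround.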

I don't have access to a volume of markdown text large enough to reach acceptable confidence. Maybe you could help, or know someone who could?

We don’t have access to customer data, but on meta:

- : medium rare (181)
_ : rare (58)
+ : very rare (8)
~ : almost never (2)

There is also this corpus you can query: https://data.stackexchange.com/ which may help you get more data (you can query Stack sites)

Interesting. Could you PM me all those links from the meta db? I need to look at them myself.

Or, if possible, links to the posts where those links were found.

Thank you for the info. I'll take a look.

I believe this is most recently being tracked in:

I’ll close this off in favour of that one. :+1: