Broken links (blank page) with invalid URLs with double #

frabrunelle · November 7, 2016, 8:37pm

Earlier today, I posted a topic and I noticed that there was an issue with some of the links:

The first two links inside that topic were broken:

https://riot.im/app/#/room/#safenetwork:matrix.org
https://riot.im/app/#/room/#safenetwork-dev:matrix.org

If you right click on the link and open it in a new tab, it works fine. But if you do a normal click on the link, it brings you to a blank page with the following URL: https://safenetforum.org/clicks/track?url=https%3A%2F%2Friot.im%2Fapp%2F%23%2Froom%2F%23safenetwork%3Amatrix.org&post_id=105936&topic_id=11711

I found a temporary fix: if I link to https://riot.im/app/#/room/%23safenetwork:matrix.org (%23 instead of the second #), it works fine! I would still like to be able to use the normal link if possible, though. Thank you!

codinghorror · November 7, 2016, 9:56pm

Is there anything we can do here @eviltrout?

eviltrout · November 10, 2016, 4:42pm

Hmmm, the problem here seems to be that Ruby can’t parse those URLs:

URI.parse("https://riot.im/app/#/room/#safenetwork:matrix.org")
URI::InvalidURIError: bad URI(is not URI?): https://riot.im/app/#/room/#safenetwork:matrix.org

It honestly seems a bit odd to me to have two hashes in a URL like that.
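
The error above, and the %23 workaround from the first post, can be reproduced with a short script. This is a minimal sketch using only Ruby's standard `uri` library; the helper name `parseable?` is made up for illustration and is not part of Discourse.

```ruby
require "uri"

# Hypothetical helper: true if Ruby's standard URI parser accepts the string.
def parseable?(url)
  URI.parse(url)
  true
rescue URI::InvalidURIError
  false
end

raw     = "https://riot.im/app/#/room/#safenetwork:matrix.org"
escaped = "https://riot.im/app/#/room/%23safenetwork:matrix.org"

puts "raw parses:     #{parseable?(raw)}"      # second '#' sits inside the fragment
puts "escaped parses: #{parseable?(escaped)}"  # second '#' is percent-encoded as %23

# The escaped form keeps the room alias intact inside the fragment.
puts URI.parse(escaped).fragment
```

The only difference between the two strings is the second `#`, which matches the behaviour reported in the first post.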

Stefan_Fairphone · March 31, 2017, 5:07pm

It honestly seems a bit odd to me that I can’t forward our forum population to our Matrix chatroom. Seriously: If Firefox can open those links, Discourse/Ruby should be able to do it as well.

Falco · March 31, 2017, 5:12pm

Is this a valid URL per the RFC? If not, it’s odd that a new protocol is based on invalid URLs.

Also, it should work if you encode according to the spec: Element

eviltrout · March 31, 2017, 5:12pm

According to this Stack Overflow answer, a fragment is not allowed to contain #, so Ruby’s internal URI processor is correct.

It’s cool that Firefox allows it, but it would be quite a lot of work to audit our codebase and find all the places where we parse URIs and allow for this special case.

sam · March 31, 2017, 5:14pm

That is an invalid URI you have there; Riot should be the one fixing this strange scheme.

Have you raised a bug report in the relevant Riot issue tracker?

EDIT: triple ninjad here…

Issue tracker for riot is https://github.com/vector-im/riot-web

Stefan_Fairphone · March 31, 2017, 5:19pm

It’s not Riot itself, but actually a Matrix convention:

https://matrix.to/#/#wearefairphone:matrix.org

Will try to raise the issue with them.

Edit: You can follow the discussion here:

https://matrix.to/#/!cURbafjkfsMDVwdRDQ:matrix.org/$14909817704630VDcMZ:disroot.org

~~(Copy the link to your browser address bar. )~~ Direct link to a message does work in Discourse, apparently…

ara4n · March 31, 2017, 5:42pm

Hey, project lead for Matrix here. Yup, we’re aware that it’s a bit naughty to put unescaped #'s in URL fragments, but in practice this is the first time in two years we’ve seen anyone actually hitting it as a bug! The workaround is of course to escape it properly as %23.

We’ll have a think about the right way to fix this - perhaps we should switch to using Unicode Character 'VIEWDATA SQUARE' (U+2317) instead O:-)

In the interim it might be kind to consider tweaking your parser to be slightly more forgiving on what it accepts however, in accordance with Postel’s law.
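
One way a more forgiving parser could look, as a rough sketch only: percent-encode any `#` after the first one before handing the string to `URI.parse`. The helper names `escape_extra_hashes` and `lenient_parse` are hypothetical, and this is not how Discourse handles links.

```ruby
require "uri"

# Hypothetical helper: keep the first '#' as the fragment delimiter and
# percent-encode every later '#' as %23.
def escape_extra_hashes(url)
  base, fragment = url.split("#", 2)
  return url if fragment.nil? # no fragment, nothing to fix

  [base, fragment.gsub("#", "%23")].join("#")
end

# Hypothetical helper: try the strict parse first, and only fall back to
# the escaped form when the strict parse is rejected.
def lenient_parse(url)
  URI.parse(url)
rescue URI::InvalidURIError
  URI.parse(escape_extra_hashes(url))
end

uri = lenient_parse("https://matrix.to/#/#wearefairphone:matrix.org")
puts uri.to_s
puts uri.fragment
```

Trying the strict parse first means URLs that are already valid pass through unchanged. A URL that is invalid for some other reason still raises `URI::InvalidURIError` from the fallback call.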

In other news, I can’t wait for someone to implement a Discourse<->Matrix bridge, rather than clunky hyperlinking between the two!

Falco · March 31, 2017, 5:45pm

The problem is that the parser in question is the Ruby Standard Library one.

[quote=“ara4n, post:9, topic:52640”]
In other news, I can’t wait for someone to implement a Discourse<->Matrix bridge, rather than clunky hyperlinking between the two!
[/quote]

This will be way easier when this plugin is live:

ara4n · March 31, 2017, 6:12pm

another workaround might be to use the intended minimal URL form for matrix.to URLs - You're invited to talk on Matrix is meant to work and ‘do the right thing’ (but isn’t implemented yet: github.com/matrix-org/matrix.to/issues/10).

However, it also commits a slight naughtiness by hoping to produce user links that look like https://matrix.to/@matthew:matrix.org. Technically you’re meant to escape @ symbols in URL paths too. Can someone confirm if Ruby’s URL parser chokes on the unescaped @?

cpradio · March 31, 2017, 6:18pm

The link works here without throwing an error, so it seems to support it.
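
For anyone who wants to check this locally, a quick sketch with Ruby's standard `uri` library is below. As far as RFC 3986 goes, `@` and `:` are allowed unescaped in a path segment, so the parser accepting this URL looks consistent with the spec.

```ruby
require "uri"

# The '@' comes after the first '/', so it belongs to the path,
# not to the userinfo part of the authority.
uri = URI.parse("https://matrix.to/@matthew:matrix.org")

puts uri.host
puts uri.path
```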

ara4n · March 31, 2017, 6:25pm

fab. in which case i think we’ll have a viable workaround once someone fixes https://github.com/matrix-org/matrix.to/issues/10 - and meanwhile https://github.com/vector-im/riot-web/issues/3550 has been filed to fix Riot’s native URLs.

codinghorror · March 31, 2017, 9:01pm

Honestly the best thing is for you guys to fix your stuff, which you’re here and doing, so cheers for that.

ara4n · March 31, 2017, 9:04pm

for sure. just seems a bit weird that the Ruby parser is being more unforgiving than any other one we’ve encountered. I’m pretty sure we’re not the only people out there who cheat on escaping some inoffensive URL characters in order to make their URLs look prettier… I wonder whether the story here is actually that the Ruby parser just needs a greedy matcher when looking for #'s or something.

edit: either way, we’re fixing it, as you say
