Broken links (blank page) with invalid URLs with double #

(Francis Brunelle) #1

Earlier today, I posted a topic and I noticed that there was an issue with some of the links:

The first two links inside that topic were broken:

If you right click on the link and open it in a new tab, it works fine. But if you do a normal click on the link, it brings you to a blank page with the following URL:

I found a temporary fix: if I link to Riot (%23 instead of #) it works fine! But I would still like to be able to use the normal link if possible :slight_smile: Thank you!

(Jeff Atwood) #2

Is there anything we can do here @eviltrout?

(Robin Ward) #3

Hmmm, the problem here seems to be that Ruby can’t parse those URLs:

URI::InvalidURIError: bad URI(is not URI?):

It honestly seems a bit odd to me to have two hashes in a URL like that.
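For readers who want to reproduce this, here is a minimal sketch. The URLs are hypothetical stand-ins (the original links are not reproduced here), and `uri_parse_error` is a helper name invented for illustration, not anything from Discourse.

```ruby
require "uri"

# Hypothetical helper: returns the error class name if Ruby's URI.parse
# rejects the URL, or nil if the URL parses cleanly.
def uri_parse_error(url)
  URI.parse(url)
  nil
rescue URI::InvalidURIError => e
  e.class.name
end

# Assumed example URLs: one with a second, unescaped '#' in the fragment,
# and one with that '#' percent-encoded as %23.
double_hash = "https://example.org/app/#/room/#general:example.org"
escaped     = "https://example.org/app/#/room/%23general:example.org"

puts uri_parse_error(double_hash).inspect
puts uri_parse_error(escaped).inspect
```

With the standard-library parser, the first call should report `URI::InvalidURIError` and the second should return `nil`, which matches the behaviour described in this topic.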

(Stefan Brand) #4

It honestly seems a bit odd to me that I can’t forward our forum population to our Matrix chatroom. :wink: Seriously: If Firefox can open those links, Discourse/Ruby should be able to do it as well.

(Rafael dos Santos Silva) #5

Is this a valid URL per the RFC? If not, it’s odd that a new protocol is based on invalid URLs.

Also, it should work if you encode according to the spec: Riot

(Robin Ward) #6

According to this Stack Overflow answer, a fragment is not allowed to contain #, and so Ruby’s internal URI processor is correct.

It’s cool that Firefox allows it, but it would be quite a lot of work to audit our codebase and find all the places where we parse URIs and allow for this special case.

(Sam Saffron) #7

It is an invalid URI you have there; Riot should be the one fixing this strange scheme :slight_smile:

Is there a bug report you raised somewhere in the relevant Riot issue tracker?

EDIT: triple ninja’d here…

Issue tracker for Riot is :arrow_right: GitHub - vector-im/riot-web: A glossy Matrix collaboration client for the web.

(Stefan Brand) #8

It’s not Riot itself, but actually a Matrix convention:

Will try to raise the issue with them.

Edit: You can follow the discussion here:

(Copy the link to your browser address bar. :stuck_out_tongue: ) Direct link to a message does work in Discourse, apparently… :wink:

(Matthew Hodgson) #9

Hey, project lead for Matrix here. Yup, we’re aware that it’s a bit naughty to put unescaped #'s in URL fragments, but in practice this is the first time in two years we’ve seen anyone actually hitting it as a bug! The workaround is of course to escape it properly as %23.
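The %23 workaround can be sketched as a small helper. This is only an illustration under assumptions: `escape_extra_hashes` is a made-up name, the URL is a hypothetical example, and nothing here is part of Matrix, Riot, or Discourse.

```ruby
require "uri"

# Hypothetical helper: keep the first '#' as the fragment delimiter and
# percent-encode every later '#' as %23.
def escape_extra_hashes(url)
  base, delimiter, fragment = url.partition("#")
  return url if delimiter.empty? # no fragment at all, nothing to escape

  base + "#" + fragment.gsub("#", "%23")
end

# Assumed example URL with an unescaped '#' inside the fragment.
fixed = escape_extra_hashes("https://example.org/app/#/room/#general:example.org")
puts fixed                     # https://example.org/app/#/room/%23general:example.org
puts URI.parse(fixed).fragment # /room/%23general:example.org
```

A URL without any `#` is returned unchanged, so the helper should be safe to run over ordinary links as well.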

We’ll have a think about the right way to fix this - perhaps we should switch to using Unicode Character ‘VIEWDATA SQUARE’ (U+2317) instead O:-)

In the interim, however, it might be kind to consider tweaking your parser to be slightly more forgiving in what it accepts, in accordance with Postel’s law.

In other news, I can’t wait for someone to implement a Discourse<->Matrix bridge, rather than clunky hyperlinking between the two!

(Rafael dos Santos Silva) #10

The problem is that the parser in question is the Ruby Standard Library one.

[quote=“ara4n, post:9, topic:52640”]
In other news, I can’t wait for someone to implement a Discourse<->Matrix bridge, rather than clunky hyperlinking between the two!
[/quote]

This will be way easier when this plugin is live:

(Matthew Hodgson) #11

Another workaround might be to use the intended minimal form for URLs: [matrix] is meant to work and ‘do the right thing’ (but isn’t implemented yet: Support links without the # anonymiser · Issue #10 · matrix-org/ · GitHub).

However, it also commits a slight naughtiness by hoping to produce user links that contain an unescaped @ in the path. Technically you’re meant to escape @ symbols in URL paths too. Can someone confirm if Ruby’s URL parser chokes on the unescaped @?

(cpradio) #12

The link works here without throwing an error, so it seems to support it.
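A quick check along these lines, assuming a hypothetical user-style link (the real one is not reproduced here):

```ruby
require "uri"

# Assumed example: a link with an unescaped '@' (and ':') in the path.
uri = URI.parse("https://example.org/@alice:example.org")

puts uri.host # example.org
puts uri.path # /@alice:example.org
```

As far as I can tell, the `pchar` rule in RFC 3986 permits both `@` and `:` inside a path segment, which would explain why the standard-library parser accepts this while rejecting a second `#`.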

(Matthew Hodgson) #13

Fab. In which case I think we’ll have a viable workaround once someone fixes Support links without the # anonymiser · Issue #10 · matrix-org/ · GitHub. Meanwhile, Room URLs are technically invalid · Issue #3550 · vector-im/riot-web · GitHub has been filed to fix Riot’s native URLs.

(Jeff Atwood) #14

Honestly the best thing is for you guys to fix your stuff, which you’re here and doing, so :+1: cheers for that.

(Matthew Hodgson) #15

For sure. It just seems a bit weird that the Ruby parser is more unforgiving than any other one we’ve encountered. I’m pretty sure we’re not the only people out there who cheat on escaping some inoffensive URL characters in order to make their URLs look prettier… I wonder whether the story here is actually that the Ruby parser just needs a greedy matcher when looking for #'s or something.

Edit: either way, we’re fixing it, as you say :slight_smile:
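To make the "greedy matcher" idea concrete, here is a rough sketch of a forgiving splitter. It is not how Ruby's URI library works and not a proposed patch; `LENIENT_SPLIT` and `lenient_split` are names invented for illustration, and the URL is a hypothetical example.

```ruby
# Hypothetical lenient pattern: everything up to the FIRST '#' is the base,
# and everything after it is the fragment, even if it contains more '#'.
LENIENT_SPLIT = /\A([^#]*)(?:#(.*))?\z/m

# Splits a URL string without validating either part.
# The fragment is nil when the URL contains no '#'.
def lenient_split(url)
  match = LENIENT_SPLIT.match(url)
  { base: match[1], fragment: match[2] }
end

parts = lenient_split("https://example.org/app/#/room/#general:example.org")
puts parts[:base]     # https://example.org/app/
puts parts[:fragment] # /room/#general:example.org
```

The trade-off is that such a splitter accepts strings the RFC calls invalid, so every place that parses URIs would need to opt in to it deliberately.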