GitHub "permanent link" bot silently ruined the meaning of my post

About a year ago, I made a post on GitHub’s Discourse instance that included a bunch of URLs of the form “https://github.com/OWNER/REPO/tree/BRANCH/PATH” in order to discuss how GitHub.com processes such URLs. My post promptly received a system-generated edit with the message “Github link was replaced by a permanent link”, which appears to come from the discourse-github plugin. Replacing the branch name with the current commit ID may be a useful feature in the common case of a post citing particular code, but in the special case of discussing GitHub URL processing, the edit destroyed the meaning of my post. I was lucky enough to notice the bot edit right away, and after several rounds of fighting with the bot, I eventually found a workaround: adding a <span> tag to prevent the bot’s pattern from matching, like this:

https://github.com/OWNER/REPO/tree/<span>BRANCH</span>/PATH

But other authors might not notice the bot edit and might be left with a post that would confuse readers.
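
To illustrate why the <span> workaround helps, here is a rough Python sketch. The pattern below is my own guess at what a branch-link matcher could look like, not the plugin's actual pattern (the plugin itself is written in Ruby), and `find_branch_links` is a made-up helper name.

```python
import re

# Hypothetical pattern for branch-based GitHub links.
# The real discourse-github pattern may differ.
BRANCH_LINK = re.compile(
    r"https://github\.com/(?P<owner>[\w.-]+)/(?P<repo>[\w.-]+)"
    r"/(?:tree|blob)/(?P<ref>[\w./-]+)"
)


def find_branch_links(raw_post: str) -> list:
    """Return every substring of the raw post that looks like a branch link."""
    return [m.group(0) for m in BRANCH_LINK.finditer(raw_post)]


plain = "https://github.com/OWNER/REPO/tree/BRANCH/PATH"
escaped = "https://github.com/OWNER/REPO/tree/<span>BRANCH</span>/PATH"

print(find_branch_links(plain))    # the whole URL matches
print(find_branch_links(escaped))  # "<" right after "tree/" stops the match
```

Under this assumed pattern, the `<` that follows `tree/` is not a valid ref character, so the escaped URL is never picked up as a candidate for rewriting.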

What is the best solution to avoid undesired GitHub permanent link edits to a particular post? In general, I feel like it’s wrong for bots to make automatic edits that risk ruining a post. It would be safer to (1) ask the author when a post is saved whether the links should be edited or (2) have the bot add a permanent link without removing the original link. (I seem to recall seeing some bots on other web sites, maybe Reddit, that add information without deleting the existing information.) If the Discourse maintainers consider those options too ugly or too much work to accommodate a rare use case, another option might be to (3) show a notice after the post is saved with a link to information about how the author can avoid the edits if needed, either as (a) a dedicated banner in the UI or (b) just a line of text added by the bot to the end of the post.
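
As a rough illustration of option (2), here is a Python sketch that keeps the original link and appends a permanent one next to it. Everything here is an assumption for illustration: the pattern, the `annotate_links` name, and the `to_permalink` callback, which stands in for whatever lookup resolves a branch to a commit ID.

```python
import re

# Hypothetical pattern for branch-based GitHub links (not the plugin's own).
BRANCH_LINK = re.compile(
    r"https://github\.com/[\w.-]+/[\w.-]+/(?:blob|tree)/[\w./-]+"
)


def annotate_links(raw_post, to_permalink):
    """Append a permanent link after each branch link, keeping the original."""

    def annotate(match):
        original = match.group(0)
        pinned = to_permalink(original)
        if pinned == original:
            # Nothing to pin (for example, the link already uses a commit ID).
            return original
        return f"{original} ([permanent link]({pinned}))"

    return BRANCH_LINK.sub(annotate, raw_post)
```

A real implementation would also need to avoid annotating the same link again on later edits, which this sketch does not attempt.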

I’m not sure what would be the most reasonable design for the author to opt out of the edits. The discourse-github plugin’s site-wide exclusion settings based on the link target don’t seem well-suited for this purpose. Perhaps my current workaround with the <span> tag is adequate. Even if no change is made to Discourse, I hope this post will make the workaround easier to discover for authors who do notice the problem.

Note: I previously raised this issue on GitHub’s forum because I assumed the “permanent link” bot was specific to GitHub’s instance, but a commenter there clued me in that it is a general Discourse feature, so I’m raising the issue here.

Thanks for your attention!

I think this is a good feature, because people often paste links to master and these almost always grow outdated over time. Still, it should be possible to intentionally paste a link to a branch, as there are many valid reasons to do this. Also, the feature generally seems a bit broken: it rewrites things it shouldn’t and doesn’t parse things that it probably should.

Here are some examples that could be used as test cases to fix it:

  1. Plain link made by just pasting a URL. I would expect this to be rewritten, and it is: subdomain-static/forums-enhancements.js at master · ClassicPress/subdomain-static · GitHub
  2. Markdown link of the form [url](url). I would expect this link not to be rewritten, because I have explicitly specified both the text and the URL. Instead, the link text is rewritten, and the link URL is not. This is broken: https://github.com/ClassicPress/subdomain-static/blob/master/forums-enhancements.js
  3. URL enclosed in backquotes. This is not a link and should not be rewritten, but it is: https://github.com/ClassicPress/subdomain-static/blob/master/forums-enhancements.js
  4. URL in a triple-backquoted code block. This is not a link and should not be rewritten, but it is:
    https://github.com/ClassicPress/subdomain-static/blob/master/forums-enhancements.js
    

I think only (1) above should be rewritten. This would make the behavior more predictable, and only rewrite “plain” links. Links where a specific markdown structure has been used (can be thought of as a way to express a specific intention) should be left alone.
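
To make the proposed behavior concrete, here is a minimal Python sketch that rewrites only plain links and leaves Markdown links, inline code, and fenced code blocks alone. It is not the plugin's code (which is Ruby); the patterns, the `rewrite_plain_links` name, and the `to_permalink` callback are all assumptions for illustration.

```python
import re

# Protected constructs come first in the alternation so they win over the
# bare-URL branch. All patterns are simplified guesses.
TOKEN = re.compile(
    r"(?P<fence>```.*?```)"                # fenced code block
    r"|(?P<code>`[^`\n]+`)"                # inline code span
    r"|(?P<mdlink>\[[^\]]*\]\([^)]*\))"    # [text](url)
    r"|(?P<bare>https://github\.com/[\w.-]+/[\w.-]+/(?:blob|tree)/[\w./-]+)",
    re.DOTALL,
)


def rewrite_plain_links(raw_post, to_permalink):
    """Rewrite bare branch links; return everything else unchanged."""

    def replace(match):
        bare = match.group("bare")
        if bare is not None:
            return to_permalink(bare)
        # Code spans, code blocks and Markdown links are left exactly as written.
        return match.group(0)

    return TOKEN.sub(replace, raw_post)
```

With a stand-in `to_permalink` that swaps `master` for a commit ID, case (1) is rewritten while cases (2), (3) and (4) pass through untouched.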

This feature appears to not be enabled on meta.discourse.org?

FWIW, I disagree: I think that in the general case of [text](URL) (call it (2a)), the link URL should be rewritten the same way as in (1). (I agree that the current behavior of rewriting the text and not the URL is completely broken.) I make the decision between writing (1) and (2a) based on whether I think it’s useful or distracting for the URL to be visible to the reader, not based on any intent about whether the link should point to the version of the code as of writing or as of reading. Of course, I’m aware of the permanent-link issue, so I make a permanent link myself whenever I want one. But more generally, if a Discourse administrator decides to enable the permanent link bot, presumably that’s because they think most of their users aren’t aware of the risk that branch-name-based links can rot, and I don’t think the use of the Markdown link syntax is much of a signal that a given user is aware of the problem but wants to opt out of that particular link being rewritten.

But I think we’re both just speculating here. As an advanced user, I don’t care much what the default is as long as I can override it as needed.

Yes, exactly. Currently there is no way to override it. Writing [url](url) (link text and URL are exactly the same) definitely would be a way to signal to the bot that that link shouldn’t be rewritten, because there is no other reason to write it that way.

There is, if you want to give the link your own title rather than having it inferred from the target URL, i.e. [title](url). Giving the link a title wouldn’t indicate any preference for URL rewriting, so I agree with @mattmccutchen that 1 and 2 should behave consistently for URL rewriting.

There could be an argument for the title exactly matching the URL being an indication that it shouldn’t be rewritten, but what if a user wants to provide a title and wants the URL to not be rewritten? There needs to be some other method of specifying that.

Something that comes to mind would be a title suffix similar to embedded image sizing, though I’m not sure how a user would discover that.

An embedded image can be sized like this:
![title|100x200](url)

So the discourse-github plugin could (presumably) be made to look for something like this:
[title|github-no-rewrite](url)
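
A rough Python sketch of how such a marker could be honored. The `|github-no-rewrite` suffix is only the suggestion above, not an existing Discourse or plugin feature, and `rewrite_markdown_links` and `to_permalink` are made-up names; `to_permalink` is assumed to return non-GitHub URLs unchanged.

```python
import re

# Simplified [text](url) pattern; real Markdown parsing is more involved.
MD_LINK = re.compile(r"\[(?P<text>[^\]]*)\]\((?P<url>[^)\s]+)\)")

# Hypothetical opt-out marker, modeled on the image sizing suffix.
OPT_OUT = "|github-no-rewrite"


def rewrite_markdown_links(raw_post, to_permalink):
    """Rewrite link URLs unless the link text ends with the opt-out marker."""

    def replace(match):
        text, url = match.group("text"), match.group("url")
        if text.endswith(OPT_OUT):
            # Leave the link exactly as the author wrote it.
            return match.group(0)
        return f"[{text}]({to_permalink(url)})"

    return MD_LINK.sub(replace, raw_post)
```

The marker stays in the raw post so that later edits keep respecting it; hiding it from readers would have to happen when the post is rendered, which is outside this sketch.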

Ah, it wasn’t clear to me that your (2) was referring only to the special case where the text and URL are the same. My statement was for the general case where the text and URL may not be the same; let’s call that (2a) now.

In case (2), I agree that it’s weird to rewrite the URL and not the text, leaving them inconsistent, but ISTM one could equally well argue that if we want to avoid the inconsistency, the best way to do that is to rewrite both the URL and the text rather than neither. So I don’t find the argument to treat (2) as an opt-out to be compelling. Given that we should have an opt-out that works for (2a), I’d be inclined to just let users use the same opt-out for (2) and not complicate the design. (I think this may have been Simon Manning’s idea as well?)

Not sure I’m following this right (or if it’s possible), but could you use the space escape, as in Inline pdf previews - #45 by Johani? So [ text]( url) would rewrite neither the text nor the URL, and anything else would be auto-changed?

This version should stay as-is and not be rewritten; let me see:

https://github.com/correctcomputation/checkedc-clang/blob/master-post-microsoft/clang/docs/checkedc/Setup-and-Build.md

written as:

<https://github.com/correctcomputation/checkedc-clang/blob/master-post-microsoft/clang/docs/checkedc/Setup-and-Build.md>

Not a valid test because GitHub permalink rewriting is disabled entirely on this Discourse instance. (I wonder what that says about this feature if it is disabled on the “official” instance :upside_down_face:)

If you were to write this example as a test case for replace_github_non_permalinks.rb / replace_github_non_permalinks_spec.rb instead, then I think you would find that link also gets rewritten.