GitHub's "permanent link" bot silently destroyed the meaning of my post

About a year ago, I made a post on GitHub’s Discourse instance that included a bunch of URLs of the form “https://github.com/OWNER/REPO/tree/BRANCH/path” in order to discuss how GitHub.com processes such URLs. My post promptly received a system-generated edit with the message “Github link was replaced by a permanent link”, which appears to be coming from the discourse-github plugin. While replacing the branch name with a permanent link to the current commit ID may be a useful feature in the common case of a post citing particular code, in the special case of discussing GitHub URL processing, the edit destroyed the meaning of my post. I was lucky enough to notice the bot edit right away, and after several rounds of fighting with the bot, I eventually found a workaround of adding a <span> tag to prevent the bot’s pattern from matching, like this:

https://github.com/OWNER/REPO/tree/<span>BRANCH</span>/PATH

But other authors might not notice the bot edit and might be left with a post that would confuse readers.
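For anyone wondering why the <span> trick works, here is a minimal sketch in Python. The pattern below is an assumption made up for illustration; it is not the actual regex used by the discourse-github plugin, and the name `has_rewritable_link` is hypothetical. The idea is only that a matcher expecting a plain branch name right after `/tree/` or `/blob/` stops matching once markup is inserted there.

```python
import re

# Hypothetical pattern for illustration only; NOT the plugin's real regex.
BRANCH_LINK = re.compile(
    r"https://github\.com/(?P<owner>[\w.-]+)/(?P<repo>[\w.-]+)"
    r"/(?P<kind>tree|blob)/(?P<ref>[\w.-]+)/(?P<path>\S+)"
)


def has_rewritable_link(raw: str) -> bool:
    """Return True if the raw post text contains a branch-style GitHub link."""
    return BRANCH_LINK.search(raw) is not None


plain = "https://github.com/OWNER/REPO/tree/BRANCH/PATH"
guarded = "https://github.com/OWNER/REPO/tree/<span>BRANCH</span>/PATH"

print(has_rewritable_link(plain))    # True
print(has_rewritable_link(guarded))  # False: "<" cannot start a ref name
```

A real matcher may differ in many details, so treat this as an explanation of the general mechanism rather than a guarantee that the workaround will keep working.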

What is the best solution to avoid undesired GitHub permanent link edits to a particular post? In general, I feel like it’s wrong for bots to make automatic edits that risk ruining a post. It would be safer to (1) ask the author when a post is saved whether the links should be edited or (2) have the bot add a permanent link without removing the original link. (I seem to recall seeing some bots on other web sites, maybe Reddit, that add information without deleting the existing information.) If the Discourse maintainers consider those options too ugly or too much work to accommodate a rare use case, some other options might be to (3) show a notice after the post is saved with a link to information about how the author can avoid the edits if needed, either as (a) a dedicated banner in the UI or (b) just a line of text added by the bot to the end of the post.

I’m not sure what would be the most reasonable design for the author to opt out of the edits. The discourse-github plugin’s site-wide exclusion settings based on the link target don’t seem well-suited for this purpose. Perhaps my current workaround with the <span> tag is adequate. Even if no change is made to Discourse, I hope this post will make the workaround easier to discover for authors who do notice the problem.

Note: I previously raised this issue on GitHub’s forum because I assumed the “permanent link” bot was specific to GitHub’s instance, but a commenter there clued me in that it is a general Discourse feature, so I’m raising the issue here.

Thanks for your attention!

2 Likes

I think this is a good feature, because people often paste links to master and these almost always grow outdated over time. Still, it should be possible to intentionally paste a link to a branch, as there are many valid reasons to do this. Also, this feature seems generally a bit broken: it rewrites things it shouldn’t and doesn’t parse things that it probably should.

Here are some examples that could be used as test cases to fix it:

  1. Plain link made by just pasting a URL. I would expect this to be rewritten, and it is: subdomain-static/forums-enhancements.js at master · ClassicPress/subdomain-static · GitHub
  2. Markdown link of the form [url](url). I would expect this link not to be rewritten, because I have explicitly specified both the text and the URL. Instead, the link text is rewritten, and the link URL is not. This is broken: https://github.com/ClassicPress/subdomain-static/blob/master/forums-enhancements.js
  3. URL enclosed in backquotes. This is not a link and should not be rewritten, but it is: https://github.com/ClassicPress/subdomain-static/blob/master/forums-enhancements.js
  4. URL in a triple-backquoted code block. This is not a link and should not be rewritten, but it is:
    https://github.com/ClassicPress/subdomain-static/blob/master/forums-enhancements.js
    

I think only (1) above should be rewritten. This would make the behavior more predictable, and only rewrite “plain” links. Links where a specific markdown structure has been used (can be thought of as a way to express a specific intention) should be left alone.
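To make the proposed rule concrete, here is a rough sketch in Python of "only rewrite plain links". Everything in it is an assumption for illustration: the regexes are simplified, the function name `plain_github_links` is made up, and none of it is taken from the plugin's code. It blanks out fenced code blocks, inline code spans and Markdown link constructs first, then looks for GitHub branch links in whatever is left.

```python
import re

TICKS = "`" * 3  # fence marker, built this way to keep the example readable

# Simplified, hypothetical patterns; real Markdown parsing is more involved.
FENCE = re.compile(r"^" + TICKS + r".*?^" + TICKS + r"[ \t]*$", re.M | re.S)
INLINE_CODE = re.compile(r"`[^`\n]+`")
MD_LINK = re.compile(r"\[[^\]\n]*\]\([^)\n]*\)")
GITHUB_URL = re.compile(
    r"https://github\.com/[\w.-]+/[\w.-]+/(?:blob|tree)/[^\s)>\]`]+"
)


def plain_github_links(raw: str) -> list[str]:
    """Return GitHub branch links that appear as plain pasted URLs only."""
    masked = raw
    for pattern in (FENCE, INLINE_CODE, MD_LINK):
        # Blank out protected regions, keeping the text length unchanged.
        masked = pattern.sub(lambda m: " " * len(m.group(0)), masked)
    return [m.group(0) for m in GITHUB_URL.finditer(masked)]


post = "\n".join([
    "See https://github.com/ClassicPress/subdomain-static/blob/master/forums-enhancements.js here.",
    "[https://github.com/o/r/blob/master/a.js](https://github.com/o/r/blob/master/a.js)",
    "`https://github.com/o/r/blob/master/b.js`",
    TICKS,
    "https://github.com/o/r/blob/master/c.js",
    TICKS,
])

print(plain_github_links(post))
```

With the four test cases above as input, only the URL from case (1) is returned, which is the behavior I would expect from the bot.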

1 Like

This feature appears to not be enabled on meta.discourse.org?

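FWIW, I disagree: I think that in (2) (edit: that is, the general case of [text](URL), call it (2a)), the link URL should be rewritten the same way as in (1). (I agree that the current behavior of rewriting the text but not the URL is just plain wrong.) I decide whether to write (1) or (2a) based on whether I think seeing the URL would be useful or distracting to the reader, not on whether the link should point to the version of the code at writing time or at reading time. Of course, I'm aware of the permanent link issue, so when I want a permanent link, I create one myself. But more generally, if the admins of a Discourse instance decide to enable the permanent link bot, presumably it's because they believe most of their users are unaware that links based on a branch name can go stale, and I don't think using Markdown link syntax is much of an indication that a given user is aware of the issue but wants to opt that particular link out of rewriting.

But I think we're all guessing. As a power user, I don't much care what the default is as long as I can override it when needed.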

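Yes, exactly. There is currently no way to override it. Writing [url](url) (with the link text and URL exactly the same) would surely be a way to signal to the bot that the link should not be rewritten, because there is no other reason to write it that way.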

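There is a reason to write a Markdown link if you want to give the link your own title rather than have one inferred from the target URL, i.e. [title](URL). Providing a title for a link doesn't indicate any preference about URL rewriting, so I agree with @mattmccutchen that 1 and 2 should be consistent with respect to URL rewriting.

You could argue that a title exactly matching the URL is an indication that the URL should not be rewritten, but what if the user wants to supply a title and doesn't want the URL rewritten? There needs to be some other way to specify that.

What comes to mind is a title suffix, similar to the one used for resizing embedded images, although I'm not sure how users would discover it.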

Embedded images can be resized like this:
![title|100x200](URL)

So the discourse-github plugin could (hypothetically) look for something like this:
[title|github-no-rewrite](URL)
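As a rough sketch of how such a suffix could be recognized, here is some Python. Note that `github-no-rewrite` is only the hypothetical flag name suggested above, `parse_link` is a made-up helper, and the regex is a simplification; none of this exists in the plugin as far as I know.

```python
import re

NO_REWRITE_FLAG = "github-no-rewrite"  # hypothetical flag, not a real setting

# Matches [title](url) but not image syntax ![title](url). Simplified.
MD_LINK = re.compile(r"(?<!!)\[(?P<title>[^\]\n]*)\]\((?P<url>[^)\s]+)\)")


def parse_link(markdown: str):
    """Return (title, url, should_rewrite) for the first link, or None."""
    m = MD_LINK.search(markdown)
    if m is None:
        return None
    title, sep, flag = m.group("title").rpartition("|")
    if sep and flag.strip() == NO_REWRITE_FLAG:
        # The author opted out: strip the suffix and leave the URL alone.
        return title, m.group("url"), False
    # No recognized suffix: keep the full title and allow rewriting.
    return m.group("title"), m.group("url"), True


print(parse_link("[Docs|github-no-rewrite](https://github.com/o/r/tree/main/docs)"))
print(parse_link("[Docs](https://github.com/o/r/tree/main/docs)"))
```

One design point worth noting: a title that merely contains a `|` with some other text after it is left untouched and treated as an ordinary link, so existing posts would not change meaning.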

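Ah, I hadn't realized that your (2) referred only to the special case where the text and the URL are the same. My statement was about the general case where the text and URL may differ; let's now call that (2a).

In case (2), I agree that rewriting the URL but not the text, leaving them inconsistent, is weird, but ISTM one could equally argue that if we want to avoid the inconsistency, the best way is to rewrite both the URL and the text rather than neither. So I don't find the case for treating (2) as an opt-out convincing. Given that we ought to have an opt-out that works for (2a), I'd lean toward having users use that same opt-out for (2) rather than complicating the design. (I think this may be what Simon Manning had in mind?)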

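Not sure if I'm understanding this correctly (or whether it would be feasible), but could you use a space escape like in Inline PDF Previews - #45 by Johani? That way [ text]( url) would leave both the text and the URL alone, while anything else would be changed automatically?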

This version is left unchanged and should not be rewritten; let me see:

https://github.com/correctcomputation/checkedc-clang/blob/master-post-microsoft/clang/docs/checkedc/Setup-and-Build.md

Written as:

<https://github.com/correctcomputation/checkedc-clang/blob/master-post-microsoft/clang/docs/checkedc/Setup-and-Build.md>

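That's not a valid test, because GitHub permanent link rewriting is completely disabled on this Discourse instance. (I wonder what it says that the feature is disabled on the "official" instance :upside_down_face:)

If you were to write this example up as a test case for replace_github_non_permalinks.rb / replace_github_non_permalinks_spec.rb, I think you'd find that this link gets rewritten too.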

1 Like