Hi! At NLnet Labs, we’ve been setting up Discourse for our products (community.nlnetlabs.nl). A user asked about getting the RSS feed for a particular topic (e.g. https://community.nlnetlabs.nl/c/cascade/10), as their RSS reader couldn’t find it.
I tried using that topic-specific page with my RSS feed reader of choice, and it found two feeds: “NLnet Labs Community - Latest Posts” (/posts.rss) and “NLnet Labs Community - Latest topics” (/latest.rss). I know that /c/cascade/10.rss is a valid RSS feed, but my feeder couldn’t find it automatically. This is a bit frustrating, as we will need to start communicating these URLs ourselves.
I’ve investigated automatic RSS feed discovery for my personal website, so I have some experience with this. I checked the <head> of the web page; I noticed the following links:
<link rel="alternate" type="application/rss+xml" title="Latest posts" href="https://community.nlnetlabs.nl/posts.rss">
<link rel="alternate" type="application/rss+xml" title="Latest topics" href="https://community.nlnetlabs.nl/latest.rss">
<link rel="alternate nofollow" type="application/rss+xml" title="RSS feed of topics in the 'Cascade' category" href="https://community.nlnetlabs.nl/c/cascade/10.rss">
So the <head> does include a third link for the topic-specific RSS feed, but it appears that some RSS feed readers don’t like the rel="nofollow" attribute.
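To illustrate what I suspect is happening, here is a minimal Python sketch of feed autodiscovery over the three links above. I have not looked at how any particular reader implements this, so the "strict" mode (comparing the whole rel value to "alternate") is only a guess at the kind of matching that would produce the behavior I saw; the class and function names are made up for this example. The "lenient" mode treats rel as a space-separated list of tokens, which is how HTML defines the attribute.

```python
from html.parser import HTMLParser

# The three <link> elements copied from the page's <head>.
HEAD = """
<link rel="alternate" type="application/rss+xml" title="Latest posts" href="https://community.nlnetlabs.nl/posts.rss">
<link rel="alternate" type="application/rss+xml" title="Latest topics" href="https://community.nlnetlabs.nl/latest.rss">
<link rel="alternate nofollow" type="application/rss+xml" title="RSS feed of topics in the 'Cascade' category" href="https://community.nlnetlabs.nl/c/cascade/10.rss">
"""

FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Collects feed URLs from <link> tags (hypothetical helper for this post)."""

    def __init__(self, strict):
        super().__init__()
        self.strict = strict
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower()
        if self.strict:
            # Assumed reader behavior: the whole rel value must be "alternate".
            is_alternate = rel.strip() == "alternate"
        else:
            # Token-based matching: "alternate" may appear alongside other keywords.
            is_alternate = "alternate" in rel.split()
        link_type = (attributes.get("type") or "").lower()
        href = attributes.get("href")
        if is_alternate and link_type in FEED_TYPES and href:
            self.feeds.append(href)


def discover(html, strict):
    """Return the feed URLs found in the given HTML."""
    parser = FeedLinkParser(strict)
    parser.feed(html)
    parser.close()
    return parser.feeds


if __name__ == "__main__":
    print("strict: ", discover(HEAD, strict=True))
    print("lenient:", discover(HEAD, strict=False))
```

With strict matching only /posts.rss and /latest.rss are found; with token-based matching the /c/cascade/10.rss feed is found as well. If a reader does something like the strict variant, that would explain why the third link is skipped.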
Of course, I checked MDN (HTML attribute: rel - HTML | MDN); nofollow is documented as:
Indicates that the current document’s original author or publisher does not endorse the referenced document.
But also:
Relevant to <form>, <a>, and <area>, the nofollow keyword tells search engine spiders to ignore the link relationship. The nofollow relationship may indicate the current document’s owner does not endorse the referenced document. It is often included by Search Engine Optimizers pretending their link farms are not spam pages.
I looked through the Discourse source code on GitHub and, with some searching and git blame, found the pull request “FEATURE: add nofollow to RSS alternate link in topics and categories” by rr-it (Pull Request #16013, discourse/discourse). So I guess the second meaning of rel="nofollow" was intended here. Following the background discussion, it seems to be meant to help site crawlers prioritize what they crawl. There was some additional follow-up in “Search engines now blocked from indexing non-canonical pages” (#4 by rrit), but I couldn’t figure out whether rel="nofollow" is still important.
I couldn’t find any discussion on Discourse Meta about this issue, even though the PR was merged back in 2022. Clearly, there’s a misunderstanding in the conventions around <link>s for RSS feeds, between some RSS feed readers and Discourse. So I ask:
- Does rel="nofollow" still serve its original intention for improving site crawler prioritization, or has it been superseded by other techniques?
- Does this behavior (i.e. ignoring rel="nofollow" links) in RSS feed reader autodiscovery appear to be common? Can others replicate it? I’m not aware of an authoritative standard on RSS feed auto-discovery.
- Is there willingness to support this use case, for RSS feed readers to automatically discover the right posts? The existence of those topic-specific <link>s, even if they’re not getting used by my feeder, makes me think so; perhaps the loss of functionality was simply overlooked when rel="nofollow" was added.
To the Discourse devs: thanks for building this!