Continuing the discussion from SEO for Thin Content or Modify Meta Tags:
I’m struggling with the same problem here.
I’m using WP-Discourse and it is great! But for every new blog post, it creates a topic with the exact same title in my community. Having two URLs with the same title is not a good thing, since they steal relevance from each other in search results.
On top of that, the comments from the topic are also printed below the blog post, which generates duplicate content (the same content across multiple URLs).
Both are huge SEO problems that could lead to the domain being penalised.
How to fix this?
The solution would be a simple checkbox in the category configuration box:
[ ] Hide Topics from this category in search results
When the checkbox is marked, a noindex tag would be inserted in the header of all the pages related to it: the category itself, topics, pagination, etc.
<meta name="robots" content="noindex, follow">
This way, everything is still there for the users, but ignored by search engines.
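To make the idea concrete, here is a minimal sketch of the per-category logic. It is not Discourse code: the `Category` class, the `hide_from_search` field (standing in for the proposed checkbox) and the `robots_meta_tag` function are all hypothetical names. It emits `follow`, the standard robots directive, since `dofollow` is not a recognised value.

```python
from dataclasses import dataclass


@dataclass
class Category:
    # Hypothetical model: `hide_from_search` stands for the proposed checkbox.
    slug: str
    hide_from_search: bool = False


def robots_meta_tag(category: Category) -> str:
    """Return the robots meta tag for a page in `category`, or "" if none is needed."""
    if category.hide_from_search:
        # Keep the page out of the index, but let crawlers follow its links.
        return '<meta name="robots" content="noindex, follow">'
    return ""


blog_comments = Category(slug="blog-comments", hide_from_search=True)
general = Category(slug="general")

print(robots_meta_tag(blog_comments))
print(repr(robots_meta_tag(general)))
```

The same check would have to run for every page that belongs to the category (the category page, its topics and the paginated views), and the tag would have to be part of the HTML sent by the server.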
Things that don’t fix the problem
Let me go a few steps ahead and address some common responses. I saw a few topics about this issue, and they all had suggestions that don’t actually fix the problem.
Robots.txt
The most common suggestion is to add a “Disallow: /c/category/id” line to robots.txt. But this would only remove the category itself from the search results and not the topics, which are the main problem here.
The URL structure of all topics is the same, whatever category they belong to, so we can’t block them by simply adding a “Disallow” line to robots.txt.
Ex:
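This can be checked with Python’s standard `urllib.robotparser`. The sketch below is only an illustration: the domain, the category path and the topic URL are made-up examples, assuming topic URLs of the form `/t/slug/id` with no category in the path.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks a single category page.
ROBOTS_TXT = """\
User-agent: *
Disallow: /c/blog-comments/5
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def can_crawl(url: str) -> bool:
    """True if a generic crawler ("*") may fetch `url` under ROBOTS_TXT."""
    return parser.can_fetch("*", url)


# Hypothetical URLs on a made-up forum domain.
category_url = "https://forum.example.com/c/blog-comments/5"
topic_url = "https://forum.example.com/t/my-blog-post-title/123"

print(can_crawl(category_url))  # False: the category page is disallowed
print(can_crawl(topic_url))     # True: the topic page is still crawlable
```

Because the rule is a plain path prefix and the topic path never contains the category, no single “Disallow” line can single out the topics of one category.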
Unlist topics
An unlisted topic is still visible to search engines. It is hidden from the community listings, but anyone with the direct link can still access it. And we need to send users to the topics, so we add a link to each one in its blog post. That means the search engines will also find all the unlisted topics.
Note that nofollowing this link won’t make Googlebot ignore it: Official Google Webmaster Central Blog: Evolving “nofollow” – new ways to identify the nature of links
At the same time, unlisting the topic leads to a reduction in user engagement, because the users won’t be able to jump from one topic to another inside the community.
So this idea doesn’t solve anything. It leads to a reduction in engagement, while not hiding the topics from search engines at all.
Require login to see the topics in that category
When a new user clicks on the comment button, he/she will see a “This page doesn’t exist” message instead of the topic. The user thinks something is broken and leaves the site. So no comments and no new user registrations. Very bad for engagement and usability.
In conclusion, it would be very useful to have this option added to Discourse, or to have someone develop a simple plugin.
The tag needs to be added by the core, on the server side, because Googlebot may ignore a tag that is only injected with JavaScript.
The SEO guys would much appreciate it!