A drop in the ranking/reputation of the original WP articles since the forum copies were added. I was not looking to start a debate or complain, just asking how to achieve this. I’m sure there are many other reasons others may want this level of control over what’s indexed.
There’s nothing you can do in the WP Discourse plugin to add the posts it creates in Discourse to a robots.txt file. This is actually a pure Discourse question, namely “Can I automatically noindex embedded topics?” (or something along those lines). A topic embedded from WordPress is functionally the same as any other embedded topic. That is the avenue of investigation you want to pursue, for example the origin of the embed set canonical url site setting and related discussions.
I don’t think (but happy to be corrected) that what you want to do is a current Discourse feature. Discourse currently adds an X-Robots-Tag: noindex header to GET requests for hidden topics. You could do the same for embedded topics via a plugin.
Am I heading in the wrong direction by blocking indexing of a forum thread that duplicates an article I’d prefer Google search users to find via the WP blog? I’m OK with that. For me, the main benefit of WP Discourse has been allowing discussion of blog posts without having to use solutions like Disqus or the very limited default WP comments. I don’t need any SEO benefit from the forums, except for unique threads that are not connected to existing content.
To clarify, if I make the category that stores the WP-discourse connected post hidden (is hidden different to private?) then it will hide the post from the forums/public/crawlers but the inserted comments at the end of each Wordpress blog post with comments will still be visible?
Sorry about the noob questions, I’m not experienced with Discourse and want to make sure I’m not misinterpreting your response.
…depends on your definition of duplicate. Canonical is in place, but since both the blog post and the forum thread contain exactly the same pasted text (a duplicate), I would personally prefer to block those threads altogether. That’s just my preference. Maybe in the future the reasoning behind this topic will make more sense. But for now, I am honestly not trying to provoke a debate or anything like that. I think that blocking is a more absolute solution for me.
It’s like going to your mechanic and asking him to “change your oil twice”. I understand the initial “why” by @angus - but in the end, it’s just about whether it can be done somehow, or not possible.
Edit: Now thinking about it, I could then just add the blog post forum category to robots.txt, correct? Or will that be overwritten? (I will search the forums for how Discourse’s robots.txt works and whether it can be edited.)
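One thing worth checking before relying on a category rule: a Disallow line matches on URL path prefix, so it only covers URLs that actually start with the category path. Here is a minimal sketch using Python’s standard urllib.robotparser to test which URLs a rule would block. The host forum.example.com, the blog-posts category slug and the IDs are made up for illustration, and the sketch assumes topic URLs live under /t/ rather than under the category path. It says nothing about whether Discourse would overwrite the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block only the category that holds the embedded blog topics.
ROBOTS_TXT = """\
User-agent: *
Disallow: /c/blog-posts
"""


def is_allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may crawl `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Assumed URL shapes (hypothetical host, slugs and IDs).
category_url = "https://forum.example.com/c/blog-posts/12"
topic_url = "https://forum.example.com/t/my-blog-post/345"

category_blocked = not is_allowed(ROBOTS_TXT, category_url)
topic_blocked = not is_allowed(ROBOTS_TXT, topic_url)

print("category listing blocked:", category_blocked)
print("individual topic blocked:", topic_blocked)
```

If the assumption about URL shapes holds for your forum, a category rule on its own would cover the category listing but not the individual topic pages, so it is worth testing against a real topic URL first.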
Keep in mind both what I said up top, and what it says next to that setting. This will mean that topics published from WordPress to Discourse do not appear on the topic lists of your forum. Comments will work in the normal fashion. If you have the sync comment data webhook enabled, the topic will no longer be hidden after the first comment. That feature wasn’t exactly designed for this purpose. See further
If you want to just add an X-Robots-Tag: noindex header to an embedded topic (without bothering about this hidden business), you’ll need to either request that as a new feature of Discourse itself or add it via a plugin.
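Whichever route you take, you can verify the result from outside by looking at the response headers of a topic URL. Below is a rough sketch of such a check in Python (standard library only). The function names and the sample headers are hypothetical, it is not part of Discourse or WP Discourse, and it ignores the user-agent-prefixed form of the header (for example "googlebot: noindex").

```python
from typing import Mapping
from urllib.request import urlopen


def has_noindex(headers: Mapping[str, str]) -> bool:
    """Return True if an X-Robots-Tag header lists the noindex directive."""
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() != "x-robots-tag":
            continue
        # Directives are comma-separated, e.g. "noindex, nofollow".
        directives = [part.strip().lower() for part in value.split(",")]
        if "noindex" in directives:
            return True
    return False


def fetch_headers(url: str) -> dict:
    """Issue a plain GET and return the response headers (not called below)."""
    with urlopen(url) as response:
        return dict(response.headers.items())


# Sample headers standing in for real responses (made up for illustration).
hidden_topic = {"Content-Type": "text/html", "X-Robots-Tag": "noindex"}
normal_topic = {"Content-Type": "text/html"}

hidden_result = has_noindex(hidden_topic)
normal_result = has_noindex(normal_topic)
print(hidden_result, normal_result)
```

To test a live forum you would pass the output of fetch_headers for a topic URL to has_noindex. Keep in mind that a crawler has to be able to fetch the page to see this header at all.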
@haydenjames The one final thing I’d note is that there seems to have been an issue with the canonical url of embedded topics recently. Something to keep in mind if you just noticed this issue recently.
Even though the rel=canonical tag can help you avoid a duplicate content penalty when you republish posts, you can still get penalized if you misuse the tag. I’ll find a solution. Will bump this thread at a later date.