Add an option to set canonical_url to embed_url

From time to time we get requests to set the canonical URL of embedded topics to the URL of the blog post. I’ve created a pull request that does exactly that. It unconditionally uses the URL of the original blog post (embed_url) as canonical for the topic.

There have been several earlier discussions about this, such as “Google indexed link not pointing to the correct post” and “Duplicate Content”.

After reading those posts I’m not so sure about my solution anymore, so I’d like to get some feedback from you.

  • Should this be configurable? Is there a good reason for keeping the current behavior of always using the topic’s URL as canonical?

  • Should the blog post’s URL only be used as canonical for the first N pages presented to the search bot? After all, only a certain number of posts are embedded in the blog post. (N should probably be 1.)

I’d appreciate your feedback on this. I’m sure there are lots of different use cases out there and I’d like to make an informed decision before I change anything that could affect search engines.

7 Likes

My thought is that if you are copying and re-posting content from a blog post for any reason, conversation or not, the blog post is the original content and should be pointed to as canonical.

Yes. And by default don’t enable the blog post’s URL as canonical. Let the customer set the switch. Otherwise this is going to change a lot of Search referral traffic all of a sudden.

IMHO, only the top post, the one that links to the blog post, should get the blog post’s URL as canonical. The responses and follow-ups should not.

1 Like

With the WordPress plugin, sites can choose between publishing an excerpt, or publishing the full post to Discourse. Sites that are only publishing an excerpt might not want the canonical URL set to the blog post.

IMO it should be a per host setting here, default off:

3 Likes

That’s not possible. Discourse presents topics as paginated content to crawlers. That’s why I suggested changing only the canonical of the first page.
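To make the first-page idea concrete, here is a minimal sketch in Python. It is not Discourse’s code: the function name, the `use_embed_url` flag, and the `?page=N` URL shape for later crawler pages are all assumptions made for illustration.

```python
from typing import Optional


def canonical_for_page(topic_url: str, embed_url: Optional[str], page: int,
                       use_embed_url: bool = False) -> str:
    """Pick the canonical URL for one crawler page of a topic.

    Hypothetical helper: names and URL shapes are illustrative assumptions.
    """
    # Only the first page contains the post that mirrors the blog article,
    # so only that page may point at the blog post.
    if use_embed_url and embed_url and page == 1:
        return embed_url
    # Later pages hold replies only, so they keep a topic URL as canonical.
    if page > 1:
        return f"{topic_url}?page={page}"
    return topic_url
```

With the flag off, or for a topic without an `embed_url`, every page keeps the topic URL, which matches the current behavior.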

Yeah, I’m going to make this a per host setting.
@simon Will this work for the WordPress plugin as well?

Yes, that should work. When a post is published from WordPress it creates a TopicEmbed on Discourse, with the embed_url set to the post’s permalink.

We just have to be careful here… this is a very sharp instrument. If, for example, WordPress is in “Top N” mode, where it shows only the best content, we can end up setting a canonical to a page that does not have all the overlapping content. This is a terrible signal to search engines and can be penalised heavily.

In fact, the whole “collapsing” of the OP may make this a bad idea. The OP really should be a complete duplicate of the canonical page, so we may need a different technique there that collapses on the client side.

I would not rush anything here.

3 Likes

Howdy folks :wave:

I originally wanted to weigh in here and join in the calls for this feature, but after diving in a bit deeper I wanted to share what I learned about how this works (in case anyone missed it like I did initially!)

We’ve just embedded Discourse as the comment system for our blog and I had a little mini freak out when I clicked the “Show full post…” button and saw the whole blog copied without the correct canonical URL :flushed:

After taking a few deep breaths I went into my “debug mode” and checked the raw HTML response to see how much of the post is actually there. As it turns out, only the initial paragraph is included in the HTML, and therefore this is all Google will see. Phew!
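For anyone who wants to repeat that check, here is a rough sketch that pulls the canonical URL out of a raw HTML response using only Python’s standard library. The class and function names are made up for this example, and the sample markup in the usage note is illustrative, not copied from a real Discourse page.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Ignore everything except <link> tags, and stop after the first hit.
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens, in any letter case.
        rel = (attr.get("rel") or "").lower().split()
        if "canonical" in rel:
            self.canonical = attr.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical
```

Feeding it the HTML you get without running JavaScript (for example from `curl` or `urllib.request.urlopen`) shows what a crawler would see: `find_canonical('<head><link rel="canonical" href="https://forum.example.com/t/hello/12"></head>')` returns the topic URL.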

Having a second look at it, it makes perfect sense in the way the UX is laid out. I’m assuming the reason it’s hidden behind a button is because you want people to be able to read the full post and not affect SEO :+1:

I guess initially I was surprised that “Show full post…” wasn’t just a link to the original blog :thinking: but I guess it’s an OK way to do it :joy:

9 Likes

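This feature is now available through the embed set canonical url site setting. The setting is disabled by default. When enabled, it sets the canonical URL of embedded topics to the URL of the embedded content.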
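As a rough illustration of the behavior described above (a sketch in Python, not the actual Discourse implementation), the choice could look like this. The function name and the `embed_set_canonical_url` parameter are hypothetical stand-ins for the site setting.

```python
from html import escape
from typing import Optional


def canonical_link_tag(topic_url: str, embed_url: Optional[str],
                       embed_set_canonical_url: bool = False) -> str:
    """Build the canonical <link> tag for a topic page (hypothetical helper)."""
    # Setting disabled (the default) or topic not embedded: use the topic URL.
    url = embed_url if (embed_set_canonical_url and embed_url) else topic_url
    # Escape the URL so it is safe inside an HTML attribute.
    return f'<link rel="canonical" href="{escape(url, quote=True)}" />'
```

Topics created directly on Discourse have no `embed_url`, so they keep their own URL as canonical whatever the setting says.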

The feature has been around for a while now. I’d love to hear from sites that have enabled it about how it has affected their SEO rankings.

6 Likes

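@simon, I’ve been looking for a way to set the canonical URL for some of the topics in my community and stumbled upon this post.

It looks like this setting might be the solution, but I don’t understand what an “embedded topic” is. I tried searching this community for it but couldn’t find an explanation. Maybe it’s a very basic concept. Could you tell me what an embedded topic is, or how to embed a topic in a Discourse community?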

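An embedded topic is a topic whose embed_url property is set to the URL of an external site. As far as I know, this only happens when topics are published to Discourse through the API. For example, the Discourse WordPress plugin and the Discourse JavaScript embed code both create embedded topics.

If you are publishing topics to Discourse from an external site, this approach makes sense. For topics created directly on Discourse, though, you won’t be able to use it.

3 Likes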

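So this could come in handy if we change the titles of some blog posts (for example, adding an updated date for SEO purposes) and/or want to avoid duplicate content?

We really need this feature, because we’re using embedded content from Drupal and this is the first time I’ve run into this problem :neutral_face: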