Add option to set canonical_url to embed_url

From time to time we get requests to set the canonical URL of embedded topics to the URL of the blog post. I’ve created a pull request that does exactly that. It unconditionally uses the URL of the original blog post (embed_url) as canonical for the topic.

There have been various previous discussions, like “Google indexed link not pointing to the correct post” and “Duplicate Content”.

After reading those posts I’m not so sure about my solution anymore, so I’d like to get some feedback from you.

  • Should this be configurable? Is there a good reason for keeping the current behavior of always using the topic’s URL as canonical?

  • Should the blog post’s URL only be used as canonical for the first N pages presented to the search bot? After all, only a certain number of posts are embedded in the blog post. (N should probably be 1.)

I’d appreciate your feedback on this. I’m sure there are lots of different use cases out there and I’d like to make an informed decision before I change anything that could affect search engines.


My thought is that if you are copying and re-posting content from a blog post for any reason, conversation or not, the blog post is the original content and should be pointed to as the canonical source.

Yes. And by default don’t enable the blog post’s URL as canonical. Let the customer set the switch. Otherwise this is going to change a lot of Search referral traffic all of a sudden.

IMHO, only the top post, the one linking to the blog post, should get the blog post’s URL as canonical. The responses and follow-ups should not.


With the WordPress plugin, sites can choose between publishing an excerpt, or publishing the full post to Discourse. Sites that are only publishing an excerpt might not want the canonical URL set to the blog post.

IMO this should be a per-host setting, default off.


That’s not possible. Discourse presents topics as paginated content to crawlers. That’s why I suggested changing only the canonical of the first page.
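To make the “first page only” idea concrete, here is a minimal sketch in Python. It is not Discourse’s actual code: `canonical_for_page`, its parameters, the example URLs and the `?page=N` query format are all assumptions made for illustration.

```python
from typing import Optional


def canonical_for_page(topic_url: str, embed_url: Optional[str], page: int,
                       max_embed_pages: int = 1) -> str:
    """Pick the canonical URL for one crawler-facing page of a topic.

    Hypothetical helper: pages 1..max_embed_pages point at the blog post,
    later pages keep pointing at the topic itself.
    """
    if embed_url and 1 <= page <= max_embed_pages:
        return embed_url
    # Later pages hold replies that the blog post does not contain,
    # so they stay canonical to the forum (assumed ?page=N URL format).
    if page > 1:
        return f"{topic_url}?page={page}"
    return topic_url


topic = "https://forum.example.com/t/hello/12"
blog = "https://blog.example.com/hello"
print(canonical_for_page(topic, blog, page=1))
print(canonical_for_page(topic, blog, page=2))
```

With `max_embed_pages=1`, only the page that contains the first post is attributed to the blog; every later page of replies remains canonical to the forum.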

Yeah, I’m going to make this a per-host setting.
@simon Will this work for the WordPress plugin as well?

Yes, that should work. When a post is published from WordPress it creates a TopicEmbed on Discourse, with the embed_url set to the post’s permalink.

We’ve just got to be careful here… this is a very sharp instrument. If, for example, WordPress is in “Top N” mode, where it shows only the best content, we can end up setting a canonical to a page that does not have all the overlapping content. This is a terrible signal to search engines and can be penalised heavily.

In fact, the whole “collapsing” of the OP may make this a bad idea. The OP really should be a complete duplicate of the canonical page, so we may need a different technique there that collapses on the client side.

I would not rush anything here.


Howdy folks :wave:

I originally wanted to weigh in here and join in the calls for this feature, but after diving in a bit deeper I wanted to share what I learned about how this works (in case anyone missed it like I did initially!)

We’ve just embedded Discourse as the comment system for our blog and I had a little mini freak out when I clicked the “Show full post…” button and saw the whole blog copied without the correct canonical URL :flushed:

After taking a few deep breaths I went into my “debug mode” and started checking the straight HTML response and checked how much of the post is actually there. As it turns out only the initial paragraph is included in the HTML and therefore this is all Google will see. Phew!
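For anyone who wants to repeat that check, here is a rough Python sketch that pulls the canonical link out of a raw HTML response using only the standard library. The `CanonicalFinder` class, the `find_canonical` helper and the sample HTML are made up for illustration; in practice you would feed it the HTML your own topic page returns.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag seen."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated tokens.
        rel = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel:
            self.canonical = attr_map.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in the HTML, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Made-up sample standing in for the raw response of a topic page.
SAMPLE_HTML = """<html><head>
<link rel="stylesheet" href="/style.css">
<link rel="canonical" href="https://forum.example.com/t/hello/12">
</head><body><p>First paragraph only.</p></body></html>"""

print(find_canonical(SAMPLE_HTML))
```

Checking the raw response rather than the rendered page matters here, because anything loaded after a click on “Show full post…” is not part of that initial HTML.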

Having a second look at it, it makes perfect sense in the way the UX is laid out. I’m assuming the reason it’s hidden behind a button is because you want people to be able to read the full post and not affect SEO :+1:

I guess initially I was surprised that “Show full post…” wasn’t just a link to the original blog :thinking: but I guess it’s an OK way to do it :joy:


This feature has now been implemented with the embed set canonical url site setting. The setting is disabled by default. When enabled, it sets the canonical URL for embedded topics to the URL of the embedded content.

The feature has been around for a while now. I’d be curious to hear from any sites that have enabled it how it has affected their SEO ranking.


Hi @simon, I was looking for a way to set canonical URLs on some topics in my community when I found this post.

It looks like this setting could be the solution, but I don’t understand what “embedded topics” are. I searched the community but found no explanation. Maybe it’s something very basic. Could you tell me what embedded topics are, or how to embed them in a Discourse community?

An embedded topic is a topic whose embed_url property is set to the URL of an external site. I’m only aware of this happening when topics are published to Discourse via the API. For example, the WordPress plugin for Discourse and Discourse’s JavaScript embed code both create embedded topics.

If you are publishing your topics to Discourse from an external site, this approach makes sense. However, you won’t be able to use it for topics created directly on Discourse.
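As a rough mental model of that distinction (not the real Discourse implementation), the sketch below treats a topic as embedded only when it carries an embed_url, and applies the setting only to those topics. The `Topic` class, the `canonical_url` function and the example URLs are hypothetical; only the embed_url property and the setting name come from this thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Topic:
    url: str                         # the topic's own URL on the forum
    embed_url: Optional[str] = None  # set only for topics published from an external site

    @property
    def is_embedded(self) -> bool:
        return self.embed_url is not None


def canonical_url(topic: Topic, embed_set_canonical_url: bool = False) -> str:
    """Choose the canonical URL; the setting defaults to off, as in the thread."""
    if embed_set_canonical_url and topic.is_embedded:
        return topic.embed_url
    # Topics created directly on the forum have no embed_url to point at.
    return topic.url


embedded = Topic("https://forum.example.com/t/hello/12",
                 embed_url="https://blog.example.com/hello")
native = Topic("https://forum.example.com/t/question/13")

print(canonical_url(embedded, embed_set_canonical_url=True))
print(canonical_url(native, embed_set_canonical_url=True))
```

In this model, turning the setting on changes nothing for a topic created on the forum, which is why it cannot be used to assign arbitrary canonical URLs.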


So can this be used if we change the titles of some blog articles (with the update date for SEO purposes, for example) and/or to avoid duplicate content?

We really need this because we’re using content embedded from Drupal, and this is the first time I’ve come across this discussion :neutral_face: