I have a few duplicate pages on my domain, and I need to point the canonical tag of each duplicate page to the original page using JavaScript. (Deleting the duplicate pages is not an option, since they still get considerable traffic.)
Could someone suggest how to update the href attribute of the canonical tag using JavaScript in Discourse?
Here ya go @KranthiKiranGude, this is how you could change the href
attribute in JavaScript. First you select the DOM element, then you change the attribute.
<script>
var uC = document.querySelector("link[rel='canonical']");
var newURL = "https://my.coolforum.com/newlink";
uC.setAttribute("href", newURL);
</script>
Of course, you will need some logic based on the page you want to operate on.
Generic example logic:
<script>
if ("the_actual_page_url_or_id" === "my_interesting_page_url_or_id") {
  var uC = document.querySelector("link[rel='canonical']");
  var newURL = "https://my.coolforum.com/newlink";
  uC.setAttribute("href", newURL);
}
</script>
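If it helps, here is a slightly fuller sketch of that per-page logic, using `window.location.pathname` to pick the canonical target. All paths and URLs below are hypothetical placeholders; substitute your real duplicate and original pages.

```javascript
// Hypothetical mapping from duplicate page paths to their canonical originals.
// Every path and URL here is a placeholder for illustration only.
var CANONICAL_MAP = {
  "/t/duplicate-topic/123": "https://my.coolforum.com/t/original-topic/42",
  "/t/another-duplicate/456": "https://my.coolforum.com/t/original-topic/42"
};

// Pure helper: returns the canonical URL for a path, or null if unmapped.
function canonicalFor(path) {
  return CANONICAL_MAP[path] || null;
}

// In the browser, apply the mapping once the DOM is ready (guarded so the
// helper above also works outside a browser environment).
if (typeof document !== "undefined") {
  document.addEventListener("DOMContentLoaded", function () {
    var target = canonicalFor(window.location.pathname);
    var link = document.querySelector("link[rel='canonical']");
    if (target && link) {
      link.setAttribute("href", target);
    }
  });
}
```

The lookup table keeps the page-matching logic in one place, so adding another duplicate page is a one-line change.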
Hope this helps.
Hi @neounix,
I have tried your code, but instead of updating the href, a new script tag got generated.
I added this script in the “/head” section.
Please post the exact code you used and where exactly you added it, including a screenshot of the entry in the </head> section you mentioned.
Thanks!
It seems normal that new JavaScript will be generated when you add more JavaScript.
You will need to check the DOM in the web dev console (the elements), not in the page source code, BTW.
I understand.
You are missing an opening quote in your script's conditional statement, BTW …
Hi @neounix,
It worked in the Dev Console. But, in Page Source it still references to the actual URL.
If I am not wrong, search engines will pick it up from the page source, not the DOM elements. Please correct me if I am wrong.
I'm not sure about that, to be honest. I previously thought that modern search engines (Googlebot) read the DOM, but now that I think about it, it makes sense that search engines might only read the source and not the DOM.
But ā¦ when I Google to check this, it says:
SEO signals in the DOM (page titles, meta descriptions, canonical tags, meta robots tags, etc.) are respected. Content dynamically inserted in the DOM is also crawlable and indexable. Furthermore, in certain cases, the DOM signals may even take precedence over contradictory statements in HTML source code. This will need more work, but was the case for several of our tests.
Reference:
Hi @neounix,
Thanks a lot for your help. Let me also research on this part. But, really thankful to you.
Welcome!
Please post back and let us know the results of your research.
Another method, which I have been working on in my spare time lately, is to modify this Discourse Ruby lib file directly:
You might consider something along that line if you have no joy with the DOM manipulation JS technique, @KranthiKiranGude
Hi @neounix,
I tested the page using URL Inspection tool, Google is recognizing the updated URL.
Perfect… glad to hear it worked.
Thanks for testing and posting back.
PS: That JS DOM method is a lot easier than manipulating canonical_url.rb
I'm not sure if overriding canonical using JavaScript will work, since this is something that is more on the spider level (i.e. the part that retrieves and collects data) than on the indexer level (the part of a bot that interprets data and stores it in the search index).
Unsolicited advice: you might want to read this topic so you can put those overrides in a plugin:
Yeah, me too. The jury is still out on that one.
But Google searches on this topic yield a lot of fruit: many people do this, and many say Google respects the DOM changes (and some say it does not, so there does not seem to be a strong, overwhelming consensus on the topic). See for example:
I think if I were going to do it, I would (1) delete the original canonical tag from the page source and then (2) insert a new canonical tag into the DOM with JS.
Then, over time we can simply look at the Google search console and see what Google selected as canonical.
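A minimal sketch of that remove-then-insert idea, assuming the canonical `<link>` is emitted server-side and swapped client-side. The URL is a placeholder, and the function takes the document as a parameter purely so the logic can be exercised outside a browser:

```javascript
// Hypothetical sketch: drop any canonical <link> present in the served HTML,
// then insert a fresh one, so the only canonical signal lives in the DOM.
// `doc` is passed in rather than using the global `document` so the logic
// can also run against a stand-in object outside a browser.
function replaceCanonical(doc, newURL) {
  var existing = doc.querySelector("link[rel='canonical']");
  if (existing) {
    existing.parentNode.removeChild(existing);
  }
  var link = doc.createElement("link");
  link.setAttribute("rel", "canonical");
  link.setAttribute("href", newURL);
  doc.head.appendChild(link);
  return link;
}

// In a browser you would call, for example:
// replaceCanonical(document, "https://my.coolforum.com/newlink");
```

Whether Google then treats the DOM-inserted tag as authoritative is exactly the open question being discussed here; the Search Console canonical report would be the place to verify it over time.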
See also:
Because many people consider this important for SEO, I checked on this again, in light of this confirmation by @KranthiKiranGude
According to developers.google.com, “Understand the JavaScript SEO basics”:
Googlebot supports web components. When Googlebot renders a page, it flattens the shadow DOM and light DOM content. This means Googlebot can only see content that's visible in the rendered HTML. To make sure that Googlebot can still see your content after it's rendered, use the Mobile-Friendly Test or the URL Inspection Tool and look at the rendered HTML.
Because (1) @KranthiKiranGude used his URL Inspection Tool
in his testing and (2) he confirmed the canonical was changed as expected in this way, then it follows that, per Google, Googlebot does indeed “see” and register this DOM content change after the page is rendered.
Reference:
Yeah, I totally support the idea of Google flattening the DOM contents like that while indexing.
But some/most meta tags have their semantics at the HTTP protocol level rather than at the HTML level, despite being present in the HTML. I emphasized “while indexing” because I am not sure it flattens the DOM like that and takes the updated canonical URL into account while crawling.
(To put it differently, I'm not sure if DOM contents also means “metadata embedded in the content”. Yes, it sees it that way, but I'm not sure if it will use it that way.)
Maybe this article explains it better: How Google Crawls Your Website and Indexes Your Content
When Google needs to crawl JavaScript sites, an additional stage is required that traditional HTML content doesn't need. It is known as the rendering stage, which sometimes takes additional time. The indexing stage and rendering stage are separate phases, which lets Google index the non-JavaScript content first.
Not really, sorry. That article by www.hillwebcreations.com does not even mention the DOM, how to inspect the DOM, etc., and at least to me it reads as “dated and opinionated” (not really current, nor factual).
Personally, I prefer these two well-written references, both with more authority, more factual and better referenced, in my mind:
and the first one where they actually tested (and that was long before Googlebot switched to a Chromium core, which could read the DOM and JavaScript even better):
We Tested How Googlebot Crawls Javascript And Here's What We Learned
After my research, I tend to agree with Google's developers that they will index (and get their SEO signals from) what is found using the URL Inspection Tool, and it is from this that we can “judge” SEO signals and content. The discussion by Google is clear, factual, authoritative, and non-speculative.
And @KranthiKiranGude has confirmed his canonical link was updated using the URL Inspection Tool, which Google, as the authority, says is “all you need” to see how Google views a page from an SEO perspective.
Technical Summary
Because Google uses SEO signals from what can be seen in their URL Inspection Tool; and the fact that Google developers have clearly stated that their SEO signals can be directly analyzed with the URL Inspection Tool; and the fact that the JS changes @KranthiKiranGude made to the DOM are visible in the URL Inspection Tool, that's “more than good enough”, in my view.
HTH
Yes, that article indeed clearly states that they have seen canonical tags that were dynamically inserted behave exactly the same as if they were in source code. You are right (and I should have read this more thoroughly the first time you posted it).
Although three of the four pages you referred to in this topic, including the one that gave us the answer, are even older than that article I posted.
OBTW @RGJ, sorry for the confusion about “not current”…
When I use the term “dated” or “not current” I am talking about concepts and ideas, not the physical date of any article.
Some people write articles dated “today” whose concepts are “dated” (and wrong), and some people wrote articles 10 years ago which are still highly relevant today.
That is what I mean by “dated” or “not current”: it is based on “concepts”, not the physical dates written on paper or in a web article. Sorry for any confusion in my reply from using the terms in this manner.
What is important, at least in my mind, is that we provided a solution for @KranthiKiranGude and he confirmed it works and based on your skeptical post, we both did some additional research for this issue.
We verified (1) that this method, changing the canonical link using JavaScript, is valid; (2) that Google developers have confirmed it; and (3) that we have a way to confirm it as users, using the URL Inspection Tool (as @KranthiKiranGude used and shared with us).
All the best, and thanks so much for the “back-and-forth” on this interesting topic and for helping make the solution even more valid and stronger.
I'm off to other tasks (still struggling to learn Ruby on Rails after over a decade of PHP coding), as this topic is fully “mission accomplished”.
Until next timeā¦ all the best!