What is the Discourse approach to SEO?

John_Mckay · October 23, 2015, 7:29pm

Hi guys

I really love Discourse. I am working on an SPA too and have been following your development closely. When you make an architectural decision, we tend to roll with that as well.

You use RoR and EmberJS, we use ASP.Net MVC and AngularJS. Different technologies, but similar conceptually on the client and server.

We have noticed that you don’t seem to serve web crawlers (Googlebot, Bingbot, etc.) your JavaScript. You never followed Google’s hash fragment recommendations, which is good news because that was recently deprecated.

So we have been doing the same: we opted not to follow the hash fragment recommendations either, and when a web crawler arrives at our website, we give it a server-generated version of that page. If a regular human arrives, they get the full SPA experience.

One of the things I have noticed, in particular, is that when I check Google cache from before we stopped giving Google our JS, it’s a complete mess. The Google cache page tries to hit up our api (CORS says No!) and download views (again, CORS).

Since we stopped giving them the JS, and instead opted for the server-side rendered version, things are much nicer when viewing the Google cache version of the page.

However, Google seem pretty consistently adamant that they can handle JavaScript. I have been talking to people on other forums who suggest that we should just let Google rip through our SPA, and that the cache view isn’t important.

Any feedback would be really welcome. Are you guys planning at some point to stop serving server-side rendered versions of pages to web crawlers, or are you going to continue to do so?

Do you honestly believe that Google can handle JS as well as they say they can?

Apologies if this is off topic, but hearing your opinions/views on this would be really great.

Thanks!

codinghorror · October 23, 2015, 7:47pm

If you visit with your user agent set to Googlebot you will see what we produce. We have not had any problems with that approach so far. (And we do support the hash thing because of other search engines like Yandex, which are far less sophisticated than Google.)
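
For readers unfamiliar with "the hash thing": under the (now deprecated) AJAX crawling scheme, a crawler rewrites a `#!` URL into one carrying an `_escaped_fragment_` query parameter, and the server answers that request with a static HTML snapshot. The following is a minimal sketch of that mapping, not Discourse's actual implementation. The function names `crawler_url` and `wants_snapshot` and the `example.com` URLs are made up for illustration, and the escaping is a simplification that may percent-encode a few more characters than the scheme strictly requires.

```python
from urllib.parse import parse_qs, quote, urlsplit, urlunsplit

def crawler_url(pretty_url):
    """Return the URL a crawler following the scheme would request.

    'http://example.com/#!/topic/1' becomes
    'http://example.com/?_escaped_fragment_=/topic/1'.
    URLs without a '#!' fragment are returned unchanged.
    """
    parts = urlsplit(pretty_url)
    if not parts.fragment.startswith("!"):
        return pretty_url
    # Simplified escaping (an assumption): keep '/' and '=' readable,
    # percent-encode everything else that quote() considers unsafe.
    state = quote(parts.fragment[1:], safe="/=")
    param = "_escaped_fragment_=" + state
    query = parts.query + "&" + param if parts.query else param
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))

def wants_snapshot(request_url):
    """True if the request carries the _escaped_fragment_ parameter,
    i.e. the server should answer with static HTML instead of the JS app."""
    query = urlsplit(request_url).query
    # keep_blank_values so that a bare '?_escaped_fragment_=' still counts.
    return "_escaped_fragment_" in parse_qs(query, keep_blank_values=True)
```

Pages whose URLs contain no `#!` can opt in to the same scheme with `<meta name="fragment" content="!">`, in which case the crawler requests the page with an empty `_escaped_fragment_=` parameter; that is why the check above keeps blank values.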

John_Mckay · October 23, 2015, 7:52pm

Thanks for taking the time to answer, and that’s good to know.

I was under the impression that having a # in the URL was the indication to any crawler that you follow that particular protocol, that was how you let a crawler know.

Is that not the case? Also, do you think it is worth putting in the work to implement that functionality now that Google have deprecated it?

codinghorror · October 23, 2015, 7:54pm

Not unless every search engine on the planet has deprecated it too. We do try to detect crawlers and send them the 1996 html 1.0 version of the site.

John_Mckay · October 23, 2015, 7:54pm

Ok, I thought it was mainly a Google thing.

Thanks very much

John_Mckay · October 23, 2015, 8:01pm

Sorry, just to clarify: when Googlebot visits Discourse, it gets the 1996 HTML 1.0 version of that page’s content?

I know that is what I get when setting my user agent to googlebot, but maybe you use other metrics to detect it and deliver them something different?

riking · October 24, 2015, 9:11pm

Nope, just dumb user agent detection. If Google goes and does something like “let’s try loading the page with a clean Chrome browser to see if they’re pulling anything tricky” they’ll see the same content, just after the JS all loads.
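
As a rough illustration of what "dumb user agent detection" can look like, here is a minimal sketch. It is not Discourse's actual code: the token list and the names `CRAWLER_TOKENS`, `is_crawler`, and `choose_view` are assumptions made up for this example.

```python
# Hypothetical list of substrings that identify common crawlers.
CRAWLER_TOKENS = ("googlebot", "bingbot", "yandex", "baiduspider", "duckduckbot")

def is_crawler(user_agent):
    """Case-insensitive substring match against known crawler tokens."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in CRAWLER_TOKENS)

def choose_view(user_agent):
    """Pick the plain server-rendered page for crawlers, the JS app otherwise."""
    return "static_html" if is_crawler(user_agent) else "spa"
```

Because both views are built from the same underlying content, a crawler that shows up with an ordinary browser user agent simply gets the JS app and, once the scripts run, sees the same text.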

Idan · June 21, 2016, 1:30am

For some reason the text “Not unless every search engine on the planet has deprecated it too” wasn’t indexed by Google, which worries me a bit.

codinghorror · June 21, 2016, 2:03am

I think you are doing something wrong.
