Why are there so many Disallow rules in robots.txt?

Just reviving this.

  1. You can now edit the robots.txt file to taste if you want to

  2. We always send an `X-Robots-Tag: noindex` header on pages that should not be indexed

  3. It turns out that some crawlers "go to town" on sites if we do not give strict guidance in robots.txt; not everyone is Google. We have an incredibly vanilla robots.txt file these days, and it comes at a cost. (We expect everyone to be as well behaved as Google, but it takes a massive effort to become Google.)
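For point 2, the noindex signal is typically delivered as a response header rather than in robots.txt. A minimal sketch, assuming an nginx-style server and an illustrative `/internal/` path (not an actual path from this site):

```nginx
# Hypothetical nginx snippet: pages under /internal/ are served
# normally but carry a noindex hint so compliant crawlers drop
# them from their index.
location /internal/ {
    add_header X-Robots-Tag "noindex, nofollow" always;
}
```

Note that a crawler must be able to fetch the page to see this header, so pages relying on `X-Robots-Tag: noindex` should not also be blocked in robots.txt.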

I think we should probably bring back the "very limiting" robots.txt by default, at least for all non-Googlebot crawlers.
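A "very limiting" default along those lines could be sketched as follows; the exact user agents and paths are illustrative, not what the site actually ships:

```
# Sketch: well-behaved, known crawlers get normal access,
# everything else is disallowed by default.
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
```

Per the robots.txt convention, an empty `Disallow:` value permits everything for that group, while `Disallow: /` blocks the whole site for any crawler that does not match a more specific group.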