Why are semrushbot and ahrefsbot blocked by default?

I was checking the Google Search Console coverage report and found that lots of our forum pages are blocked by robots.txt. So I checked the robots.txt file and found that semrushbot and ahrefsbot are blocked by default.

I know these are two widely used SEO tools, so why block their bots?

1 Like

Because those bots are “resource sucking bot hogs” which provide very little value to sites compared to the amount of resources these bots consume.

Of course, you can customize the Discourse robots.txt file and permit them if you wish, but we were blocking these bots on our sites long before Discourse was released, and we keep them blocked.

:slight_smile:
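If you do customize robots.txt, it can help to check what a given set of rules actually permits before deploying it. Here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `/u/` user-page path, and the `forum.example.com` URLs are assumptions made up for illustration, not a copy of what Discourse ships.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: semrushbot
Disallow: /

User-agent: ahrefsbot
Disallow: /

User-agent: *
Disallow: /u/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    topic = "https://forum.example.com/t/some-topic/1"
    profile = "https://forum.example.com/u/alice"
    for agent in ("semrushbot", "ahrefsbot", "Googlebot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, topic), is_allowed(ROBOTS_TXT, agent, profile))
```

To permit one of the bots in this sketch, you would remove its `User-agent` group from the rules and run the check again. Keep in mind this only tells you what a well-behaved crawler would do with the file.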


Note (Edited):

I forgot to mention that many of these “resource sucking bot hogs” do not respect robots.txt and must be blocked at the HTTP User-Agent level. Generally speaking, we block these “disrespectful resource sucking bot hogs” with mod_rewrite at the reverse proxy level (one of the many good reasons to run behind a reverse proxy, BTW).
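To illustrate the idea of blocking at the User-Agent level, here is a rough sketch in Python rather than mod_rewrite. It is a stand-in for the concept, not our actual proxy configuration. The blocklist pattern and the names `is_blocked` and `block_bots` are hypothetical.

```python
import re

# Assumed blocklist for illustration; extend the pattern as needed.
BLOCKED_AGENTS = re.compile(r"semrushbot|ahrefsbot", re.IGNORECASE)


def is_blocked(user_agent: str) -> bool:
    """Return True if the User-Agent string matches the blocklist."""
    return bool(BLOCKED_AGENTS.search(user_agent or ""))


def block_bots(app):
    """WSGI middleware: answer 403 to blocked User-Agents, else pass through."""
    def middleware(environ, start_response):
        if is_blocked(environ.get("HTTP_USER_AGENT", "")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return app(environ, start_response)
    return middleware
```

Unlike robots.txt, this kind of check does not rely on the bot's cooperation: the request is rejected before it reaches the application. Doing it at the reverse proxy has the same effect while keeping the load off the app servers entirely.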

5 Likes

Thanks so much for the information!

I found another issue and maybe you can share your insight on it as well. :slight_smile:

I know Discourse blocks user pages by default, but my Google Search Console coverage report shows that some user pages are still indexed. Google flags this as an issue, because none of these pages should be indexed.

Thanks!

1 Like

This was fixed recently with

Can you update your Discourse and reverify?

1 Like

@osioke Thanks for your reply! I believe our installed version already has this fix, because I noticed that it was committed in January.

Could you please verify if I need to upgrade to the latest version to have this feature?

1 Like

It doesn’t hurt to update IMO, but yes, that fix should be in your installed version. I would try updating and reverifying unless you don’t want to update for some other reason.

3 Likes

Because they suck? They add a lot of server load for no discernable benefit, and our customers do have pageview limits on their plans.

5 Likes

Sounds good. We are updating now. Hope things will work out after the update. I’ll get back and keep you informed. :slight_smile: Thanks!