Seeing anonymous user and crawler traffic, even though site is private

I help run a private Discourse instance, and couldn’t help but notice some anonymous user and web crawler traffic recorded in my dashboard. Now that I look closely, I see that it was happening before too, just in smaller amounts.

I have the “login required” option enabled, and we have our SSO set up to only allow logins for users who meet certain criteria. Is there another setting that I should be enabling? Thanks! : )


There shouldn’t be anything additional you need to do. That crawler traffic is likely from crawlers hitting community.example.com/login. If you check community.example.com/admin/reports/web_crawlers, you can see how often specific crawlers are hitting your site.

There are a couple of things you can do to reduce the crawler traffic:

  • Try disallowing /login for crawlers in robots.txt (community.example.com/admin/customize/robots). You’d probably see crawler traffic drop, though not completely, because some crawlers don’t obey robots.txt.

  • Take a look at the worst offenders from /admin/reports/web_crawlers and add their user-agents to the blocked crawler user agents site setting
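
For the robots.txt suggestion, here is a rough sketch of how you could sanity-check a rule before saving it. It uses Python's standard urllib.robotparser to show how a well-behaved crawler would interpret a Disallow line for /login. The robots.txt body, the "ExampleBot" user agent, and the is_allowed helper are all made up for illustration; your real file will contain whatever Discourse already generates plus your change.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body (an assumption for illustration only).
# The Disallow line is the single rule being demonstrated.
ROBOTS_TXT = """\
User-agent: *
Disallow: /login
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler that obeys robots.txt may fetch the URL."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    base = "https://community.example.com"
    # "ExampleBot" is a placeholder user agent, not a real crawler.
    for path in ("/login", "/tos"):
        print(path, is_allowed(ROBOTS_TXT, "ExampleBot", base + path))
```

This only models crawlers that honor robots.txt. Anything that ignores the file will still show up in the web crawlers report, which is where blocking by user agent comes in.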


In addition to what Kris wrote, there will also be an anonymous request for your site’s login page or home page at the beginning of each SSO login request.

Your site’s TOS and Privacy pages are probably also accessible to anonymous users.
