Unidentified Crawler with High Amounts of Pageviews

Hi all!

When we look at our crawler page view count, there is an unidentified entry racking up over 500k page views in a month.

Is there any easy way to find out what this could be? It seems to average some 10,000-15,000+ hits a day.

2 Likes

If you are on our hosting, email support and we can handle it for you.

1 Like

We’re a self hosted FLOSS project, so I suppose that’s out of the question :slight_smile:

I know I could put some more filtering on there and get our infra guy to look at more logs - I guess I was just wondering if anyone else had seen this before.

3 Likes

Check the nginx logs (access.log) for requests with that user agent and note which IP they come from.
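
For anyone who wants to script that check, here is a rough Python sketch that tallies requests per (IP, user agent) pair. It assumes the stock nginx "combined" log format; if your install uses a custom `log_format`, the regex will need adjusting. The sample lines and the `count_clients` helper are made up for illustration.

```python
import re
from collections import Counter

# Assumes nginx's default "combined" format:
# ip - user [time] "request" status bytes "referer" "user agent"
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

def count_clients(lines):
    """Count requests per (ip, user_agent); unparseable lines are skipped."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            # nginx logs a missing user agent as "-"; treat empty the same way
            counts[(m.group("ip"), m.group("ua") or "-")] += 1
    return counts

# Illustrative sample lines, not real log data
sample = [
    '10.0.0.5 - - [01/Jan/2020:00:00:00 +0000] "GET / HTTP/1.0" 200 512 "-" "-"',
    '10.0.0.5 - - [01/Jan/2020:00:00:05 +0000] "GET / HTTP/1.0" 200 512 "-" "-"',
    '203.0.113.9 - - [01/Jan/2020:00:00:07 +0000] "GET /latest HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
]

for (ip, ua), n in count_clients(sample).most_common(10):
    print(f"{n:8d}  {ip:15s}  {ua}")
```

To run it against a real log, pass an open file instead of `sample`, e.g. `count_clients(open(path, errors="replace"))`. A single internal IP with an empty user agent at the top of the list is a strong hint that the traffic comes from your own infrastructure.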

4 Likes

I just met with our lead SysAdmin in person and figured it out. It’s HAProxy doing a health check every 5 seconds :joy:
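
The numbers line up, too. A quick back-of-the-envelope check in Python, assuming a single check every 5 seconds around the clock and a 30-day month:

```python
SECONDS_PER_DAY = 24 * 60 * 60   # 86400
CHECK_INTERVAL_S = 5             # one health check every 5 seconds

checks_per_day = SECONDS_PER_DAY // CHECK_INTERVAL_S
checks_per_30_days = checks_per_day * 30  # assumes a 30-day month

print(checks_per_day)       # 17280
print(checks_per_30_days)   # 518400
```

That is roughly 17k hits a day and a bit over 500k a month, which is in the same range as what the report showed.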

5 Likes

Health checks should set a proper User Agent, so this doesn’t happen again. Also, you can use the /srv/status route for health checking.
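
To show the shape of such a request, here is a minimal Python sketch of a health check that identifies itself and hits `/srv/status`. The user agent string, the `forum.example.org` host, and both helper functions are placeholders I made up; HAProxy has its own configuration for this, so treat the code as an illustration of the request rather than a replacement for the proxy config.

```python
from urllib.request import Request, urlopen

# Hypothetical user agent; pick whatever makes the traffic easy to spot
HEALTHCHECK_UA = "haproxy-healthcheck"

def build_health_request(base_url):
    """Build a GET request for the /srv/status route with an explicit User-Agent."""
    url = base_url.rstrip("/") + "/srv/status"
    return Request(url, headers={"User-Agent": HEALTHCHECK_UA}, method="GET")

def is_healthy(base_url, timeout=2.0):
    """Return True if the status route answers with HTTP 200 within the timeout."""
    try:
        with urlopen(build_health_request(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError and timeouts are all OSError subclasses
        return False

# Placeholder host; nothing is sent until is_healthy() is called
req = build_health_request("https://forum.example.org/")
print(req.full_url, req.get_header("User-agent"))
```

With a named user agent, the checks show up as their own identifiable entry instead of an anonymous one, so they are easy to recognise or filter.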

3 Likes

“Yeah, it should - but I’m lazy.” - SysAdmin

I’ll see what I can do. Thanks!

2 Likes
