Unidentified Crawler with High Amounts of Pageviews

Hi all!

When we look at our crawler page view count, there seems to be an unidentified entry that is racking up over 500k page views in a month.

Is there any easy way to find out what this could be? It seems to average some 10,000-15,000+ hits a day.

If you are on our hosting, email support and we can handle it for you.

We’re a self-hosted FLOSS project, so I suppose that’s out of the question :slight_smile:

I know I could put some more filtering on there and get our infra guy to look at more logs - I guess I was just wondering if anyone else had seen this before.


Check the nginx logs (access.log) for requests coming with that user agent and the respective IP.
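If it helps, here is a rough sketch of that log check in Python. It assumes the log uses nginx's standard "combined" format, which may not match your `log_format`, so the regex may need adjusting. The script name and log path in the usage comment are placeholders, not something from your setup.

```python
import re
import sys
from collections import Counter

# Assumption: nginx "combined" log format. Adjust if your log_format differs.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def tally(lines):
    """Count requests per (client IP, user agent) pair."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            # An empty user agent is reported as "-" so it is visible in the output.
            counts[(match["ip"], match["ua"] or "-")] += 1
    return counts


if __name__ == "__main__":
    # Usage (placeholder path): python tally_agents.py /var/log/nginx/access.log
    with open(sys.argv[1], errors="replace") as log:
        for (ip, ua), hits in tally(log).most_common(10):
            print(f"{hits:>8}  {ip:<15}  {ua}")
```

A single IP with a blank or generic user agent sitting at the top of that list is usually the one to chase.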


I just met with our lead SysAdmin in person and figured it out. It’s HAProxy doing a health check every 5 seconds :joy:
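For anyone curious, the numbers line up. A quick back-of-the-envelope check, assuming a single HAProxy instance probing one backend and a 30-day month (both are my assumptions):

```python
SECONDS_PER_DAY = 24 * 60 * 60


def checks_per_day(interval_seconds):
    """Number of health checks sent per day at a fixed interval."""
    return SECONDS_PER_DAY // interval_seconds


per_day = checks_per_day(5)   # one check every 5 seconds
per_month = per_day * 30      # assuming a 30-day month
print(per_day, per_month)     # prints "17280 518400"
```

That is roughly the 10,000-15,000+ hits a day and the 500k+ a month we were seeing.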


Health checks should set a proper User-Agent header so this doesn’t happen again. Also, you can use the /srv/status route for health checking.
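To illustrate the idea only (this is not HAProxy configuration), here is a small Python stand-in for a probe that hits /srv/status and identifies itself. The user agent string and the example hostname are made up; the equivalent settings would go in your load balancer's health check config.

```python
import urllib.request

# Hypothetical name, chosen so the probe is easy to spot and filter in reports.
PROBE_USER_AGENT = "haproxy-healthcheck/1.0"


def build_probe(base_url):
    """Build a GET request for /srv/status that identifies itself."""
    url = base_url.rstrip("/") + "/srv/status"
    return urllib.request.Request(url, headers={"User-Agent": PROBE_USER_AGENT})


def is_healthy(base_url, timeout=5.0):
    """Return True if the status route answers with HTTP 200."""
    try:
        with urllib.request.urlopen(build_probe(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection errors, timeouts, and HTTP error responses.
        return False
```

Calling `is_healthy("https://forum.example.com")` would then send one identifiable request instead of an anonymous hit on the front page.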


“Yeah, it should - but I’m lazy.” - SysAdmin

I’ll see what I can do. Thanks!


