Unidentified Crawler with High Amounts of Pageviews

Hi all!

When we look at our crawler page view count, there seems to be an unidentified entry that is racking up over 500k page views in a month.

Is there any easy way to find out what this could be? It seems to average some 10,000-15,000+ hits a day.

2 Likes

If you are on our hosting, email support and we can handle it for you.

1 Like

We’re a self hosted FLOSS project, so I suppose that’s out of the question :slight_smile:

I know I could put some more filtering on there and get our infra guy to look at more logs - I guess I was just wondering if anyone else had seen this before.

3 Likes

Check the nginx logs (access.log) for requests coming with that user agent and the respective IP.
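
For anyone landing here later, a rough sketch of that log check in Python. It assumes the default nginx "combined" log format; if your site uses a custom `log_format`, the regex will need adjusting. The function names are made up for illustration.

```python
import re
from collections import Counter

# Matches the default nginx "combined" log format (an assumption; adjust the
# pattern if the site defines a custom log_format).
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def count_clients(lines):
    """Count requests per (ip, user agent) pair, skipping unparseable lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[(match.group("ip"), match.group("agent"))] += 1
    return counts


def top_clients(path, limit=10):
    """Return the busiest (ip, user agent) pairs found in a log file."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_clients(handle).most_common(limit)
```

Calling `top_clients` with the path to your access.log should put the noisiest client at the top. A request sent without a User-Agent header shows up as `-` in the combined format, which is one way a client can end up looking unidentified.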

4 Likes

I just met with our lead SysAdmin in person and figured it out. It’s HAProxy doing a health check every 5 seconds :joy:
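
A quick sanity check that the numbers line up, assuming a single check every 5 seconds around the clock and a 30-day month (both simplifications):

```python
SECONDS_PER_DAY = 24 * 60 * 60
CHECK_INTERVAL_SECONDS = 5  # health check interval mentioned above

# One request per interval, all day long.
checks_per_day = SECONDS_PER_DAY // CHECK_INTERVAL_SECONDS
# Assumes a 30-day month.
checks_per_month = checks_per_day * 30

print(checks_per_day)    # 17280
print(checks_per_month)  # 518400
```

That is in the same range as the 10,000-15,000+ daily hits and the 500k+ monthly total from the first post.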

5 Likes

Health checks should set a proper User Agent, so this doesn’t happen again. Also, you can use the /srv/status route for health checking.
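
To illustrate what such a check looks like on the wire, here is a minimal Python sketch of a request to `/srv/status` that identifies itself. The real fix belongs in the HAProxy configuration; this only shows the idea. The base URL, the user agent string, and the function names are placeholders, and treating HTTP 200 as "healthy" is an assumption.

```python
import urllib.request


def build_health_check(base_url, user_agent="example-health-check/1.0"):
    """Build a GET request for /srv/status with an explicit User-Agent."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/srv/status",
        headers={"User-Agent": user_agent},  # placeholder identifier
        method="GET",
    )


def is_healthy(base_url, timeout=5.0):
    """Return True if the status route answers with HTTP 200 (assumed healthy)."""
    request = build_health_check(base_url)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # covers URLError, HTTPError, and timeouts
        return False
```

With a named User-Agent, the requests are at least easy to spot and filter in the logs.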

3 Likes

“Yeah, it should - but I’m lazy.” - SysAdmin

I’ll see what I can do. Thanks!

2 Likes
