Grok Pattern for Nginx logs


(Jack Zampolin) #1

It looks like Discourse writes its Nginx logs in a custom format. Is there a grok pattern for those logs anywhere? I looked and was unable to find one.


(Matt Palmer) #2

I’m not aware of one. We haven’t written one ourselves because we collect all access log data through HAProxy.


(Sam Saffron) #3

We have this 🙂

It is in Ruby and is probably easy to port to an Elastic grok rule.


(Jack Zampolin) #4

Thanks for the pointer! I’ll drop this here for anyone looking:

I’m parsing the Discourse logs with the Telegraf logparser plugin. The following config parses them:

[[inputs.logparser]]
  files = [
    "/var/discourse/shared/standalone/log/var-log/nginx/access.log",
    "/var/discourse/shared/standalone/log/var-log/nginx/access.log.1"
  ]
  from_beginning = true
  [inputs.logparser.grok]
    measurement="nginx_access"
    patterns=[
      '''\[%{CUSTOM_TIMESTAMP:timestamp:ts-httpd}\] %{IP:client_ip} "%{NOTSPACE:method:tag} %{NOTSPACE:request_path} %{NOTSPACE:http_version}" "%{DATA:user_agent}" "%{DATA:discourse_route:tag}" %{NUMBER:response_code:tag} %{NUMBER:resp_bytes:int} "%{DATA:x_referrer}" %{NUMBER} %{NUMBER:resp_time:float} "%{DATA:username:tag}"'''
    ]
    custom_patterns = '''
      CUSTOM_TIMESTAMP %{MONTHDAY}/%{MONTH}/%{YEAR}:%{HOUR}:%{MINUTE}:%{SECOND} %{CUSTOM_TZ}
      CUSTOM_TZ [+-][0-9]{4}
    '''
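
For anyone who wants to check the pattern outside Telegraf, here is a minimal sketch of roughly the same match as a plain Python regex. It is an approximation, not what Telegraf runs: the character classes are looser than grok's `IP` and `NUMBER`, and the sample line is made up for illustration rather than copied from a real log.

```python
import re
from datetime import datetime

# Rough regex translation of the grok pattern above. This is an
# approximation: grok's IP and NUMBER patterns are stricter than
# the \S+ and [\d.]+ classes used here.
LINE_RE = re.compile(
    r'\[(?P<timestamp>[^\]]+)\] '
    r'(?P<client_ip>\S+) '
    r'"(?P<method>\S+) (?P<request_path>\S+) (?P<http_version>\S+)" '
    r'"(?P<user_agent>.*?)" '
    r'"(?P<discourse_route>.*?)" '
    r'(?P<response_code>\d+) '
    r'(?P<resp_bytes>\d+) '
    r'"(?P<x_referrer>.*?)" '
    r'[\d.]+ '                      # unnamed %{NUMBER}, matched but discarded
    r'(?P<resp_time>[\d.]+) '
    r'"(?P<username>.*?)"'
)

# Hypothetical sample line, invented for this example.
SAMPLE = (
    '[10/Oct/2016:13:55:36 +0000] 203.0.113.7 "GET /latest.json HTTP/1.1" '
    '"Mozilla/5.0" "list/latest" 200 5120 "-" 0.052 0.052 "alice"'
)


def parse_line(line):
    """Return a dict of captured values, or None if the line does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Same layout as CUSTOM_TIMESTAMP: day/month/year:hour:minute:second zone
    record["timestamp"] = datetime.strptime(
        record["timestamp"], "%d/%b/%Y:%H:%M:%S %z"
    )
    # Mirror the :int and :float modifiers from the grok pattern
    record["resp_bytes"] = int(record["resp_bytes"])
    record["resp_time"] = float(record["resp_time"])
    return record


record = parse_line(SAMPLE)
```

Lines that do not fit the format return `None`, which makes it easy to count how many lines of a real log the pattern would skip.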

The resulting data has the following schema in InfluxDB:

{
  measurement: nginx_access,
  tags: [
    method,
    discourse_route,
    response_code,
    username
  ],
  fields: {
    client_ip: string,
    request_path: string,
    http_version: string,
    user_agent: string,
    resp_bytes: int,
    x_referrer: string,
    resp_time: float
  }
}
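
To make the tag/field split concrete, here is a small sketch that renders one record as an InfluxDB line protocol point. The record values and the timestamp are hypothetical, and the helper names (`escape_tag`, `format_field`, `to_line_protocol`) are mine, not part of Telegraf or any InfluxDB client. Field names follow the grok pattern in the config.

```python
# Keys marked with the :tag modifier in the grok pattern
TAG_KEYS = ("method", "discourse_route", "response_code", "username")


def escape_tag(value):
    """Escape commas, equals signs and spaces in a tag value."""
    for ch in (",", "=", " "):
        value = value.replace(ch, "\\" + ch)
    return value


def format_field(value):
    """Render a field value: integers get an 'i' suffix, strings are quoted."""
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_line_protocol(measurement, record, timestamp_ns):
    """Build 'measurement,tags fields timestamp' from a flat record dict."""
    tags = ",".join(
        f"{key}={escape_tag(str(record[key]))}" for key in sorted(TAG_KEYS)
    )
    fields = ",".join(
        f"{key}={format_field(value)}"
        for key, value in sorted(record.items())
        if key not in TAG_KEYS
    )
    return f"{measurement},{tags} {fields} {timestamp_ns}"


# Hypothetical record, invented for this example.
record = {
    "method": "GET",
    "discourse_route": "list/latest",
    "response_code": "200",
    "username": "alice",
    "client_ip": "203.0.113.7",
    "request_path": "/latest.json",
    "http_version": "HTTP/1.1",
    "user_agent": "Mozilla/5.0",
    "resp_bytes": 5120,
    "x_referrer": "-",
    "resp_time": 0.052,
}

line = to_line_protocol("nginx_access", record, 1476107736000000000)
print(line)
```

Tags are indexed and stored as strings, which is why `response_code` is kept as a tag string while `resp_bytes` and `resp_time` stay numeric fields.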