It looks like Discourse spits out custom Nginx logs. There wouldn’t happen to be a grok pattern for those logs anywhere? I looked and was unable to find one.
I’m not aware of one. We haven’t written one ourselves because we collect all access log data through HAProxy.
We have this
It is in Ruby and is probably easy to port to an Elastic grok rule.
Thanks for the pointer! I’ll drop this here for anyone looking:
I’m parsing the Discourse logs with the Telegraf logparser plugin. The following config works to parse them:
[[inputs.logparser]]
  files = [
    "/var/discourse/shared/standalone/log/var-log/nginx/access.log",
    "/var/discourse/shared/standalone/log/var-log/nginx/access.log.1"
  ]
  from_beginning = true
  [inputs.logparser.grok]
    measurement="nginx_access"
    patterns=[
      '''\[%{CUSTOM_TIMESTAMP:timestamp:ts-httpd}\] %{IP:client_ip} "%{NOTSPACE:method:tag} %{NOTSPACE:request_path} %{NOTSPACE:http_version}" "%{DATA:user_agent}" "%{DATA:discourse_route:tag}" %{NUMBER:response_code:tag} %{NUMBER:resp_bytes:int} "%{DATA:x_referrer}" %{NUMBER} %{NUMBER:resp_time:float} "%{DATA:username:tag}"'''
    ]
    custom_patterns = '''
CUSTOM_TIMESTAMP %{MONTHDAY}/%{MONTH}/%{YEAR}:%{HOUR}:%{MINUTE}:%{SECOND} %{CUSTOM_TZ}
CUSTOM_TZ [+-][0-9]{4}
'''
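For anyone who wants to sanity-check the pattern outside Telegraf, here is a rough Python regex that mirrors the grok pattern field by field. It is only a sketch: the regex fragments standing in for `IP`, `NUMBER`, `NOTSPACE` and `DATA` are simplifications, and `SAMPLE` is a made-up line shaped to fit the pattern, not output from a real Discourse install.

```python
import re

# Rough regex equivalent of the grok pattern above (simplified stand-ins
# for the grok IP/NUMBER/NOTSPACE/DATA patterns; an assumption, not
# Telegraf's own implementation).
LINE_RE = re.compile(
    r'\[(?P<timestamp>\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\] '
    r'(?P<client_ip>\S+) '
    r'"(?P<method>\S+) (?P<request_path>\S+) (?P<http_version>\S+)" '
    r'"(?P<user_agent>.*?)" '
    r'"(?P<discourse_route>.*?)" '
    r'(?P<response_code>\d+) '
    r'(?P<resp_bytes>\d+) '
    r'"(?P<x_referrer>.*?)" '
    r'[\d.]+ '  # unnamed %{NUMBER}: matched but discarded, as in the config
    r'(?P<resp_time>[\d.]+) '
    r'"(?P<username>.*?)"'
)

# Hypothetical log line invented to fit the pattern; not real traffic.
SAMPLE = (
    '[12/Mar/2019:10:15:32 +0000] 203.0.113.7 "GET /latest.json HTTP/1.1" '
    '"Mozilla/5.0" "list/latest" 200 5120 "https://forum.example.com/" '
    '0.048 0.052 "alice"'
)


def parse_line(line):
    """Return a dict of captured fields, or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    # Apply the same type conversions the grok modifiers ask for.
    rec["resp_bytes"] = int(rec["resp_bytes"])
    rec["resp_time"] = float(rec["resp_time"])
    return rec


record = parse_line(SAMPLE)
```

If `parse_line` returns `None` for lines from your own `access.log`, the log format probably differs from what this pattern expects, and the grok pattern in the Telegraf config would likely need the same adjustment.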
The resulting measurement in InfluxDB has the following tags and fields:
{
  measurement: nginx_access,
  tags: [
    method,
    discourse_route,
    response_code,
    username
  ],
  fields: {
    client_ip: string,
    request_path: string,
    http_version: string,
    user_agent: string,
    resp_bytes: int,
    x_referrer: string,
    resp_time: float
  }
}
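To make the tag/field split concrete, here is a small sketch that renders one point in InfluxDB line protocol (`measurement,tags fields timestamp`). The helper names and the sample values are made up for illustration, and the escaping only covers the common cases (commas, equals signs and spaces in tag values; quotes and backslashes in string fields).

```python
def esc_tag(value):
    # Tag values: backslash-escape commas, equals signs and spaces.
    for ch in (",", "=", " "):
        value = value.replace(ch, "\\" + ch)
    return value


def esc_str(value):
    # String field values: double-quoted, with backslashes and quotes escaped.
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def fmt_field(value):
    if isinstance(value, int):
        return f"{value}i"      # integers carry an "i" suffix
    if isinstance(value, float):
        return f"{value}"
    return esc_str(value)


def to_line_protocol(measurement, tags, fields, ts_ns):
    """Render one point; tags are sorted by key, fields keep dict order."""
    tag_part = ",".join(f"{k}={esc_tag(v)}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={fmt_field(v)}" for k, v in fields.items())
    return f"{measurement},{tag_part} {field_part} {ts_ns}"


# Hypothetical values, chosen only to show the shape of a point.
line = to_line_protocol(
    "nginx_access",
    tags={
        "method": "GET",
        "discourse_route": "list/latest",
        "response_code": "200",
        "username": "alice",
    },
    fields={
        "client_ip": "203.0.113.7",
        "request_path": "/latest.json",
        "resp_bytes": 5120,
        "resp_time": 0.052,
    },
    ts_ns=1552385732000000000,
)
print(line)
```

Tags are indexed and always strings, which is why `response_code` ends up as the string `200` rather than an integer field.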