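Introduction
grok is a Logstash filter plugin that parses text log lines with regular expressions. It splits the raw message field into structured fields before the event is stored, which makes searching and aggregating in Kibana much easier.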
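To make the idea of "regular expression in, structured fields out" concrete, here is a minimal Python sketch. It is not how Logstash is implemented; it only shows that named capture groups play the same role as grok field names. The log line and the field names (clientip, verb, request, status) are hypothetical and chosen for illustration.

```python
import re

# Hypothetical, simplified log line used only for this illustration.
line = "192.168.10.1 GET /index.html 200"

# Each named group acts like a grok field name: it labels one piece of the line.
pattern = re.compile(
    r"(?P<clientip>\d{1,3}(?:\.\d{1,3}){3}) "
    r"(?P<verb>[A-Z]+) "
    r"(?P<request>\S+) "
    r"(?P<status>\d{3})"
)


def parse(text):
    """Return a dict of named fields, or None if the line does not match."""
    m = pattern.match(text)
    return m.groupdict() if m else None


fields = parse(line)
print(fields)
```

Every value comes back as a string here; grok behaves the same way unless a type such as :int is requested in the pattern.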
nginx log format
.....
http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';

    access_log /var/log/nginx/access.log main;

    sendfile on;
    #tcp_nopush on;

    keepalive_timeout 65;

    #gzip on;

    include /etc/nginx/conf.d/*.conf;
}
View the log output:
[root@centos6 nginx]# cat /var/log/nginx/access.log
192.168.10.132 - - [08/Jul/2019:12:53:45 +0800] "GET /saudgsg/bujguj HTTP/1.0" 200 1201 "-" "ApacheBench/2.3" "-"
192.168.10.132 - - [08/Jul/2019:12:53:45 +0800] "GET /saudgsg/bujguj HTTP/1.0" 200 1201 "-" "ApacheBench/2.3" "-"
192.168.10.132 - - [08/Jul/2019:12:53:45 +0800] "GET /saudgsg/bujguj HTTP/1.0" 200 1201 "-" "ApacheBench/2.3" "-"
192.168.10.1 - - [08/Jul/2019:12:54:36 +0800] "GET /indexfsd HTTP/1.1" 200 1201 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36 SE 2.X MetaSr 1.0" "-"
192.168.10.1 - - [08/Jul/2019:12:54:36 +0800] "GET /favicon.ico HTTP/1.1" 200 1320 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36 SE 2.X MetaSr 1.0" "-"
192.168.10.1 - - [08/Jul/2019:12:54:36 +0800] "GET /favicon.ico HTTP/1.1" 200 1320 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36 SE 2.X MetaSr 1.0" "-"
192.168.10.1 - - [08/Jul/2019:12:54:36 +0800] "POST /bs/base/searchIndexImage.htm?v=1&device=10 HTTP/1.1" 502 575 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36 SE 2.X MetaSr 1.0" "-"
192.168.10.1 - - [08/Jul/2019:12:54:36 +0800] "POST /bs/base/getArticleList.htm?v=1&device=10 HTTP/1.1" 502 575 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36 SE 2.X MetaSr 1.0" "-"
192.168.10.1 - - [08/Jul/2019:12:54:36 +0800] "POST /bs/thirdparthy/getShareUrl.htm?t=1562561715347 HTTP/1.1" 502 575 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36 SE 2.X MetaSr 1.0" "-"
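Writing the text filter
Logstash ships with a set of predefined regular expressions that we can reuse. They live in the following directory:
/usr/local/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-patterns-core-4.1.2/patterns
The most basic definitions are in the grok-patterns file. Some of those expressions do not fit our nginx fields, so in that case we define our own patterns and let grok load them through patterns_dir. An excerpt of the file is a useful reference for how patterns are written:
Here is a log filter pattern I wrote to match the log format of this particular nginx server. Readers who are less familiar with regular expressions may want to review a regular expression syntax reference first: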
[root@centos6 patterns]# vim nginx-access
NGINXACCESS %{IP:clientip} - (%{USERNAME:user}|-) \[%{HTTPDATE:timestamp}\] \"%{WORD:request_verb} %{NOTSPACE:request} HTTP/%{NUMBER:httpversion}\" %{NUMBER:status:int} %{NUMBER:body_sent:int} \"-\" \"%{GREEDYDATA:agent}\" \"-\"
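Before loading a pattern like this into Logstash, it can help to check it against a real log line. The Python sketch below expands the %{NAME:field} and %{NAME:field:int} tokens into named groups and applies the result to one of the access log lines shown earlier. It is only an approximation: the BASE table contains simplified stand-ins for the stock grok definitions (the real ones in grok-patterns are stricter and more general), and compile_grok and parse are hypothetical helper names, not part of any Logstash API.

```python
import re

# Simplified stand-ins for the stock grok definitions (assumptions, not copies).
BASE = {
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
    "USERNAME": r"[a-zA-Z0-9._-]+",
    "HTTPDATE": r"\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4}",
    "WORD": r"\w+",
    "NOTSPACE": r"\S+",
    "NUMBER": r"\d+(?:\.\d+)?",
    "GREEDYDATA": r".*",
}

# The NGINXACCESS pattern from the nginx-access file, joined onto one logical line.
NGINXACCESS = (
    r'%{IP:clientip} - (%{USERNAME:user}|-) \[%{HTTPDATE:timestamp}\] '
    r'\"%{WORD:request_verb} %{NOTSPACE:request} HTTP/%{NUMBER:httpversion}\" '
    r'%{NUMBER:status:int} %{NUMBER:body_sent:int} \"-\" \"%{GREEDYDATA:agent}\" \"-\"'
)

# Matches %{NAME:field} with an optional :type suffix.
TOKEN = re.compile(r"%\{(\w+):(\w+)(?::(\w+))?\}")


def compile_grok(expr):
    """Turn a grok-style expression into a compiled regex plus a cast table."""
    casts = {}

    def repl(m):
        name, field, cast = m.groups()
        if cast == "int":
            casts[field] = int
        return "(?P<%s>%s)" % (field, BASE[name])

    return re.compile(TOKEN.sub(repl, expr)), casts


def parse(line, regex, casts):
    """Return the extracted fields as a dict, or None when the line does not match."""
    m = regex.search(line)
    if m is None:
        return None
    return {
        key: (casts.get(key, str)(value) if value is not None else None)
        for key, value in m.groupdict().items()
    }


regex, casts = compile_grok(NGINXACCESS)
sample = ('192.168.10.132 - - [08/Jul/2019:12:53:45 +0800] '
          '"GET /saudgsg/bujguj HTTP/1.0" 200 1201 "-" "ApacheBench/2.3" "-"')
event = parse(sample, regex, casts)
print(event)
```

Note that the pattern hard-codes "-" for the referer and X-Forwarded-For fields, so a line that carries a real referer does not match at all. In Logstash such an event would be tagged _grokparsefailure, which is worth keeping in mind if this server later receives traffic with those headers set.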
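Writing the Logstash configuration file
A Logstash pipeline has the basic shape input >> codec >> filter >> codec >> output, where the codecs convert the text encoding and format of the data on the way in and on the way out.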
[root@centos6 bin]# vim nginx_access.conf
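input {
    file {
        path => "/var/log/nginx/access.log"    # path of the log file to read
    }
}

filter {
    grok {
        # directory that contains the custom nginx-access pattern file
        patterns_dir => "/usr/local/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-patterns-core-4.1.2/patterns"
        match => { "message" => "%{NGINXACCESS}" }    # pattern to apply to each line
        remove_field => "message"    # drop the raw line once it has been parsed
    }
}

output {
    stdout {
        codec => rubydebug    # print events to the screen for debugging
    }
}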
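Start Logstash to collect the log, then open a browser and visit the nginx server:
[root@centos6 bin]# ./logstash -f nginx_access.conf
The output looks like this:
On the left are the field names defined in the filter, plus some fields that Logstash adds on its own; on the right are the values extracted from each log line.
Once debugging confirms the output is correct, modify the configuration file further so that it outputs to elasticsearch: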