1 Logstash installation
1.1 Download the package
[root@node1 ~]# cd /usr/local/src/
[root@node1 src]# wget https://artifacts.elastic.co/downloads/logstash/logstash-7.4.2.tar.gz
[root@node1 src]# tar -xf logstash-7.4.2.tar.gz
[root@node1 src]# mv logstash-7.4.2 /usr/local/logstash
[root@node1 src]# cd /usr/local/logstash
1.2 Check the Java environment
[root@node1 logstash]# java -version
openjdk version "1.8.0_232"
OpenJDK Runtime Environment (build 1.8.0_232-b09)
OpenJDK 64-Bit Server VM (build 25.232-b09, mixed mode)
1.3 Start a simple test
A simple Logstash example:
[root@node1 logstash]# ./bin/logstash -e 'input {stdin {} } output { stdout {}}'
Startup is relatively slow.
Once it is running, anything typed on the console is echoed back to the console as an event; there is no filter stage, so nothing is parsed or transformed.
hello
{
    "@timestamp" => 2019-11-30T06:38:30.431Z,
      "@version" => "1",
          "host" => "node1",
       "message" => "hello"
}
nihao
{
    "@timestamp" => 2019-11-30T06:39:23.965Z,
      "@version" => "1",
          "host" => "node1",
       "message" => "nihao"
}
logstash test
{
    "@timestamp" => 2019-11-30T06:39:36.067Z,
      "@version" => "1",
          "host" => "node1",
       "message" => "logstash test"
}
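The -e flag accepts a full pipeline string, so a filter can be tried the same way. A minimal sketch (the mutate filter's uppercase option is standard, but this exact one-liner is an illustrative assumption, not taken from the steps above):

```
[root@node1 logstash]# ./bin/logstash -e 'input { stdin {} } filter { mutate { uppercase => ["message"] } } output { stdout {} }'
```

With this pipeline, typing hello should produce an event whose message field is "HELLO".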
2 Logstash configuration
2.1 Structure of a Logstash configuration file
A Logstash configuration file has a separate section for each type of plugin you add to the event processing pipeline:
# This is a comment. You should use comments to describe
# parts of your configuration.
input {            # input
  stdin { ... }    # standard input
}
filter {           # filter: parse, extract, and otherwise process the data
  ...
}
output {           # output
  stdout { ... }   # standard output
}
Each section contains the configuration options for one or more plugins. If you specify multiple filters, they are applied in the order in which they appear in the configuration file.
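As a sketch of that ordering (the stage field name is hypothetical, used only to illustrate):

```
filter {
  # runs first: create the field
  mutate { add_field => { "stage" => "first" } }
  # runs second: overwrite it; the event leaves the filter stage with stage == "second"
  mutate { replace => { "stage" => "second" } }
}
```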
2.2 Plugin configuration
A plugin configuration consists of the plugin name followed by a block of settings for that plugin. For example, this input section configures two file inputs:
input {
  file {
    path => "/var/log/messages"
    type => "syslog"
  }
  file {
    path => "/var/log/apache/access.log"
    type => "apache"
  }
}
A codec is the name of a Logstash codec used to represent the data. Codecs can be used in both inputs and outputs.
An input codec provides a convenient way to decode the data before it enters the pipeline; an output codec encodes the data before it leaves the output. Using an input or output codec removes the need for a separate filter in the Logstash pipeline. For example:
codec => "json"
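Putting that together, a minimal sketch of a pipeline that decodes JSON on input and pretty-prints on output, with no filter section at all (assuming each stdin line is a JSON object):

```
input {
  stdin { codec => "json" }       # decode each line as JSON before it enters the pipeline
}
output {
  stdout { codec => rubydebug }   # pretty-print the decoded event for debugging
}
```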
2.3 An official configuration example
The following example shows how to configure Logstash to filter events, process Apache logs and syslog messages, and use conditionals to control which events are handled by a filter or output.
Configuring filters
Filters are an in-line processing mechanism that provides the flexibility to slice and dice your data to fit your needs. Let's look at some filters in action. The following configuration file sets up the grok and date filters:
[root@node1 logstash]# vi logstash-filter.conf
input { stdin { } }
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}
output {
  elasticsearch { hosts => ["localhost:9200"] }
  stdout { codec => rubydebug }
}
Run it:
[root@node1 logstash]# bin/logstash -f logstash-filter.conf
Paste the following line into your terminal and press Enter; it will be processed by the stdin input:
127.0.0.1 - - [11/Dec/2013:00:01:45 -0800] "GET /xampp/status.php HTTP/1.1" 200 3891 "http://cadenza/xampp/navi.php" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0"
You should see something like this returned to stdout:
127.0.0.1 - - [11/Dec/2013:00:01:45 -0800] "GET /xampp/status.php HTTP/1.1" 200 3891 "http://cadenza/xampp/navi.php" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0"
{
       "clientip" => "127.0.0.1",
           "auth" => "-",
           "verb" => "GET",
        "request" => "/xampp/status.php",
          "bytes" => "3891",
           "host" => "node1",
     "@timestamp" => 2013-12-11T08:01:45.000Z,
        "message" => "127.0.0.1 - - [11/Dec/2013:00:01:45 -0800] \"GET /xampp/status.php HTTP/1.1\" 200 3891 \"http://cadenza/xampp/navi.php\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0\"",
       "@version" => "1",
       "response" => "200",
          "ident" => "-",
    "httpversion" => "1.1",
       "referrer" => "\"http://cadenza/xampp/navi.php\"",
          "agent" => "\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0\"",
      "timestamp" => "11/Dec/2013:00:01:45 -0800"
}
As you can see, Logstash (with the help of the grok filter) was able to parse the log line (which happens to be in Apache "combined log" format) and break it up into many different discrete bits of information. This becomes very useful once you start querying and analyzing the log data; for example, you can easily run reports on HTTP response codes, IP addresses, referrers, and so on. Logstash ships with many grok patterns out of the box, so if you need to parse a common log format, chances are someone has already done the work for you.
2.4 Processing Apache logs
[root@node1 logstash]# vi logstash-apache.conf
input {
  file {
    path => "/tmp/access_log"
    start_position => "beginning"
  }
}
filter {
  if [path] =~ "access" {
    mutate { replace => { "type" => "apache_access" } }
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}
output {
  elasticsearch { hosts => ["localhost:9200"] }
  stdout { codec => rubydebug }
}
Run it:
[root@node1 logstash]# bin/logstash -f logstash-apache.conf
[2019-11-30T02:28:42,685][INFO ][filewatch.observingtail  ][main] START, creating Discoverer, Watch with file and sincedb collections
[2019-11-30T02:28:43,130][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
Then feed it the following log entries (or use some logs from your own web server):
71.141.244.242 - kurt [18/May/2011:01:48:10 -0700] "GET /admin HTTP/1.1" 301 566 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3"
134.39.72.245 - - [18/May/2011:12:40:18 -0700] "GET /favicon.ico HTTP/1.1" 200 1189 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; InfoPath.2; .NET4.0C; .NET4.0E)"
98.83.179.51 - - [18/May/2011:19:35:08 -0700] "GET /css/main.css HTTP/1.1" 200 1837 "http://www.safesand.com/information.htm" "Mozilla/5.0 (Windows NT 6.0; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1"
Append them to the watched file:
[root@node1 ~]# echo '71.141.244.242 - kurt [18/May/2011:01:48:10 -0700] "GET /admin HTTP/1.1" 301 566 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3"' >> /tmp/access_log
[root@node1 ~]# echo '134.39.72.245 - - [18/May/2011:12:40:18 -0700] "GET /favicon.ico HTTP/1.1" 200 1189 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; InfoPath.2; .NET4.0C; .NET4.0E)"' >> /tmp/access_log
[root@node1 ~]# echo '98.83.179.51 - - [18/May/2011:19:35:08 -0700] "GET /css/main.css HTTP/1.1" 200 1837 "http://www.safesand.com/information.htm" "Mozilla/5.0 (Windows NT 6.0; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1"' >> /tmp/access_log
View the data just ingested. The raw document in Elasticsearch looks like this:
{
  "_index": "logstash",
  "_type": "_doc",
  "_id": "WK08u24BcnOPLK2r2Wfj",
  "_version": 1,
  "_score": 1,
  "_source": {
    "request": "/favicon.ico",
    "agent": "\"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; InfoPath.2; .NET4.0C; .NET4.0E)\"",
    "ident": "-",
    "clientip": "134.39.72.245",
    "verb": "GET",
    "type": "apache_access",
    "path": "/tmp/access_log",
    "@version": "1",
    "message": "134.39.72.245 - - [18/May/2011:12:40:18 -0700] \"GET /favicon.ico HTTP/1.1\" 200 1189 \"-\" \"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; InfoPath.2; .NET4.0C; .NET4.0E)\"",
    "timestamp": "18/May/2011:12:40:18 -0700",
    "bytes": "1189",
    "referrer": "\"-\"",
    "auth": "-",
    "httpversion": "1.1",
    "response": "200",
    "host": "node1",
    "@timestamp": "2011-05-18T19:40:18.000Z"
  }
}
There are many more official examples; we will experiment with them gradually later on.
3 Reading logs with a custom structure
Earlier we read nginx logs with Filebeat. Logs with a custom structure need to be parsed before they can be used, and this is where Logstash comes in: it has powerful processing capabilities and can handle all kinds of scenarios.
3.1 Log structure
2019-03-15 21:21:21 ERROR|讀取數據出錯|參數:id=1002
As you can see, the fields in the log line are separated with "|", so when processing it we also need to split the data on that character.
3.2 Writing a configuration file to split the log
[root@node1 logstash]# vi pipeline.conf
input {
  file {
    path => "/tmp/access_log"
    start_position => "beginning"
  }
}
filter {
  mutate {
    split => { "message" => "|" }
  }
}
output {
  stdout { codec => rubydebug }
}
Start it to test:
[root@node1 logstash]# bin/logstash -f pipeline.conf
[2019-11-30T03:19:10,634][INFO ][filewatch.observingtail  ][main] START, creating Discoverer, Watch with file and sincedb collections
[2019-11-30T03:19:11,079][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
/usr/local/logstash/vendor/bundle/jruby/2.5.0/gems/awesome_print-1.7.0/lib/awesome_print/formatters/base_formatter.rb:31: warning: constant ::Fixnum is deprecated
Feed in a log line:
[root@node1 ~]# echo "2019-03-15 21:21:21 ERROR|讀取數據出錯|參數:id=1002" >>/tmp/access_log
The output shows:
{
       "message" => [
        [0] "2019-03-15 21:21:21 ERROR",
        [1] "讀取數據出錯",
        [2] "參數:id=1002"
    ],
      "@version" => "1",
          "host" => "node1",
    "@timestamp" => 2019-11-30T08:20:06.666Z,
          "path" => "/tmp/access_log"
}
3.3 Labeling the split log fields
[root@node1 logstash]# vi pipeline.conf
input {
  file {
    path => "/tmp/access_log"
    start_position => "beginning"
  }
}
filter {
  mutate {
    split => { "message" => "|" }
  }
  mutate {
    add_field => {
      "Time"   => "%{message[0]}"
      "result" => "%{message[1]}"
      "userID" => "%{message[2]}"
    }
  }
}
output {
  stdout { codec => rubydebug }
}
[root@node1 logstash]# bin/logstash -f pipeline.conf
This configuration raises an error, and the added fields do not appear:
[2019-11-30T04:13:11,739][WARN ][logstash.filters.mutate ][main] Exception caught while applying mutate filter {:exception=>"Invalid FieldReference: `message[0]`"}
Modify pipeline.conf as follows, using the bracketed field-reference syntax:
input {
  file {
    path => "/tmp/access_log"
    start_position => "beginning"
  }
}
filter {
  mutate {
    split => { "message" => "|" }
  }
  mutate {
    add_field => {
      "Date"   => "%{[message][0]}"
      "Leverl" => "%{[message][1]}"
      "result" => "%{[message][2]}"
      "userID" => "%{[message][3]}"
    }
  }
}
output {
  stdout { codec => rubydebug }
}
Run it:
[root@node1 logstash]# bin/logstash -f pipeline.conf
Feed in a log line:
[root@node1 ~]# echo "2019-03-15 21:21:21| ERROR|讀取數據出錯|參數:id=1002" >>/tmp/access_log
Console output:
{
          "path" => "/tmp/access_log",
      "@version" => "1",
          "Date" => "2019-03-15 21:21:21",
          "host" => "node1",
       "message" => [
        [0] "2019-03-15 21:21:21",
        [1] " ERROR",
        [2] "讀取數據出錯",
        [3] "參數:id=1002"
    ],
    "@timestamp" => 2019-11-30T09:18:19.569Z,
        "Leverl" => " ERROR",
        "result" => "讀取數據出錯",
        "userID" => "參數:id=1002"
}
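Note that @timestamp here is still the ingestion time, and Date is only a string. Following the date filter from section 2.3, a sketch of a follow-up filter that would promote Date into @timestamp (the format string is assumed from the sample log layout and is untested here):

```
filter {
  date {
    match  => [ "Date", "yyyy-MM-dd HH:mm:ss" ]   # parse the custom timestamp string
    target => "@timestamp"                        # replace the ingest-time default
  }
}
```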
That concludes the Logstash experiments for now; a more comprehensive exercise will follow later.