ElasticSearch Advanced (7): Using Logstash, a Data Transformation Tool

Original

2020-06-28 00:27

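Preface

This chapter covers the basic use of Logstash, a data transformation tool.

Method

1. Concepts

From the preparation chapter, we know that Logstash is an open-source tool that runs on the JVM and is used to collect, parse, and store logs. Its most important job is transforming the logs we collect so that they are easier to analyze.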

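For an overview of Logstash, see the official product page: https://www.elastic.co/cn/products/logstash

Note: this example collects nginx logs for the demonstration, so make sure nginx is already installed.

2. Installing Logstash and testing the configuration

A specific version of Logstash can be downloaded from the official site: https://www.elastic.co/cn/downloads/logstash

This walkthrough uses the Windows build of version 7.4.0.

First, go to the config directory. Because my machine has limited memory, I changed the heap settings in jvm.options as follows: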

## JVM configuration

# Xms represents the initial size of total heap space
# Xmx represents the maximum size of total heap space

-Xms256m
-Xmx256m

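Next, copy the logstash-sample.conf configuration file into the bin directory and rename it to logstash-file.conf. Leave it there for now; we will edit it later.

Run the following command from the bin directory: logstash.bat -e "input { stdin { } } output { stdout {} }"

As you probably know, this reads events from standard input and writes them to standard output.

Type hello world in the console; Logstash prints the resulting event back to the console.

This shows that Logstash has been set up successfully.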
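
To make the test above more concrete: Logstash wraps every line typed on the console into an event and prints that event. The following Python sketch only illustrates the idea and is not how Logstash is implemented. The function name make_event is hypothetical, and the field set (message, @version, @timestamp, host) is an assumption based on what a stdin event typically looks like in Logstash 7.x.

```python
import json
import socket
from datetime import datetime, timezone


def make_event(line: str) -> dict:
    """Wrap one input line in an event-like dict (illustrative only)."""
    now = datetime.now(timezone.utc).isoformat(timespec="milliseconds")
    return {
        "message": line.rstrip("\r\n"),            # the raw text that was typed
        "@version": "1",
        "@timestamp": now.replace("+00:00", "Z"),  # UTC time the event was created
        "host": socket.gethostname(),              # machine that produced the event
    }


# Print the event for the same input used in the console test.
print(json.dumps(make_event("hello world\n"), indent=2))
```

The point to take away is that the original text always ends up in the message field, which matters later when we parse nginx log lines.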

3. Using Logstash to send nginx logs read by Filebeat to Elasticsearch

1) Configure the Filebeat configuration file (filebeat-nginx.yml in this example)

#=========================== Filebeat inputs =============================

filebeat.inputs:

- type: log
  enabled: true
  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - F:\nginx-1.17.5\logs\*.log

#==================== Elasticsearch template setting ==========================

setup.template.settings:
  index.number_of_shards: 1

#================================ Outputs =====================================

# Configure what output to use when sending the data collected by the beat.
#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["localhost:5044"]

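Start Filebeat with the command: filebeat.exe -e -c filebeat-nginx.yml -d "publish"

2) Configure the Logstash configuration file

Open logstash-file.conf, the configuration file we copied into the bin directory earlier, and change its contents to the following: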

# Sample Logstash configuration for creating a simple
# Beats -> Logstash -> Elasticsearch pipeline.

input {
  beats {
    port => 5044
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "filebeat-test"
  }
}

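Start Logstash with the command: logstash.bat -f logstash-file.conf --config.reload.automatic

3) Start ES and inspect the data in ES

We can see that the data has reached ES. However, the log data has not been parsed: each log line is stored whole in the message field, which makes later analysis difficult. This is what Logstash filters are for.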
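
One way to inspect what landed in ES is to query the index over its REST API. Below is a minimal Python sketch, assuming ES listens on http://localhost:9200 and the index is filebeat-test as configured above. The helper names build_search_url and fetch_sources are hypothetical.

```python
import json
import urllib.request


def build_search_url(host: str, index: str, size: int = 1) -> str:
    """Build the URL of a simple search request against one index."""
    return f"{host.rstrip('/')}/{index}/_search?size={size}"


def fetch_sources(host: str = "http://localhost:9200",
                  index: str = "filebeat-test",
                  size: int = 1) -> list:
    """Return the _source of the first `size` documents (needs a running ES)."""
    url = build_search_url(host, index, size)
    with urllib.request.urlopen(url, timeout=5) as resp:
        body = json.load(resp)
    return [hit["_source"] for hit in body["hits"]["hits"]]


# Usage, once ES, Logstash and Filebeat are running:
#   for doc in fetch_sources():
#       print(doc["message"])
```

At this stage each returned document should carry the whole nginx log line in its message field.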

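To learn filters, we first need to understand how Logstash works. Its processing flow is a pipeline.

It has three main stages: INPUTS, FILTERS, and OUTPUTS.

INPUTS: bring data into Logstash.
FILTERS: intermediate processing that operates on the data.
OUTPUTS: the final stage of the Logstash pipeline, which sends events to their destination.

Of the three, FILTERS is the hardest to learn to write!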
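
The three stages can be sketched in a few lines of Python. This is only an illustration of the data flow, not how Logstash is implemented; run_pipeline and add_length are hypothetical names.

```python
from typing import Callable, Dict, Iterable, List

Event = Dict[str, object]


def add_length(event: Event) -> Event:
    """Example filter: add a field derived from the message."""
    event["length"] = len(str(event["message"]))
    return event


def run_pipeline(lines: Iterable[str],
                 filters: List[Callable[[Event], Event]],
                 outputs: List[Callable[[Event], None]]) -> None:
    for line in lines:                    # INPUTS: one event per incoming line
        event: Event = {"message": line}
        for apply_filter in filters:      # FILTERS: applied in order
            event = apply_filter(event)
        for emit in outputs:              # OUTPUTS: every output receives the event
            emit(event)


collected: List[Event] = []
run_pipeline(["hello world"], [add_length], [collected.append])
print(collected)  # [{'message': 'hello world', 'length': 11}]
```

Filters run in the order they are listed, so a later filter can rely on fields that an earlier one added.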

4) Write a basic FILTERS section to parse the log line in the message field

The grok pattern used in this example comes from: https://github.com/logstash-plugins/logstash-patterns-core/blob/master/patterns/httpd

# Sample Logstash configuration for creating a simple
# Beats -> Logstash -> Elasticsearch pipeline.

input {
  beats {
    port => 5044
  }
}

filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "filebeat-test"
  }
}

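There is no need to restart Logstash this time, because we started it with --config.reload.automatic.

Refresh the nginx page a few times and observe the subsequent output.

We can see that Logstash has added a number of useful fields that break the content of message apart, which helps with later analysis and aggregation.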
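
To get a feel for what the grok filter does with a line, here is a simplified Python sketch that splits a combined-format access log line using named groups. It is not the real COMBINEDAPACHELOG pattern: the regular expression is a hand-written approximation, the field names are assumed from the legacy pattern definition, and the sample line is made up for illustration.

```python
import re
from typing import Optional

# Rough approximation of the combined access log format (illustrative only).
COMBINED_LOG = re.compile(
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>[A-Z]+) (?P<request>\S+) HTTP/(?P<httpversion>[\d.]+)" '
    r'(?P<response>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line: str) -> Optional[dict]:
    """Return the named fields, or None when the line does not match."""
    match = COMBINED_LOG.match(line)
    return match.groupdict() if match else None


# Hypothetical sample line, not taken from a real log.
sample = ('127.0.0.1 - - [28/Jun/2020:00:27:00 +0800] '
          '"GET /index.html HTTP/1.1" 200 612 "-" "Mozilla/5.0"')
fields = parse_line(sample)
print(fields["clientip"], fields["verb"], fields["request"], fields["response"])
```

When a line does not match, this sketch returns None; Logstash instead keeps the event and tags it with _grokparsefailure, which is worth checking for if the new fields do not appear.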

For more on writing filters, see the official documentation: https://www.elastic.co/guide/en/logstash/current/index.html
