Monitoring Flink with Prometheus + Grafana

Original

2021-01-30 10:50

Tags (space-separated): Flink series

1: Introduction to Flink monitoring

2: Flink's metrics architecture

3: Deploying Prometheus + Grafana monitoring for Flink

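1: Introduction to Flink monitoring

1.1 Preface

Flink's Metrics system collects indicators from inside Flink so that developers can better understand the state of a job or cluster. Once a cluster is running, it is hard to see what is actually happening inside it: whether a job is running slowly or quickly, or whether something is failing. Developers cannot watch every Task log in real time, especially when a job is large or there are many jobs. Metrics give developers a practical way to see a job's current state. For many medium and large enterprises, managing the jobs on a cluster mostly means tracking their fine-grained, real-time running state: year-over-year and period-over-period throughput comparisons, an overview of the tasks running on the cluster, cluster load levels, or the health of an ETL framework built on Flink. That calls for a dedicated monitoring system for the cluster's jobs.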

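2: Flink's metrics architecture

2.1 Flink metrics

Flink Metrics is the library Flink uses to collect runtime information. It covers the system metrics Flink itself provides, such as CPU, memory, thread usage, JVM garbage collection, network, and I/O, and it also lets you instrument your own code with user-defined metrics by extending or implementing the designated classes and interfaces.
With Flink Metrics you can easily:
• collect Flink's built-in metrics, or custom metrics you define, in real time and display them;
• collect the same information through Flink's REST API and feed it into a third-party system for display.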
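
As a rough illustration of the REST API route, the sketch below builds a request for JobManager metrics and parses the response. It assumes the Flink REST endpoint listens on port 8081 (Flink's default) and that the response is a JSON list of `id`/`value` pairs; the helper names `flink_metrics_url`, `parse_metrics`, and `fetch_jobmanager_metrics` are invented for this example.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def flink_metrics_url(base_url, metric_names):
    """Build the JobManager metrics URL for the given metric names."""
    # The REST API takes a comma-separated list in the `get` query parameter.
    names = ",".join(quote(name, safe="") for name in metric_names)
    return f"{base_url.rstrip('/')}/jobmanager/metrics?get={names}"


def parse_metrics(payload):
    """Turn a [{"id": ..., "value": ...}] JSON response into a dict."""
    return {item["id"]: item["value"] for item in json.loads(payload)}


def fetch_jobmanager_metrics(base_url, metric_names, timeout=5):
    """Fetch the named JobManager metrics from a running cluster."""
    url = flink_metrics_url(base_url, metric_names)
    with urlopen(url, timeout=timeout) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

Against a live cluster, a call such as `fetch_jobmanager_metrics("http://<jobmanager-host>:8081", ["Status.JVM.Memory.Heap.Used"])` should return a one-entry dict. Values arrive as strings, so convert them before doing arithmetic.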

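2.2 Monitoring architecture

Architecturally, Flink metrics can be read in two ways. The first is the REST API, which is how the Flink Web UI obtains the metrics it displays. The second is a reporter, which sends metrics to an external system. Flink ships with reporters for JMX, Graphite, Prometheus, and other systems, and also supports custom reporters.
The Flink Web UI exposes only a small number of metrics and has no time-series view, so it cannot meet production monitoring needs. Prometheus + Grafana is a widely used, free, open-source monitoring stack that is easy to get started with and feature-complete.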

3: Deploying Prometheus + Grafana monitoring for Flink

3.1 Install Prometheus

Prometheus is itself an exporter: it exposes metrics about its own memory usage, garbage collection, performance, and health.
Prometheus official download page:

https://prometheus.io/download/
wget https://github.com/prometheus/prometheus/releases/download/v2.21.0/prometheus-2.21.0.linux-amd64.tar.gz

# tar zxvf prometheus-2.21.0.linux-amd64.tar.gz

# mv prometheus-2.21.0.linux-amd64 /usr/local/prometheus

# chmod +x /usr/local/prometheus/prom*

# cp -rp /usr/local/prometheus/promtool /usr/bin/

3.2 Configure Prometheus

At the end of the scrape configuration in prometheus.yml, add a job that scrapes Pushgateway.

Here Pushgateway and Prometheus are installed on the same machine.

- job_name: 'linux'
  static_configs:
    - targets: ['192.168.100.15:9100']
      labels:
        app: node05
        nodename: node05.vpc.flyfish.cn
        role: node
- job_name: 'pushgateway'
  static_configs:
    - targets: ['192.168.100.15:9091']
      labels:
        instance: 'pushgateway'

Start Prometheus at boot:

cat > /usr/lib/systemd/system/prometheus.service <<EOF
[Unit]
Description=Prometheus
[Service]
ExecStart=/usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml --storage.tsdb.path=/usr/local/prometheus/data --web.enable-lifecycle --storage.tsdb.retention.time=180d
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF

# systemctl daemon-reload
# systemctl start prometheus
# systemctl enable prometheus

3.3 Install node_exporter and Pushgateway for Prometheus

node_exporter:

    # tar -zxvf node_exporter-1.0.1.linux-amd64.tar.gz
    # mv node_exporter-1.0.1.linux-amd64 /usr/local/node_exporter
    # /usr/local/node_exporter/node_exporter &

pushgateway:

    # tar -zxvf pushgateway-1.2.0.linux-amd64.tar.gz
    # mv pushgateway-1.2.0.linux-amd64 /usr/local/pushgateway
    # /usr/local/pushgateway/pushgateway &

3.4 Flink metrics configuration

Append the following to the end of flink-conf.yaml:
----
metrics.reporter.promgateway.class: org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter
metrics.reporter.promgateway.host: 192.168.100.15
metrics.reporter.promgateway.port: 9091
metrics.reporter.promgateway.jobName: pushgateway
metrics.reporter.promgateway.randomJobNameSuffix: true
metrics.reporter.promgateway.deleteOnShutdown: true
----

Then sync the updated flink-conf.yaml to every Flink worker node.

Restart the Flink cluster:

./stop-cluster.sh

./start-cluster.sh

3.4.1 Open the Pushgateway page
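
Besides opening the Pushgateway page in a browser, you can check from a script that Flink is pushing metrics. The sketch below is only an illustration: it assumes Pushgateway is reachable at 192.168.100.15:9091 as configured above, and that Flink's Prometheus reporter prefixes metric names with `flink_`. The helper names are invented for this example.

```python
from urllib.request import urlopen


def flink_metric_names(exposition_text, prefix="flink_"):
    """Return the sorted, de-duplicated metric names that start with `prefix`."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line looks like: name{label="v",...} value
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if name.startswith(prefix):
            names.add(name)
    return sorted(names)


def fetch_pushgateway_metrics(host="192.168.100.15", port=9091, timeout=5):
    """Download the raw text that Pushgateway exposes on /metrics."""
    with urlopen(f"http://{host}:{port}/metrics", timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `flink_metric_names(fetch_pushgateway_metrics())` on the monitoring host should list the Flink metrics currently held by Pushgateway. An empty list suggests the reporter settings in flink-conf.yaml have not taken effect yet.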

3.4.2 Prometheus page
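
The same check can be made against Prometheus itself through its HTTP API. This is a minimal sketch, assuming Prometheus listens on its default port 9090 on the monitoring host; the function names `query_url` and `parse_instant_vector` are invented for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def query_url(prometheus_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    params = urlencode({"query": promql})
    return f"{prometheus_url.rstrip('/')}/api/v1/query?{params}"


def parse_instant_vector(payload):
    """Return [(labels, float_value), ...] from an instant-query response."""
    body = json.loads(payload)
    if body.get("status") != "success":
        raise ValueError(f"query failed: {body.get('error', 'unknown error')}")
    # Each sample's "value" is a [timestamp, "value-as-string"] pair.
    return [(s["metric"], float(s["value"][1])) for s in body["data"]["result"]]


def run_query(prometheus_url, promql, timeout=5):
    """Execute an instant query against a running Prometheus server."""
    with urlopen(query_url(prometheus_url, promql), timeout=timeout) as resp:
        return parse_instant_vector(resp.read().decode("utf-8"))
```

For example, `run_query("http://192.168.100.15:9090", 'up{job="pushgateway"}')` should return a sample with value 1.0 once the pushgateway job is being scraped. Querying a Flink metric such as `flink_jobmanager_numRunningJobs` confirms the whole path from Flink to Prometheus.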

3.4.5 Prometheus data source in Grafana
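
The data source is normally added in the Grafana UI, but it can also be created through Grafana's HTTP API. The sketch below only builds the request and leaves sending it to you. It assumes Grafana on its default port 3000, Prometheus on port 9090, and an API token you have already created in Grafana; the data source name and helper names are placeholders.

```python
import json
from urllib.request import Request


def prometheus_datasource(name, prometheus_url, is_default=True):
    """Build the JSON body for Grafana's POST /api/datasources endpoint."""
    return {
        "name": name,
        "type": "prometheus",
        "url": prometheus_url,
        # "proxy" means the Grafana server, not the browser, queries Prometheus.
        "access": "proxy",
        "isDefault": is_default,
    }


def datasource_request(grafana_url, api_token, datasource):
    """Wrap the data source definition in an authenticated POST request."""
    return Request(
        f"{grafana_url.rstrip('/')}/api/datasources",
        data=json.dumps(datasource).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result of `datasource_request("http://192.168.100.15:3000", token, prometheus_datasource("Prometheus", "http://192.168.100.15:9090"))` to `urllib.request.urlopen` would register the data source. After that, Flink dashboards can be built on the `flink_`-prefixed metrics.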
