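1. Abstract
This article describes how to use blackbox_exporter to collect the website status, port status, and similar information of monitored hosts, and how to display the results as Grafana dashboards with the help of Prometheus.
blackbox_exporter is one of the exporters officially provided by the Prometheus project. It collects monitoring data by probing over http, dns, tcp, and icmp.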
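2. blackbox_exporter use cases
HTTP tests
  Define request header information
  Check the HTTP status / HTTP response headers / HTTP body content
TCP tests
  Monitor the port status of business components
  Define and monitor application-layer protocols
ICMP tests
  Host liveness checks
POST tests
  Interface connectivity
SSL certificate expiry time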
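As a small illustration of the SSL certificate expiry use case: the HTTP and TCP probes of blackbox_exporter report the metric probe_ssl_earliest_cert_expiry as a Unix timestamp in seconds. The sketch below converts such a value into whole days remaining. The function name cert_days_left and the sample timestamps are made up for illustration.

```shell
# cert_days_left EXPIRY NOW
# EXPIRY: value of probe_ssl_earliest_cert_expiry (Unix seconds)
# NOW:    current time in Unix seconds, e.g. "$(date +%s)"
# Prints the whole number of days until the certificate expires.
cert_days_left() {
    echo $(( ($1 - $2) / 86400 ))
}

# Example with made-up timestamps exactly ten days apart:
cert_days_left 1700864000 1700000000
```

In practice the same arithmetic is usually done in PromQL or in an alert rule; the shell version only shows what the raw value means.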
3. Installing blackbox_exporter
3.1 All blackbox_exporter releases are listed at:
https://github.com/prometheus/blackbox_exporter/releases
Taking Linux as an example, download the precompiled binary package and extract it:
$ wget https://github.com/prometheus/blackbox_exporter/releases/download/v0.15.1/blackbox_exporter-0.15.1.linux-amd64.tar.gz
$ tar -xvf blackbox_exporter-0.15.1.linux-amd64.tar.gz
$ mv blackbox_exporter-0.15.1.linux-amd64 /usr/local/blackbox_exporter
3.2 Verify the installation
[root@izuf61mqd75uk09tjnh7dfz local]# cd blackbox_exporter/
[root@izuf61mqd75uk09tjnh7dfz blackbox_exporter]# ./blackbox_exporter --version
blackbox_exporter, version 0.15.1 (branch: HEAD, revision: 7dd86a593b5a2270e738be1654d9c112509e46ce)
build user: root@626ba8fd110c
build date: 20190917-12:31:25
go version: go1.13
3.3 Create a systemd service
$ vim /lib/systemd/system/blackbox_exporter.service
[Unit]
Description=blackbox_exporter
After=network.target
[Service]
User=root
Type=simple
ExecStart=/usr/local/blackbox_exporter/blackbox_exporter --config.file=/usr/local/blackbox_exporter/blackbox.yml
Restart=on-failure
[Install]
WantedBy=multi-user.target
If blackbox_exporter runs as a non-root user, the icmp prober needs the CAP_NET_RAW capability. Set it on the blackbox_exporter executable with the following commands:
$ cd /usr/local/blackbox_exporter
$ setcap cap_net_raw+ep blackbox_exporter
3.4 Start blackbox_exporter
$ systemctl daemon-reload
$ systemctl start blackbox_exporter
3.5 Verify that it started successfully. The default listening port is 9115.
$ systemctl status blackbox_exporter
$ netstat -lnpt|grep 9115
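Beyond checking the port, a probe can also be requested by hand. blackbox_exporter serves probes on /probe with module and target query parameters and answers in the Prometheus text format, where probe_success is 1 on success and 0 on failure. The following is a minimal sketch, assuming the exporter listens on 127.0.0.1:9115 and the http_2xx module is configured. The helper name metric_value and the sample metrics are illustrative and not captured from a real run.

```shell
# metric_value NAME
# Reads Prometheus text-format metrics on stdin and prints the value of the
# label-free metric NAME (comment lines start with "#" and never match).
metric_value() {
    awk -v name="$1" '$1 == name { print $2 }'
}

# Offline demonstration with hand-written sample metrics:
printf '%s\n' \
    '# HELP probe_success Displays whether or not the probe was a success' \
    '# TYPE probe_success gauge' \
    'probe_success 1' |
    metric_value probe_success

# Against a running exporter (assumed address and target):
# curl -s 'http://127.0.0.1:9115/probe?module=http_2xx&target=https://www.ssssss.cn' | metric_value probe_success
```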
4. Add blackbox_exporter to prometheus.yml
4.1 Monitor website status
$ vim /usr/local/prometheus/prometheus.yml
  # Add this job under scrape_configs.
  - job_name: web_status
    metrics_path: /probe
    params:
      module: [http_2xx]
    static_configs:
      - targets: ['https://www.ssssss.cn']
        labels:
          instance: web_status
          group: web
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - target_label: __address__
        replacement: 172.19.14.253:9115
Monitor host liveness:
$ vim /usr/local/prometheus/prometheus.yml
  - job_name: 'node_status'
    metrics_path: /probe
    params:
      module: [icmp]
    static_configs:
      - targets: ['172.19.14.253']
        labels:
          instance: 'node_status'
          group: 'node'
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      # - source_labels: [__param_target]
      #   target_label: instance
      - target_label: __address__
        replacement: 172.19.14.253:9115
Monitor host port status:
$ vim /usr/local/prometheus/prometheus.yml
  - job_name: 'port_status'
    metrics_path: /probe
    params:
      module: [tcp_connect]
    static_configs:
      - targets: ['172.19.14.253:3306','172.19.14.253:80']
        labels:
          instance: 'port_status'
          group: 'tcp'
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      # With the static label above, both targets share instance 'port_status'.
      # Uncomment the next two lines to use each probed target as its instance label.
      # - source_labels: [__param_target]
      #   target_label: instance
      - target_label: __address__
        replacement: 172.19.14.253:9115
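All three jobs use the same relabel_configs pattern. The configured target is first copied from __address__ into __param_target, which Prometheus sends as the target query parameter. Then __address__ is overwritten with the exporter's address, so the scrape goes to blackbox_exporter rather than to the target itself. The sketch below mimics that rewrite to show roughly which URL ends up being scraped. scrape_url is a hypothetical helper, and unlike Prometheus it does not percent-encode the target.

```shell
# scrape_url EXPORTER MODULE TARGET
# Mimics the two relabeling steps used in the jobs above.
scrape_url() {
    address="$3"               # __address__ starts out as the configured target
    param_target="$address"    # step 1: copy __address__ into __param_target
    address="$1"               # step 2: replace __address__ with the exporter host:port
    printf 'http://%s/probe?module=%s&target=%s\n' "$address" "$2" "$param_target"
}

scrape_url 172.19.14.253:9115 tcp_connect 172.19.14.253:3306
```

The printed URL can be pasted into curl to reproduce by hand what Prometheus requests for that target.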
4.2 Check that the configuration file is valid
$ cd /usr/local/prometheus
[root@iZuf6ioqjurm6w0x1o7exjZ prometheus]# ./promtool check config prometheus.yml
Checking prometheus.yml
SUCCESS: 0 rule files found
Restart Prometheus to load the new configuration:
[root@iZuf6ioqjurm6w0x1o7exjZ prometheus]# systemctl restart prometheus
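5. Add blackbox_exporter monitoring data to Grafana
5.1 Import the blackbox_exporter dashboard template.
This is dashboard template 9965. Select Prometheus as the data source. Template download address: https://grafana.com/grafana/dashboards/9965
The template requires the pie chart plugin. Restart Grafana after installing it for the plugin to take effect.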
$ grafana-cli plugins install grafana-piechart-panel
$ service grafana-server restart
Note: check that the plugin's installation directory is inside the Grafana plugin directory.
5.2 Access Grafana
6. Alerting
[root@iZ prometheus]# more prometheus.yml
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - 127.0.0.1:9093
rule_files:
  - "rules/*.yml"
[root@iZ prometheus]# more rules/blackbox_exporter.yml
groups:
  - name: blackbox_network_stats
    rules:
      - alert: blackbox_network_stats
        expr: probe_success == 0
        for: 1m # fire if the value stays at 0 for 1 minute
        labels:
          severity: critical
        annotations:
          description: 'Website/endpoint {{ $labels.instance }} in job {{ $labels.job }} has been down for more than one minute.'
          summary: 'Website/endpoint {{ $labels.instance }} is down!'
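To make the timing of for: 1m concrete: with evaluation_interval set to 15s, the alert is pending from the first evaluation where probe_success == 0 and starts firing once the condition has held for 60 seconds, that is, at the fifth consecutive failing evaluation. The sketch below is a simplified model of that state machine for a single series. alert_states is a hypothetical helper and not part of Prometheus, and the input samples are made up.

```shell
# alert_states STEP HOLD
# STEP: evaluation interval in seconds; HOLD: the rule's "for" duration in seconds.
# stdin: one probe_success value (0 or 1) per rule evaluation.
# Prints the alert state after each evaluation.
alert_states() {
    awk -v step="$1" -v hold="$2" '
        $1 == 0 {
            # Condition is true: start the timer or advance it by one interval.
            if (!active) { active = 1; since = 0 } else { since += step }
            print (since >= hold ? "firing" : "pending")
            next
        }
        # Condition is false: the alert resets.
        { active = 0; print "inactive" }
    '
}

# One healthy sample, five failures, then recovery:
printf '%s\n' 1 0 0 0 0 0 1 | alert_states 15 60
```

With these inputs the model reports four pending evaluations before the alert fires, then returns to inactive when the probe recovers.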
References: https://www.centoscn.vip/8412.html
https://blog.csdn.net/qq_43190337/article/details/100577728
https://blog.csdn.net/qq_25934401/article/details/84325356