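The previous post set up a simple health-metrics check. This post continues the integration.
Downloading the required tools
Tools needed to build the monitoring platform:
grafana: dashboards for the monitoring data;
Official download page: https://grafana.com/
prometheus: collects the monitoring data;
Official download page: https://prometheus.io/download/#prometheus
node_exporter: exports host metrics;
Official download page: https://prometheus.io/download/#node_exporter
consul: service discovery;
Official download page: https://www.consul.io/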
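Before uploading the packages it can be worth checking them against the SHA-256 checksum shown on the download page, where one is published. The following is a minimal sketch; the function name verify_sha256 is hypothetical and it assumes the sha256sum utility is available on the machine.

```shell
#!/bin/bash
# verify_sha256 FILE EXPECTED_HEX
# Returns 0 when the SHA-256 digest of FILE equals EXPECTED_HEX, 1 otherwise.
verify_sha256() {
    local file="$1" expected="$2" actual
    [ -f "$file" ] || { echo "missing file: $file" >&2; return 1; }
    # sha256sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(sha256sum "$file" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file" >&2
        return 1
    fi
}
```

It would be called with the package name and the checksum copied by hand from the download page, for example verify_sha256 prometheus-2.8.1.linux-amd64.tar.gz followed by that checksum.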
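Setting up the monitoring platform on the server
Writing the install and uninstall scripts
To make it easy to migrate to another environment, or for someone else to use, I packaged everything as a one-click install, one-click start, and one-click uninstall.
Upload the downloaded tools to a dedicated directory on the server. I use /data/monitor here, which keeps things easy to manage.
Under it there are two directories: install and exporter-install.
1: Upload the grafana, prometheus, and consul packages to the install directory and write the install script there;
Script name: install-monitor.sh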
#!/bin/bash
installdir="/data/monitor"
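# Resolve the directory this script lives in, even when it is started as "bash install-monitor.sh"
toolsdir=$(cd "$(dirname "$0")" && pwd)
echo "$toolsdir"
cd "$toolsdir"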
echo "======================unpackaging monitor app to $installdir/app ========================="
mkdir -p $installdir/app
rm -r -f $installdir/app/*
mkdir -p $installdir/app/grafana
mkdir -p $installdir/app/prometheus
mkdir -p $installdir/app/consul
tar -zxvf grafana-6.1.3.linux-amd64.tar.gz -C $installdir/app/grafana
mv $installdir/app/grafana/grafana*/* $installdir/app/grafana
rmdir $installdir/app/grafana/grafana*
tar -zxvf prometheus-2.8.1.linux-amd64.tar.gz -C $installdir/app/prometheus
mv $installdir/app/prometheus/prometheus*/* $installdir/app/prometheus
rmdir $installdir/app/prometheus/prometheus-*
unzip consul_1.4.4_linux_amd64.zip -d $installdir/app/consul
mkdir -p $installdir/data
rm -r -f $installdir/data/*
mkdir -p $installdir/data/consul
mkdir -p $installdir/data/prometheus
mkdir -p $installdir/data/grafana
mkdir -p $installdir/cfg
rm -r -f $installdir/cfg/*
mkdir -p $installdir/cfg/consul
mkdir -p $installdir/cfg/prometheus
mkdir -p $installdir/cfg/grafana
mkdir -p $installdir/log
mkdir -p $installdir/bin
echo "==============================unpackage monitor app success ============================="
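The unpack, mv, and rmdir sequence used for grafana and prometheus above could also be done in one step. This is a minimal sketch, assuming a tar that supports --strip-components (GNU tar and bsdtar both do); the helper name extract_flat is hypothetical.

```shell
#!/bin/bash
# extract_flat ARCHIVE DEST
# Unpack a .tar.gz whose contents sit under one top-level directory,
# dropping that directory so the files land directly in DEST.
extract_flat() {
    local archive="$1" dest="$2"
    mkdir -p "$dest" || return 1
    tar -zxf "$archive" -C "$dest" --strip-components=1
}
```

Unlike mv with a * glob, this also carries over hidden files from the top-level directory, and it leaves no empty versioned directory behind to remove.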
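2: While we are at it, write the uninstall script too. Everything must be stopped before uninstalling; the start and stop scripts are written further below;
Script name: uninstall-monitor.sh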
#!/bin/bash
installdir="/data/monitor"
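# Resolve the directory this script lives in, even when it is started as "bash uninstall-monitor.sh"
toolsdir=$(cd "$(dirname "$0")" && pwd)
echo "$toolsdir"
cd "$toolsdir"
echo "==========================uninstall app from $installdir/app=============================="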
rm -r -f $installdir/app
rm -r -f $installdir/cfg
rm -r -f $installdir/data
rm -r -f $installdir/log
echo "==========================uninstall app success=============================="
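Since the components have to be stopped before uninstalling, the uninstall script could refuse to run while any of them is still alive. The sketch below assumes the pid files that the start scripts in this post write under /data/monitor/log; the function name any_running is hypothetical.

```shell
#!/bin/bash
# any_running LOGDIR
# Returns 0 (and prints the offending pid files) if some *.pid file in
# LOGDIR refers to a live process, 1 if nothing appears to be running.
any_running() {
    local logdir="$1" pidfile pid found=1
    for pidfile in "$logdir"/*.pid; do
        [ -f "$pidfile" ] || continue      # the glob matched nothing
        pid=$(cat "$pidfile")
        # kill -0 sends no signal; it only tests that the pid can be signalled.
        if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
            echo "still running: $pidfile (pid $pid)"
            found=0
        fi
    done
    return $found
}
```

In uninstall-monitor.sh this might be used ahead of the rm commands, for instance by exiting with a "stop the services first" message whenever any_running "$installdir/log" succeeds.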
3: Go back up one directory and, in the exporter-install directory, write the node_exporter install and uninstall scripts:
Script name: install-node-exporter.sh
#!/bin/bash
installdir="/data/monitor"
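# Resolve the directory this script lives in, even when it is started as "bash install-node-exporter.sh"
toolsdir=$(cd "$(dirname "$0")" && pwd)
echo "$toolsdir"
cd "$toolsdir"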
echo "======================unpackaging monitor exporter to $installdir/exporter ========================="
mkdir -p $installdir/exporter
mkdir -p $installdir/exporter/node_exporter
rm -r -f $installdir/exporter/node_exporter/*
mkdir -p $installdir/log
mkdir -p $installdir/bin
tar -zxvf node_exporter-0.17.0.linux-amd64.tar.gz -C $installdir/exporter/node_exporter
mv $installdir/exporter/node_exporter/node_exporter*/* $installdir/exporter/node_exporter
rmdir $installdir/exporter/node_exporter/node_exporter-*
echo "==============================unpackage monitor exporter success ============================="
4: exporter uninstall script:
Script name: uninstall-node-exporter.sh
#!/bin/bash
installdir="/data/monitor"
shpath=$0
toolsdir=${shpath%/*}
echo "$toolsdir"
cd $toolsdir
echo "======================uninstall monitor exporter to $installdir/exporter ========================="
rm -r -f $installdir/exporter/node_exporter
echo "==============================unpackage monitor exporter success ============================="
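Configuration files
Copy each tool's configuration file from its package into the matching directory under /data/monitor/cfg/, leaving the original files unchanged;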
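The copy step could be scripted so that an already edited configuration is never overwritten. This is a minimal sketch; the helper name copy_config is hypothetical, and the source paths in the comments assume the layout of the stock tarballs (conf/sample.ini for grafana, prometheus.yml for prometheus), so adjust them if your packages differ.

```shell
#!/bin/bash
# copy_config SRC DST
# Copy SRC to DST unless DST already exists, so local edits are kept
# and the packaged original stays untouched.
copy_config() {
    local src="$1" dst="$2"
    [ -f "$src" ] || { echo "missing source: $src" >&2; return 1; }
    if [ -e "$dst" ]; then
        echo "keep existing: $dst"
        return 0
    fi
    mkdir -p "$(dirname "$dst")" && cp "$src" "$dst"
}

# Assumed usage with the directories from this post:
#   copy_config /data/monitor/app/grafana/conf/sample.ini /data/monitor/cfg/grafana/custom.ini
#   copy_config /data/monitor/app/prometheus/prometheus.yml /data/monitor/cfg/prometheus/prometheus.yml
```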
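1: grafana configuration file: the main thing to change here is the host. The default port is 3000 and I leave it as is for now. If you need to change it, see:
https://grafana.com/docs/installation/configuration/
2: prometheus configuration file: configure the jobs to be monitored. Reference: https://prometheus.io/docs/prometheus/latest/configuration/configuration/
Core configuration: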
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:7002']
  # Metrics of the server itself
  - job_name: 'node_server'
    static_configs:
      - targets: ['localhost:9100']
  # Get the scrape addresses from the consul registry
  - job_name: 'metrics'
    metrics_path: /monitor/actuator/prometheus
    consul_sd_configs:
      - server: localhost:7001
        tag: metrics
    relabel_configs:
      - source_labels: ["__meta_consul_service"]
        regex: "(.*)"
        replacement: $1
        action: replace
        target_label: "service"
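The metrics job above only scrapes services that consul lists with the tag metrics, and the relabel rule copies the consul service name into a service label. As an illustration of what such a registration might look like, here is a sketch that prints a registration body for a hypothetical service. The service name demo-app, its address, and its port are assumptions; ID, Name, Tags, Address, and Port are fields of consul's agent service registration API.

```shell
#!/bin/bash
# service_payload NAME ADDRESS PORT
# Print a consul service registration body that carries the "metrics" tag,
# so the 'metrics' job in prometheus.yml would pick the service up.
service_payload() {
    local name="$1" address="$2" port="$3"
    cat <<EOF
{
  "ID": "${name}-${port}",
  "Name": "${name}",
  "Tags": ["metrics"],
  "Address": "${address}",
  "Port": ${port}
}
EOF
}

# Hypothetical usage against the agent configured in this post (HTTP port 7001):
#   service_payload demo-app 127.0.0.1 8080 > /tmp/demo-app.json
#   curl -X PUT --data @/tmp/demo-app.json http://localhost:7001/v1/agent/service/register
```

With a registration like this, prometheus would scrape 127.0.0.1:8080 at the path /monitor/actuator/prometheus and attach service="demo-app" to the resulting series.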
3: consul configuration file: only the main settings are configured here. For a description of each key, see the official documentation:
https://www.consul.io/docs/agent/options.html
{
  "bootstrap_expect" : 1,
  "data_dir" : "/data/monitor/data/consul",
  "pid_file" : "/data/monitor/data/consul/consul.pid",
  "node_name" : "agent-one",
  "bind_addr" : "127.0.0.1",
  "client_addr" : "0.0.0.0",
  "ports" : {
    "http" : 7001
  },
  "ui" : true
}
That completes the configuration files.
One-click start and stop scripts
1: grafana start script
#!/bin/bash
monitorpath="/data/monitor"
cd ${monitorpath}/app/grafana
echo "start grafana at : `date` " >${monitorpath}/log/grafana_runtime.log
nohup ${monitorpath}/app/grafana/bin/grafana-server --config ${monitorpath}/cfg/grafana/custom.ini > ${monitorpath}/log/grafana.log 2>&1 &
echo $! > ${monitorpath}/log/grafana.pid
2: grafana stop script
#!/bin/bash
echo "stop program at : `date` " > /data/monitor/log/grafana_runtime.log
if [ -f "/data/monitor/log/grafana.pid" ]; then
    kill -9 `cat /data/monitor/log/grafana.pid`
    rm /data/monitor/log/grafana.pid
else
    echo "file grafana.pid not exist"
fi
3: prometheus start script
#!/bin/bash
monitorpath="/data/monitor"
cd ${monitorpath}/app/prometheus
echo "start prometheus at : `date` " > ${monitorpath}/log/prometheus_runtime.log
nohup ${monitorpath}/app/prometheus/prometheus --web.listen-address=:7002 \
--config.file=${monitorpath}/cfg/prometheus/prometheus.yml \
--web.read-timeout=5m \
--web.enable-admin-api \
--web.max-connections=10 \
--query.timeout=2m \
--query.max-concurrency=20 \
--storage.tsdb.path=${monitorpath}/data/prometheus/ > ${monitorpath}/log/prometheus.log 2>&1 &
echo $! > ${monitorpath}/log/prometheus.pid
4: prometheus stop script
#!/bin/bash
echo "stop prometheus at : `date` " > /data/monitor/log/prometheus_runtime.log
if [ -f "/data/monitor/log/prometheus.pid" ]; then
    kill -9 `cat /data/monitor/log/prometheus.pid`
    rm /data/monitor/log/prometheus.pid
else
    echo "file prometheus.pid not exist"
fi
5: node_exporter start script
#!/bin/bash
monitorpath="/data/monitor"
cd ${monitorpath}/exporter/node_exporter
echo "start node_exporter at : `date` " > ${monitorpath}/log/node_exporter_runtime.log
nohup ${monitorpath}/exporter/node_exporter/node_exporter > ${monitorpath}/log/node_exporter.log 2>&1 &
echo $! > ${monitorpath}/log/node_exporter.pid
6: node_exporter stop script
#!/bin/bash
echo "stop node_exporter at : `date` " > /data/monitor/log/node_exporter_runtime.log
if [ -f "/data/monitor/log/node_exporter.pid" ]; then
    kill -9 `cat /data/monitor/log/node_exporter.pid`
    rm /data/monitor/log/node_exporter.pid
else
    echo "file node_exporter.pid not exist"
fi
7: consul start script
#!/bin/bash
monitorpath="/data/monitor"
cd ${monitorpath}/app/consul
echo "start consul at : `date` " > ${monitorpath}/log/consul_runtime.log
nohup ${monitorpath}/app/consul/consul agent -server -config-dir="${monitorpath}/cfg/consul" > ${monitorpath}/log/consul.log 2>&1 &
echo $! > ${monitorpath}/log/consul.pid
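The four start scripts share one pattern: write a runtime log line, launch the program with nohup, and record the pid. Running one of them twice would start a second copy and overwrite the pid file. The sketch below shows one way to guard against that; the function name start_once is hypothetical and the file naming follows the scripts above.

```shell
#!/bin/bash
# start_once NAME LOGDIR COMMAND [ARGS...]
# Start COMMAND in the background unless LOGDIR/NAME.pid already points
# at a live process; write the new pid to that file.
start_once() {
    local name="$1" logdir="$2" pidfile
    shift 2
    pidfile="$logdir/$name.pid"
    if [ -f "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null; then
        echo "$name already running (pid $(cat "$pidfile"))"
        return 0
    fi
    echo "start $name at : $(date)" > "$logdir/${name}_runtime.log"
    nohup "$@" > "$logdir/$name.log" 2>&1 &
    echo $! > "$pidfile"
}
```

For consul this might be called as start_once consul /data/monitor/log /data/monitor/app/consul/consul agent -server -config-dir=/data/monitor/cfg/consul.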
8: consul stop script
#!/bin/bash
echo "stop consul at: `date` " > /data/monitor/log/consul_runtime.log
if [ -f "/data/monitor/log/consul.pid" ]; then
    kill -9 `cat /data/monitor/log/consul.pid`
    rm /data/monitor/log/consul.pid
else
    echo "file consul.pid not exist"
fi
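All four stop scripts use kill -9. SIGKILL cannot be caught, so the process gets no chance to shut down cleanly. A gentler variant, sketched here under the assumption that the programs exit on SIGTERM, sends SIGTERM first and falls back to SIGKILL only after a timeout. The function name stop_by_pidfile and the 10 second default are hypothetical choices.

```shell
#!/bin/bash
# stop_by_pidfile PIDFILE [TIMEOUT_SECONDS]
# Send SIGTERM, wait up to TIMEOUT_SECONDS (default 10) for the process
# to exit, and fall back to SIGKILL only if it is still alive.
stop_by_pidfile() {
    local pidfile="$1" timeout="${2:-10}" pid waited=0
    [ -f "$pidfile" ] || { echo "file $pidfile not exist"; return 1; }
    pid=$(cat "$pidfile")
    kill "$pid" 2>/dev/null    # SIGTERM: ask the process to shut down
    while kill -0 "$pid" 2>/dev/null && [ "$waited" -lt "$timeout" ]; do
        sleep 1
        waited=$((waited + 1))
    done
    if kill -0 "$pid" 2>/dev/null; then
        echo "pid $pid did not exit after SIGTERM, sending SIGKILL"
        kill -9 "$pid" 2>/dev/null
    fi
    rm -f "$pidfile"
}
```

Each stop script would then reduce to one call, for example stop_by_pidfile /data/monitor/log/consul.pid.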
9: One-click start script
#!/bin/bash
cd `dirname $0`
# start node_exporter
echo "start node_exporter"
./start_node_exporter.sh
#start consul
echo "start consul"
./start_consul.sh
#start prometheus
echo "start prometheus"
./start_prometheus.sh
#start grafana
echo "start grafana"
./start_grafana.sh
echo "see logs at /data/monitor/log"
10: One-click stop script
#!/bin/bash
cd `dirname $0`
# stop grafana
echo "stop grafana"
./stop_grafana.sh
# stop prometheus
echo "stop prometheus"
./stop_prometheus.sh
# stop consul
echo "stop consul"
./stop_consul.sh
# stop node_exporter
echo "stop node_exporter"
./stop_node_exporter.sh
echo "see logs at /data/monitor/log"
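The one-click scripts call each component script in a fixed order and do not notice when one is missing or fails. A loop can keep the order while reporting problems. This is a sketch only; the function name run_in_order is hypothetical.

```shell
#!/bin/bash
# run_in_order DIR SCRIPT...
# Run each SCRIPT from DIR in the order given and report the ones that
# are missing or exit non-zero. Returns the number of failures.
run_in_order() {
    local dir="$1" script failures=0
    shift
    for script in "$@"; do
        echo "run $script"
        if [ ! -x "$dir/$script" ]; then
            echo "  not executable or missing: $dir/$script" >&2
            failures=$((failures + 1))
        elif ! "$dir/$script"; then
            echo "  failed: $script" >&2
            failures=$((failures + 1))
        fi
    done
    return "$failures"
}
```

The one-click start would pass start_node_exporter.sh, start_consul.sh, start_prometheus.sh, and start_grafana.sh, and the one-click stop the four stop scripts in reverse order. Because the start scripts launch their programs in the background, a zero exit status only means the launch was attempted, not that the program stayed up.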
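All the basic scripts are now written. Go to the /data/monitor/bin directory, run the one-click start script, and check that everything is running:
ps -ef | grep consul
ps -ef | grep prometheus
ps -ef | grep grafana
ps -ef | grep node_export
You can also fetch the exported metrics directly: curl localhost:9100/metrics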
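Instead of reading four ps listings, the pid files written by the start scripts can be checked in one pass. A minimal sketch follows; the function name component_status is hypothetical, and it relies on the NAME.pid files under /data/monitor/log used throughout this post.

```shell
#!/bin/bash
# component_status LOGDIR NAME...
# For each NAME, print "NAME up (pid N)" or "NAME down" based on
# LOGDIR/NAME.pid. Returns 0 only when every component is up.
component_status() {
    local logdir="$1" name pid rc=0
    shift
    for name in "$@"; do
        pid=$(cat "$logdir/$name.pid" 2>/dev/null)
        if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
            echo "$name up (pid $pid)"
        else
            echo "$name down"
            rc=1
        fi
    done
    return $rc
}
```

A typical call would be component_status /data/monitor/log node_exporter consul prometheus grafana.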
To keep this post to a reasonable length, the steps that follow are covered in the next post.