Introduction
This article presents a highly available monitoring solution built on a prometheus + keepalived + haproxy + m3db cluster.
It walks you step by step through building a highly available monitoring cluster based on Prometheus (hands-on focus; non-essential concepts are skipped).
You will end up with a monitoring cluster that has no single point of failure and can monitor physical hosts, OpenStack VMs, OpenStack services, MySQL, Memcached, RabbitMQ, and more. The monitoring data is stored remotely, so it is disaster-tolerant and aggregatable (data from multiple servers is merged into a single copy, making it more detailed and fully consistent across monitoring points).
As usual, here is the architecture diagram. I drew it myself, so bear with it; the pro2 node is configured the same as pro1 and was left out because the lines got too messy:
Architecture diagram
A quick explanation follows. Don't worry if it doesn't click yet; it will once you've built it.
Monitored layer
These are the monitored nodes. In production there will of course be more than one; each monitored node runs various exporters, which are simply data collectors for the different services.
Core monitoring layer
The Prometheus server scrapes data from the exporters on each monitored node (you can also have exporters push data actively through pushgateway).
haproxy + keepalived should be familiar to everyone: they combine multiple Prometheus servers into an active-standby cluster, giving it high availability.
m3coordinator is the component dedicated to reading and writing data against the M3DB cluster; we use it to write data into M3DB.
m3query is dedicated to querying M3DB. It can query aggregated data and is compatible with Prometheus's PromQL syntax, which is very convenient; we use it to read data from the M3DB cluster.
Data persistence layer
Prometheus itself stores the data it collects in its local tsdb time-series database, which cannot persist large volumes of data and cannot recover once the data is corrupted. That clearly doesn't meet our availability requirements, so I chose M3DB as the persistence database. At the very top is the M3DB cluster, a multi-active database that maintains cluster state through etcd; 3 etcd nodes is ideal, and M3DB requires at least 3 data nodes, with no upper limit.
Contents
Install Prometheus and pushgateway on pro01 and pro02
Install haproxy and keepalived on pro01 and pro02
Install the exporters on the controller node
Configure the kernel parameters m3db needs on m3db01, m3db02, m3db03
Install the M3DB cluster on m3db01, m3db02, m3db03 and start it
Initialize m3db and create the namespaces from any one m3db node
Install and start m3coordinator on pro01 and pro02
Install and start m3query on pro01 and pro02
Point the pro nodes' data source at the m3db cluster
Verify that everything works
Let's get started:
Preparation
6 × CentOS 7 (1708) machines:
172.27.124.66 m3db01
172.27.124.67 m3db02
172.27.124.68 m3db03
172.27.124.69 pro01
172.27.124.70 pro02
172.27.124.72 controller
1 × virtual IP for keepalived:
172.27.124.71
Packages (by the time you read this, some of these will likely have gone through several releases; update the versions yourself — these are just examples):
prometheus-2.12.0-rc.0.linux-amd64.tar.gz
Download: https://github.com/prometheus/prometheus/releases/tag/v2.12.0
pushgateway-0.9.1.linux-amd64.tar.gz
Download: https://github.com/prometheus/pushgateway/releases/tag/v0.9.1
collectd_exporter-0.4.0.linux-amd64.tar.gz
Download: https://github.com/prometheus/collectd_exporter/releases/tag/v0.4.0
node_exporter-0.18.1.linux-amd64.tar.gz
Download: https://github.com/prometheus/node_exporter/releases/tag/v0.18.1
mysqld_exporter-0.12.1.linux-amd64.tar.gz
Download: https://github.com/prometheus/mysqld_exporter/releases/tag/v0.12.1
rabbitmq_exporter-0.29.0.linux-amd64.tar.gz
Download: https://github.com/kbudde/rabbitmq_exporter/releases/tag/v0.29.0
openstack-exporter-0.5.0.linux-amd64.tar.gz
Download: https://github.com/openstack-exporter/openstack-exporter/releases/tag/v0.5.0
memcached_exporter-0.6.0.linux-amd64.tar.gz
Download: https://github.com/prometheus/memcached_exporter/releases/tag/v0.6.0
haproxy_exporter-0.10.0.linux-amd64.tar.gz
Download: https://github.com/prometheus/haproxy_exporter/releases/tag/v0.10.0
Note: these are far from the only exporters; there are many more. Browse the official list if you need others: https://prometheus.io/docs/instrumenting/exporters/
yajl-2.0.4-4.el7.x86_64.rpm
collectd-5.8.1-1.el7.x86_64.rpm
collectd-virt-5.8.1-1.el7.x86_64.rpm
Search for these three on http://rpmfind.net/linux/rpm2html/search.php
As for keepalived and haproxy, nothing to say — just install them with yum.
m3_0.13.0_linux_amd64.tar.gz
Download: https://github.com/m3db/m3/releases/tag/v0.13.0
If you'd rather not download them all, I've bundled them here: https://download.csdn.net/download/u014706515/11833017
Installation steps:
Copy all packages to /home/pkg on all six machines. Set each server's hostname and make sure there are no duplicates.
On every machine, do the usual three things: stop the firewall, disable SELinux, and install ntp and sync the time.
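On CentOS 7, the "usual three" amount to something like this (a sketch; commands assume a stock CentOS 7 install, adjust the NTP setup to your environment):

```shell
# Stop the firewall now and keep it off at boot
systemctl stop firewalld
systemctl disable firewalld
# Disable SELinux immediately and across reboots
setenforce 0
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
# Install ntp and keep the clocks in sync
yum -y install ntp
systemctl enable ntpd
systemctl start ntpd
```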
First, install the Prometheus and pushgateway services
Install Prometheus on pro01 and pro02
cd /home/
# Unpack the tarballs
tar -zxvf /home/pkg/prometheus-2.12.0-rc.0.linux-amd64.tar.gz -C /usr/local/
tar -zxvf /home/pkg/pushgateway-0.9.1.linux-amd64.tar.gz -C /usr/local/
# Rename
mv /usr/local/prometheus-2.12.0-rc.0.linux-amd64/ /usr/local/prometheus
mv /usr/local/pushgateway-0.9.1.linux-amd64/ /usr/local/pushgateway
# Create the prometheus user and set permissions
groupadd prometheus
useradd -g prometheus -s /sbin/nologin prometheus
mkdir -p /var/lib/prometheus-data
chown -R prometheus:prometheus /var/lib/prometheus-data
chown -R prometheus:prometheus /usr/local/prometheus/
chown -R prometheus:prometheus /usr/local/pushgateway
# Write the Prometheus config; note the inline comments and substitute values for your environment
cat > /usr/local/prometheus/prometheus.yml << EOF
global:
  scrape_interval: 20s
  scrape_timeout: 5s
  evaluation_interval: 10s
scrape_configs:
  - job_name: prometheus
    scrape_interval: 5s
    static_configs:
      - targets:
          - 172.27.124.69:9090 # address of this prometheus server; pro01 in this example
        labels:
          instance: prometheus
  - job_name: pushgateway
    scrape_interval: 5s
    static_configs:
      - targets:
          - 172.27.124.69:9091 # address of this prometheus server; pro01 in this example
        labels:
          instance: pushgateway
  - job_name: node_exporter
    scrape_interval: 10s
    static_configs:
      - targets:
          - 172.27.124.72:9100 # address of the monitored server running the exporter; controller in this example
        labels:
          instance: controller # name of the monitored server, to tell machines apart; hostname recommended
  - job_name: haproxy_exporter
    scrape_interval: 5s
    static_configs:
      - targets:
          - 172.27.124.72:9101 # address of the monitored server running the exporter; controller in this example
        labels:
          instance: controller # name of the monitored server, to tell machines apart; hostname recommended
  - job_name: rabbitmq_exporter
    scrape_interval: 5s
    static_configs:
      - targets:
          - 172.27.124.72:9102 # address of the monitored server running the exporter; controller in this example
        labels:
          instance: controller # name of the monitored server, to tell machines apart; hostname recommended
  - job_name: collectd_exporter
    scrape_interval: 5s
    static_configs:
      - targets:
          - 172.27.124.72:9103 # address of the monitored server running the exporter; controller in this example
        labels:
          instance: controller # name of the monitored server, to tell machines apart; hostname recommended
  - job_name: mysqld_exporter
    scrape_interval: 5s
    static_configs:
      - targets:
          - 172.27.124.72:9104 # address of the monitored server running the exporter; controller in this example
        labels:
          instance: controller # name of the monitored server, to tell machines apart; hostname recommended
  - job_name: memcached_exporter
    scrape_interval: 5s
    static_configs:
      - targets:
          - 172.27.124.72:9105 # address of the monitored server running the exporter; controller in this example
        labels:
          instance: controller # name of the monitored server, to tell machines apart; hostname recommended
EOF
# Next, register Prometheus as a system service so it can be managed with systemctl
cat > /usr/lib/systemd/system/prometheus.service << EOF
[Unit]
Description=Prometheus
Documentation=https://prometheus.io/
After=network.target
[Service]
Type=simple
User=prometheus
ExecStart=/usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml --storage.tsdb.path=/var/lib/prometheus-data
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
# Register the pushgateway service
cat > /usr/lib/systemd/system/pushgateway.service << EOF
[Unit]
Description=pushgateway
After=local-fs.target network-online.target
Requires=local-fs.target network-online.target
[Service]
Type=simple
ExecStart=/usr/local/pushgateway/pushgateway
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
# Enable at boot and start
systemctl daemon-reload
systemctl enable prometheus
systemctl enable pushgateway
systemctl start prometheus
systemctl start pushgateway
Install haproxy and keepalived on pro01 and pro02
# Plain yum install; both are mainstream packages available in the stock yum repos
yum -y install haproxy keepalived
# Configure haproxy; note the inline comments
echo "
global
    log 127.0.0.1 local2
    chroot /var/lib/haproxy
    pidfile /var/run/haproxy.pid
    maxconn 4000
    user haproxy
    group haproxy
    daemon
    stats socket /var/lib/haproxy/stats
defaults
    mode http
    log global
    option httplog
    option dontlognull
    option http-server-close
    option forwardfor except 127.0.0.0/8
    option redispatch
    retries 3
    timeout http-request 10s
    timeout queue 1m
    timeout connect 10s
    timeout client 1m
    timeout server 1m
    timeout http-keep-alive 10s
    timeout check 10s
    maxconn 3000
listen prometheus-server
    bind 172.27.124.71:9090 # the virtual VIP's address; pick any port as long as it doesn't conflict
    balance roundrobin
    option tcpka
    option httpchk
    option httplog
    server pro01 172.27.124.69:9090 check inter 2000 rise 2 fall 5 # pro01's address
    server pro02 172.27.124.70:9090 check inter 2000 rise 2 fall 5 # pro02's address
" > /etc/haproxy/haproxy.cfg
# Configure keepalived; note the inline comments
echo "
vrrp_script chk_pro_server_state {
    script \"/etc/prometheus/check_pro.sh\" # path to the health-check script, introduced below
    interval 5
    fall 2
    rise 2
}
vrrp_instance haproxy_vip {
    state MASTER
    interface enp0s3 # NIC that carries the VIP; if unsure, run 'ip a' and use the NIC holding this machine's IP
    virtual_router_id 71
    priority 100
    accept
    garp_master_refresh 5
    garp_master_refresh_repeat 2
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 123456
    }
    unicast_src_ip 172.27.124.69 # the master's IP; pro01 here
    unicast_peer {
        172.27.124.70 # list of backup server IPs; I only have one, pro02
    }
    virtual_ipaddress {
        172.27.124.71 # the virtual IP
    }
    track_script {
        chk_pro_server_state
    }
}
" > /etc/keepalived/keepalived.conf
# Add the health-check script; checking for a prometheus process is enough
# (write it with a quoted heredoc so the command substitution isn't expanded while writing the file)
mkdir -p /etc/prometheus
cat > /etc/prometheus/check_pro.sh << 'EOF'
#!/bin/bash
count=$(ps aux | grep -v grep | grep prometheus | wc -l)
if [ "$count" -gt 0 ]; then
    exit 0
else
    exit 1
fi
EOF
# Make the health-check script executable
chmod +x /etc/prometheus/check_pro.sh
# Enable at boot and start
systemctl enable haproxy
systemctl enable keepalived
systemctl start haproxy
systemctl start keepalived
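Once both nodes are up, it's worth a quick sanity check that the VIP landed on the master and that haproxy forwards to Prometheus (illustrative commands, run on pro01; /-/healthy is Prometheus 2.x's standard health endpoint):

```shell
# The VIP should be attached to the master's NIC
ip addr show enp0s3 | grep 172.27.124.71
# Prometheus should answer through the VIP via haproxy
curl -s http://172.27.124.71:9090/-/healthy
```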
Install the exporters on the controller node
# Unpack all the exporters under /usr/local
tar -zxvf /home/pkg/node_exporter-0.18.1.linux-amd64.tar.gz -C /usr/local/
tar -zxvf /home/pkg/collectd_exporter-0.4.0.linux-amd64.tar.gz -C /usr/local/
tar -zxvf /home/pkg/mysqld_exporter-0.12.1.linux-amd64.tar.gz -C /usr/local/
tar -zxvf /home/pkg/haproxy_exporter-0.10.0.linux-amd64.tar.gz -C /usr/local/
tar -zxvf /home/pkg/rabbitmq_exporter-0.29.0.linux-amd64.tar.gz -C /usr/local/
tar -zxvf /home/pkg/memcached_exporter-0.6.0.linux-amd64.tar.gz -C /usr/local/
# Rename
mv /usr/local/node_exporter-0.18.1.linux-amd64/ /usr/local/node_exporter
mv /usr/local/collectd_exporter-0.4.0.linux-amd64/ /usr/local/collectd_exporter
mv /usr/local/mysqld_exporter-0.12.1.linux-amd64/ /usr/local/mysqld_exporter
mv /usr/local/haproxy_exporter-0.10.0.linux-amd64/ /usr/local/haproxy_exporter
mv /usr/local/rabbitmq_exporter-0.29.0.linux-amd64/ /usr/local/rabbitmq_exporter
mv /usr/local/memcached_exporter-0.6.0.linux-amd64/ /usr/local/memcached_exporter
# Install the rpm packages collectd needs
cd /home
rpm -hiv /home/pkg/yajl-2.0.4-4.el7.x86_64.rpm
rpm -hiv /home/pkg/collectd-5.8.1-1.el7.x86_64.rpm
rpm -hiv /home/pkg/collectd-virt-5.8.1-1.el7.x86_64.rpm
# Create the prometheus user and set permissions
groupadd prometheus
useradd -g prometheus -s /sbin/nologin prometheus
chown -R prometheus:prometheus /usr/local/node_exporter
chown -R prometheus:prometheus /usr/local/collectd_exporter
chown -R prometheus:prometheus /usr/local/mysqld_exporter
chown -R prometheus:prometheus /usr/local/haproxy_exporter/
chown -R prometheus:prometheus /usr/local/rabbitmq_exporter/
chown -R prometheus:prometheus /usr/local/memcached_exporter/
# Add a guest user to MySQL (no MySQL? then what are you installing a mysql exporter for — go install MySQL first)
mysql -uroot -p{your mysql password} -e "grant replication client,process on *.* to guest@'%' identified by 'guest';"
mysql -uroot -p{your mysql password} -e "grant select on performance_schema.* to guest@'%';"
# Add the collectd config (for OpenStack VM monitoring); note the inline comments
cat >> /etc/collectd.conf << EOF
LoadPlugin cpu
LoadPlugin memory
LoadPlugin interface
LoadPlugin write_http
LoadPlugin virt
<Plugin cpu>
  ReportByCpu true
  ReportByState true
  ValuesPercentage true
</Plugin>
<Plugin memory>
  ValuesAbsolute true
  ValuesPercentage false
</Plugin>
<Plugin interface>
  Interface "enp0s3" # name of the physical NIC carrying the openstack public network
  IgnoreSelected false
</Plugin>
<Plugin write_http>
  <Node "collectd_exporter">
    URL "http://172.27.124.72:9103/collectd-post" # IP of the monitored node, controller here
    Format "JSON"
    StoreRates false
  </Node>
</Plugin>
<Plugin virt>
  Connection "qemu:///system"
  RefreshInterval 10
  HostnameFormat name
  PluginInstanceFormat name
  BlockDevice "/:hd[a-z]/"
  IgnoreSelected true
</Plugin>
EOF
# Add the mysql exporter credentials file
mkdir -p /etc/mysqld_exporter/conf
cat > /etc/mysqld_exporter/conf/.my.cnf << EOF
[client]
user=guest
password=guest
host=172.27.124.72 # your mysql address
port=3306 # port
EOF
# Add the rabbitmq exporter environment file; the user and password must already exist in rabbitmq — I'll skip showing that
cat > /etc/sysconfig/rabbitmq_exporter << EOF
RABBIT_USER="guest"
RABBIT_PASSWORD="guest"
OUTPUT_FORMAT="JSON"
PUBLISH_PORT="9102"
RABBIT_URL="http://172.27.124.72:15672"
EOF
# Register the node exporter service
cat > /usr/lib/systemd/system/node_exporter.service << EOF
[Unit]
Description=node_exporter
Documentation=https://prometheus.io/
After=network.target
[Service]
Type=simple
User=prometheus
ExecStart=/usr/local/node_exporter/node_exporter
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
# Register the haproxy exporter service
cat > /usr/lib/systemd/system/haproxy_exporter.service << EOF
[Unit]
Description=Prometheus HAproxy Exporter
After=network.target
[Service]
User=prometheus
Group=prometheus
Type=simple
Restart=always
ExecStart=/usr/local/haproxy_exporter/haproxy_exporter --haproxy.scrape-uri=http://admin:[email protected]:8080/haproxy?openstack;csv --web.listen-address=:9101
# The address above is the monitored haproxy, not the one on the pro nodes.
# Of course you can monitor the haproxy on pro01/pro02 too: install haproxy exporter on the pro
# nodes and register it in the Prometheus config.
[Install]
WantedBy=multi-user.target
EOF
# Register the rabbitmq exporter service
cat > /usr/lib/systemd/system/rabbitmq_exporter.service << EOF
[Unit]
Description=Prometheus RabbitMQ Exporter
After=network.target
[Service]
User=prometheus
Group=prometheus
Type=simple
Restart=always
EnvironmentFile=/etc/sysconfig/rabbitmq_exporter
ExecStart=/usr/local/rabbitmq_exporter/rabbitmq_exporter
[Install]
WantedBy=multi-user.target
EOF
# Register the collectd exporter service
cat > /usr/lib/systemd/system/collectd_exporter.service << EOF
[Unit]
Description=Collectd_exporter
After=network-online.target
Requires=network-online.target
[Service]
Type=simple
User=prometheus
ExecStart=/bin/bash -l -c /usr/local/collectd_exporter/collectd_exporter --web.listen-address=:9103 --web.collectd-push-path=/collectd-post
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
# Register the collectd service
cat > /usr/lib/systemd/system/collectd.service << EOF
[Unit]
Description=Collectd statistics daemon
Documentation=man:collectd(1) man:collectd.conf(5)
After=local-fs.target network-online.target
Requires=local-fs.target network-online.target
[Service]
ExecStart=/usr/sbin/collectd
Restart=on-failure
Type=notify
[Install]
WantedBy=multi-user.target
EOF
# Register the mysql exporter service
cat > /usr/lib/systemd/system/mysqld_exporter.service << EOF
[Unit]
Description=Prometheus MySQL Exporter
After=network.target
[Service]
User=prometheus
Group=prometheus
Type=simple
Restart=always
ExecStart=/usr/local/mysqld_exporter/mysqld_exporter \
--config.my-cnf=/etc/mysqld_exporter/conf/.my.cnf \
--web.listen-address=:9104 \
--collect.global_status \
--collect.info_schema.innodb_metrics \
--collect.auto_increment.columns \
--collect.info_schema.processlist \
--collect.binlog_size \
--collect.info_schema.tablestats \
--collect.global_variables \
--collect.info_schema.query_response_time \
--collect.info_schema.userstats \
--collect.info_schema.tables \
--collect.slave_status
[Install]
WantedBy=multi-user.target
EOF
# Register the memcached exporter service
cat > /usr/lib/systemd/system/memcached_exporter.service << EOF
[Unit]
Description=Prometheus Memcached Exporter
After=network.target
[Service]
User=prometheus
Group=prometheus
Type=simple
Restart=always
ExecStart=/usr/local/memcached_exporter/memcached_exporter --memcached.address=127.0.0.1:11211 --web.listen-address=:9105
[Install]
WantedBy=multi-user.target
EOF
# Enable at boot and start
systemctl daemon-reload
systemctl enable node_exporter
systemctl enable haproxy_exporter
systemctl enable rabbitmq_exporter
systemctl enable collectd
systemctl enable collectd_exporter
systemctl enable mysqld_exporter
systemctl enable memcached_exporter
systemctl start node_exporter
systemctl start haproxy_exporter
systemctl start rabbitmq_exporter
systemctl start collectd
systemctl start collectd_exporter
systemctl start mysqld_exporter
systemctl start memcached_exporter
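With everything running, each exporter should be serving metrics on its port; a quick loop like this (ports as registered in the prometheus.yml above) shows whether they all respond:

```shell
# Probe each exporter's /metrics endpoint on the controller; expect HTTP 200 for each
for port in 9100 9101 9102 9103 9104 9105; do
    code=$(curl -s -o /dev/null -w '%{http_code}' http://172.27.124.72:${port}/metrics)
    echo "port ${port}: HTTP ${code}"
done
```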
Configure the kernel parameters m3db needs on m3db01, m3db02, m3db03
sysctl -w vm.max_map_count=3000000
sysctl -w vm.swappiness=1
sysctl -n fs.file-max
sysctl -n fs.nr_open
sysctl -w fs.file-max=3000000
sysctl -w fs.nr_open=3000000
ulimit -n 3000000
# I'll leave making these permanent to you.
# You can also skip this, but the flood of warnings will be unbearable.
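To make them permanent, files along these lines work (the file names here are my own choice, not mandated):

```
# /etc/sysctl.d/99-m3db.conf  -  applied at boot, or immediately with 'sysctl --system'
vm.max_map_count=3000000
vm.swappiness=1
fs.file-max=3000000
fs.nr_open=3000000

# /etc/security/limits.d/99-m3db.conf  -  raises the open-file limit for new sessions
* soft nofile 3000000
* hard nofile 3000000
```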
Install the M3DB cluster on m3db01, m3db02, m3db03 and start it
# Unpack the tarball
cd /home
tar -zxvf /home/pkg/m3_0.13.0_linux_amd64.tar.gz
# Rename
mv /home/m3_0.13.0_linux_amd64 /home/m3db
# Create the config file /home/m3db/m3dbnode.yml with the following content
cat > /home/m3db/m3dbnode.yml << EOF
coordinator:
  listenAddress:
    type: "config"
    value: "0.0.0.0:7201"
  local:
    namespaces:
      - namespace: default # unaggregated namespace name; required, you'll get errors without it
        type: unaggregated
        retention: 48h
      - namespace: agg # aggregated namespace name; I just call it agg, pick whatever you like
        type: aggregated
        retention: 48h
        resolution: 10s
  logging:
    level: info
  metrics:
    scope:
      prefix: "coordinator"
    prometheus:
      handlerPath: /metrics
      listenAddress: 0.0.0.0:7203 # until https://github.com/m3db/m3/issues/682 is resolved
    sanitization: prometheus
    samplingRate: 1.0
    extended: none
  tagOptions:
    # Configuration setting for generating metric IDs from tags.
    idScheme: quoted
db:
  logging:
    level: info
  metrics:
    prometheus:
      handlerPath: /metrics
    sanitization: prometheus
    samplingRate: 1.0
    extended: detailed
  hostID:
    resolver: config
    value: m3db01 # this machine's host name
  config:
    service:
      env: default_env
      zone: embedded
      service: m3db
      cacheDir: /var/lib/m3kv
      etcdClusters:
        - zone: embedded
          endpoints: # etcd address list; m3db embeds etcd, so just list the m3db addresses
            - 172.27.124.66:2379
            - 172.27.124.67:2379
            - 172.27.124.68:2379
    seedNodes:
      initialCluster:
        - hostID: m3db01 # etcd host name; co-located with m3db here, so also m3db01 (same for the IPs below)
          endpoint: http://172.27.124.66:2380
        - hostID: m3db02
          endpoint: http://172.27.124.67:2380
        - hostID: m3db03
          endpoint: http://172.27.124.68:2380
  listenAddress: 0.0.0.0:9000
  clusterListenAddress: 0.0.0.0:9001
  httpNodeListenAddress: 0.0.0.0:9002
  httpClusterListenAddress: 0.0.0.0:9003
  debugListenAddress: 0.0.0.0:9004
  client:
    writeConsistencyLevel: majority
    readConsistencyLevel: unstrict_majority
  gcPercentage: 100
  writeNewSeriesAsync: true
  writeNewSeriesLimitPerSecond: 1048576
  writeNewSeriesBackoffDuration: 2ms
  bootstrap:
    bootstrappers:
      - filesystem
      - commitlog
      - peers
      - uninitialized_topology
    commitlog:
      returnUnfulfilledForCorruptCommitLogFiles: false
  cache:
    series:
      policy: lru
    postingsList:
      size: 262144
  commitlog:
    flushMaxBytes: 524288
    flushEvery: 1s
    queue:
      calculationType: fixed
      size: 2097152
  fs:
    filePathPrefix: /var/lib/m3db
EOF
# Start m3db once all three m3db nodes are configured.
# I'm lazy and just start it with nohup instead of registering a service; do that yourselves, it's easy anyway.
# Run this on all three nodes
nohup /home/m3db/m3dbnode -f /home/m3db/m3dbnode.yml &
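If you'd rather register it properly instead of using nohup, a minimal unit along these lines should do (a sketch; paths match the layout used above, and LimitNOFILE mirrors the ulimit we set earlier):

```
# /usr/lib/systemd/system/m3dbnode.service
[Unit]
Description=M3DB node
After=network-online.target
[Service]
Type=simple
ExecStart=/home/m3db/m3dbnode -f /home/m3db/m3dbnode.yml
Restart=on-failure
LimitNOFILE=3000000
[Install]
WantedBy=multi-user.target
```

Then systemctl daemon-reload / enable / start as with the other services.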
Initialize m3db and create the namespaces from any one m3db node
(a namespace means the same thing as a database; this is just creating databases)
# Install curl, no explanation needed
yum -y install curl
# Initialize the unaggregated placement (a placement is like a domain — basically a description of what the cluster looks like)
# Ignore the big pile of returned JSON; it just proves the call succeeded.
curl -X POST localhost:7201/api/v1/services/m3db/placement/init -d '{
"num_shards": "512",
"replication_factor": "3",
"instances": [
{
"id": "m3db01",
"isolation_group": "us-east1-a",
"zone": "embedded",
"weight": 100,
"endpoint": "172.27.124.66:9000",
"hostname": "172.27.124.66",
"port": 9000
},
{
"id": "m3db02",
"isolation_group": "us-east1-b",
"zone": "embedded",
"weight": 100,
"endpoint": "172.27.124.67:9000",
"hostname": "172.27.124.67",
"port": 9000
},
{
"id": "m3db03",
"isolation_group": "us-east1-c",
"zone": "embedded",
"weight": 100,
"endpoint": "172.27.124.68:9000",
"hostname": "172.27.124.68",
"port": 9000
}
]
}'
# Initialize the aggregated placement
curl -X POST localhost:7201/api/v1/services/m3aggregator/placement/init -d '{
"num_shards": "512",
"replication_factor": "3",
"instances": [
{
"id": "m3db01",
"isolation_group": "us-east1-a",
"zone": "embedded",
"weight": 100,
"endpoint": "172.27.124.66:9000",
"hostname": "172.27.124.66",
"port": 9000
},
{
"id": "m3db02",
"isolation_group": "us-east1-b",
"zone": "embedded",
"weight": 100,
"endpoint": "172.27.124.67:9000",
"hostname": "172.27.124.67",
"port": 9000
},
{
"id": "m3db03",
"isolation_group": "us-east1-c",
"zone": "embedded",
"weight": 100,
"endpoint": "172.27.124.68:9000",
"hostname": "172.27.124.68",
"port": 9000
}
]
}'
# Create the default unaggregated namespace
curl -X POST http://localhost:7201/api/v1/database/create -d '{
"type": "cluster",
"namespaceName": "default",
"retentionTime": "48h",
"numShards": "512",
"replicationFactor": "3"
}'
# Create the agg aggregated namespace
curl -X POST http://localhost:7201/api/v1/database/create -d '{
"type": "cluster",
"namespaceName": "agg",
"retentionTime": "48h",
"numShards": "512",
"replicationFactor": "3"
}'
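To confirm the placement and namespaces actually exist, the coordinator also exposes read endpoints on the same port 7201; on this era of m3 something like the following should list them (verify the exact paths against your release):

```shell
# Show the m3db placement; all three instances should be listed with their shards
curl -s localhost:7201/api/v1/services/m3db/placement
# Show the registered namespaces; default and agg should both appear
curl -s localhost:7201/api/v1/namespace
```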
Install and start m3coordinator on pro01 and pro02
# Unpack the tarball
cd /home
tar -zxvf /home/pkg/m3_0.13.0_linux_amd64.tar.gz
# Rename
mv /home/m3_0.13.0_linux_amd64 /home/m3db
# Create the config file /home/m3db/m3coordinator.yml with the following content
cat > /home/m3db/m3coordinator.yml << EOF
listenAddress:
  type: "config"
  value: "0.0.0.0:8201"
  # m3coordinator can live on the same host as an m3db node, so don't use port 7201 here or they will conflict
logging:
  level: info
metrics:
  scope:
    prefix: "coordinator"
  prometheus:
    handlerPath: /metrics
    listenAddress: 0.0.0.0:8203 # likewise, don't use 7203
  sanitization: prometheus
  samplingRate: 1.0
  extended: none
tagOptions:
  idScheme: quoted
clusters:
  - namespaces:
      - namespace: default # unaggregated namespace name; required, you'll get errors without it
        type: unaggregated
        retention: 48h
      - namespace: agg # aggregated namespace name; I just call it agg, pick whatever you like
        type: aggregated
        retention: 48h
        resolution: 10s
    client:
      config:
        service:
          env: default_env
          zone: embedded
          service: m3db
          cacheDir: /var/lib/m3kv
          etcdClusters:
            - zone: embedded
              endpoints: # etcd address list; m3db embeds etcd, so just list the m3db addresses
                - 172.27.124.66:2379
                - 172.27.124.67:2379
                - 172.27.124.68:2379
      writeConsistencyLevel: majority
      readConsistencyLevel: unstrict_majority
EOF
# Lazy again — starting with nohup; don't copy me, go register a proper system service
nohup /home/m3db/m3coordinator -f /home/m3db/m3coordinator.yml &
Install and start m3query on pro01 and pro02
# Create the config file /home/m3db/m3query.yml with the following content
cat > /home/m3db/m3query.yml << EOF
listenAddress:
  type: "config"
  value: "0.0.0.0:5201"
  # m3query can live on the same host as an m3db node, so don't use port 7201 here or they will conflict
logging:
  level: info
metrics:
  scope:
    prefix: "coordinator"
  prometheus:
    handlerPath: /metrics
    listenAddress: 0.0.0.0:5203 # likewise, don't use 7203
  sanitization: prometheus
  samplingRate: 1.0
  extended: none
tagOptions:
  idScheme: quoted
clusters:
  - namespaces:
      - namespace: default # unaggregated namespace name; required, you'll get errors without it
        type: unaggregated
        retention: 48h
      - namespace: agg # aggregated namespace name; pick what you like, but it must match what was created earlier
        type: aggregated
        retention: 48h
        resolution: 10s
    client:
      config:
        service:
          env: default_env
          zone: embedded
          service: m3db
          cacheDir: /var/lib/m3kv
          etcdClusters:
            - zone: embedded
              endpoints: # etcd address list; m3db embeds etcd, so just list the m3db addresses
                - 172.27.124.66:2379
                - 172.27.124.67:2379
                - 172.27.124.68:2379
      writeConsistencyLevel: majority
      readConsistencyLevel: unstrict_majority
      writeTimeout: 10s
      fetchTimeout: 15s
      connectTimeout: 20s
      writeRetry:
        initialBackoff: 500ms
        backoffFactor: 3
        maxRetries: 2
        jitter: true
      fetchRetry:
        initialBackoff: 500ms
        backoffFactor: 2
        maxRetries: 3
        jitter: true
      backgroundHealthCheckFailLimit: 4
      backgroundHealthCheckFailThrottleFactor: 0.5
EOF
# Lazy again — starting with nohup; don't copy me, go register a proper system service
nohup /home/m3db/m3query -f /home/m3db/m3query.yml &
Point the pro nodes' data source at the m3db cluster
# It only takes appending a few lines to the config on pro01 and pro02
echo "
remote_read:
  - url: \"http://localhost:5201/api/v1/prom/remote/read\"
    # the read address — use m3query's port
    read_recent: true
remote_write:
  - url: \"http://localhost:8201/api/v1/prom/remote/write\"
    # the write address — use m3coordinator's port
" >> /usr/local/prometheus/prometheus.yml
# Restart the prometheus service and be patient — m3db takes quite a while to initialize the databases.
# You can watch the m3coordinator and m3query logs to see whether remote reads and writes have started.
# Run on both pro nodes
systemctl restart prometheus
Verify that everything works
Verification is simple: after the cluster has run for a while, query data through m3query. If you get data back, it was indeed written into the m3db cluster, and m3query has retrieved the aggregated data.
# Just open this URL in a browser
# Adjust the timestamps and step to your environment, and pick your own query key; this is only an example.
# The IP can be either pro01's or pro02's, but the port must be m3query's, otherwise PromQL isn't supported
http://172.27.124.69:5201/api/v1/query_range?query=mysql_up&start=1570084589.681&end=1570088189.681&step=14&_=1570087504176
Try the high-availability checks yourself: stop one m3 node and see what happens, then stop either prometheus node and check whether monitoring data can still be fetched through our virtual IP.
In my environment, checking http://172.27.124.71:9090/targets is all it takes.
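As a concrete failover drill, something like this (run the stop on whichever pro node currently holds the VIP; the keepalived check script then fails and the VIP should move):

```shell
# On the current master: stop prometheus so the health check fails
systemctl stop prometheus
# From anywhere: the VIP should keep serving targets, now from the other pro node
curl -s http://172.27.124.71:9090/targets
# Bring the stopped node back afterwards
systemctl start prometheus
```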