Introduction
This article presents a highly available monitoring solution built on a prometheus + keepalived + haproxy + m3db cluster.
It walks you step by step through building a highly available monitoring cluster based on Prometheus (hands-on focus; non-essential concepts are skipped).
You will end up with a monitoring cluster that has no single point of failure and can monitor physical hosts, OpenStack VMs, OpenStack services, MySQL, Memcached, RabbitMQ, and more. The monitoring data is stored remotely, so it is disaster-tolerant and aggregatable (data from multiple servers is merged into a single copy, making it more detailed and fully consistent across monitoring points).
As usual, here is the architecture diagram. I drew it myself, so bear with it; the pro2 node is configured the same as pro1 and was left out because the lines got too messy:
Architecture diagram
A quick explanation follows. Don't worry if it doesn't click yet; it will once you've built it.
Monitored layer
These are the monitored nodes. In production there will of course be more than one; each monitored node runs various exporters, which are simply data collectors for the different services.
Core monitoring layer
The Prometheus server scrapes data from the exporters on each monitored node (you can also have exporters push data actively through pushgateway).
haproxy + keepalived should be familiar to everyone: they combine multiple Prometheus servers into an active-standby cluster, giving it high availability.
m3coordinator is the component dedicated to reading and writing data against the M3DB cluster; we use it to write data into M3DB.
m3query is dedicated to querying M3DB. It can query aggregated data and is compatible with Prometheus's PromQL syntax, which is very convenient; we use it to read data from the M3DB cluster.
Data persistence layer
Prometheus itself stores the data it collects in its local tsdb time-series database, which cannot persist large volumes of data and cannot recover once the data is corrupted. That clearly doesn't meet our availability requirements, so I chose M3DB as the persistence database. At the very top is the M3DB cluster, a multi-active database that maintains cluster state through etcd; 3 etcd nodes is ideal, and M3DB requires at least 3 data nodes, with no upper limit.
Contents
Install Prometheus and pushgateway on pro01 and pro02
Install haproxy and keepalived on pro01 and pro02
Install the exporters on the controller node
Configure the kernel parameters m3db needs on m3db01, m3db02, m3db03
Install the M3DB cluster on m3db01, m3db02, m3db03 and start it
Initialize m3db and create the namespaces from any one m3db node
Install and start m3coordinator on pro01 and pro02
Install and start m3query on pro01 and pro02
Point the pro nodes' data source at the m3db cluster
Verify that everything works
Let's get started:
Preparation
6 × CentOS 7 (1708) machines:
172.27.124.66 m3db01
172.27.124.67 m3db02
172.27.124.68 m3db03
172.27.124.69 pro01
172.27.124.70 pro02
172.27.124.72 controller
1 × virtual IP for keepalived:
172.27.124.71
Packages (by the time you read this, some of these will likely have gone through several releases; update the versions yourself — these are just examples):
prometheus-2.12.0-rc.0.linux-amd64.tar.gz
Download: https://github.com/prometheus/prometheus/releases/tag/v2.12.0
pushgateway-0.9.1.linux-amd64.tar.gz
Download: https://github.com/prometheus/pushgateway/releases/tag/v0.9.1
collectd_exporter-0.4.0.linux-amd64.tar.gz
Download: https://github.com/prometheus/collectd_exporter/releases/tag/v0.4.0
node_exporter-0.18.1.linux-amd64.tar.gz
Download: https://github.com/prometheus/node_exporter/releases/tag/v0.18.1
mysqld_exporter-0.12.1.linux-amd64.tar.gz
Download: https://github.com/prometheus/mysqld_exporter/releases/tag/v0.12.1
rabbitmq_exporter-0.29.0.linux-amd64.tar.gz
Download: https://github.com/kbudde/rabbitmq_exporter/releases/tag/v0.29.0
openstack-exporter-0.5.0.linux-amd64.tar.gz
Download: https://github.com/openstack-exporter/openstack-exporter/releases/tag/v0.5.0
memcached_exporter-0.6.0.linux-amd64.tar.gz
Download: https://github.com/prometheus/memcached_exporter/releases/tag/v0.6.0
haproxy_exporter-0.10.0.linux-amd64.tar.gz
Download: https://github.com/prometheus/haproxy_exporter/releases/tag/v0.10.0
Note: these are far from the only exporters; there are many more. Browse the official list if you need others: https://prometheus.io/docs/instrumenting/exporters/
yajl-2.0.4-4.el7.x86_64.rpm
collectd-5.8.1-1.el7.x86_64.rpm
collectd-virt-5.8.1-1.el7.x86_64.rpm
Search for these three on http://rpmfind.net/linux/rpm2html/search.php
As for keepalived and haproxy, nothing to say — just install them with yum.
m3_0.13.0_linux_amd64.tar.gz
Download: https://github.com/m3db/m3/releases/tag/v0.13.0
If you'd rather not download them all, I've bundled them here: https://download.csdn.net/download/u014706515/11833017
Installation steps:
Copy all packages to /home/pkg on all six machines. Set each server's hostname and make sure there are no duplicates.
On every machine, do the usual three things: stop the firewall, disable SELinux, and install ntp and sync the time.
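On CentOS 7, the "usual three" amount to something like this (a sketch; commands assume a stock CentOS 7 install, adjust the NTP setup to your environment):

```shell
# Stop the firewall now and keep it off at boot
systemctl stop firewalld
systemctl disable firewalld
# Disable SELinux immediately and across reboots
setenforce 0
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
# Install ntp and keep the clocks in sync
yum -y install ntp
systemctl enable ntpd
systemctl start ntpd
```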
First, install the Prometheus and pushgateway services
Install Prometheus on pro01 and pro02
cd /home/
# Unpack the tarballs
tar -zxvf /home/pkg/prometheus-2.12.0-rc.0.linux-amd64.tar.gz -C /usr/local/
tar -zxvf /home/pkg/pushgateway-0.9.1.linux-amd64.tar.gz -C /usr/local/
# Rename
mv /usr/local/prometheus-2.12.0-rc.0.linux-amd64/ /usr/local/prometheus
mv /usr/local/pushgateway-0.9.1.linux-amd64/ /usr/local/pushgateway
# Create the prometheus user and set permissions
groupadd prometheus
useradd -g prometheus -s /sbin/nologin prometheus
mkdir -p /var/lib/prometheus-data
chown -R prometheus:prometheus /var/lib/prometheus-data
chown -R prometheus:prometheus /usr/local/prometheus/
chown -R prometheus:prometheus /usr/local/pushgateway
# Write the Prometheus config; note the inline comments and substitute values for your environment
cat > /usr/local/prometheus/prometheus.yml << EOF
global:
  scrape_interval: 20s
  scrape_timeout: 5s
  evaluation_interval: 10s
scrape_configs:
  - job_name: prometheus
    scrape_interval: 5s
    static_configs:
      - targets:
          - 172.27.124.69:9090 # address of this prometheus server; pro01 in this example
        labels:
          instance: prometheus
  - job_name: pushgateway
    scrape_interval: 5s
    static_configs:
      - targets:
          - 172.27.124.69:9091 # address of this prometheus server; pro01 in this example
        labels:
          instance: pushgateway
  - job_name: node_exporter
    scrape_interval: 10s
    static_configs:
      - targets:
          - 172.27.124.72:9100 # address of the monitored server running the exporter; controller in this example
        labels:
          instance: controller # name of the monitored server, to tell machines apart; hostname recommended
  - job_name: haproxy_exporter
    scrape_interval: 5s
    static_configs:
      - targets:
          - 172.27.124.72:9101 # address of the monitored server running the exporter; controller in this example
        labels:
          instance: controller # name of the monitored server, to tell machines apart; hostname recommended
  - job_name: rabbitmq_exporter
    scrape_interval: 5s
    static_configs:
      - targets:
          - 172.27.124.72:9102 # address of the monitored server running the exporter; controller in this example
        labels:
          instance: controller # name of the monitored server, to tell machines apart; hostname recommended
  - job_name: collectd_exporter
    scrape_interval: 5s
    static_configs:
      - targets:
          - 172.27.124.72:9103 # address of the monitored server running the exporter; controller in this example
        labels:
          instance: controller # name of the monitored server, to tell machines apart; hostname recommended
  - job_name: mysqld_exporter
    scrape_interval: 5s
    static_configs:
      - targets:
          - 172.27.124.72:9104 # address of the monitored server running the exporter; controller in this example
        labels:
          instance: controller # name of the monitored server, to tell machines apart; hostname recommended
  - job_name: memcached_exporter
    scrape_interval: 5s
    static_configs:
      - targets:
          - 172.27.124.72:9105 # address of the monitored server running the exporter; controller in this example
        labels:
          instance: controller # name of the monitored server, to tell machines apart; hostname recommended
EOF
# Next, register Prometheus as a system service so it can be managed with systemctl
cat > /usr/lib/systemd/system/prometheus.service << EOF
[Unit]
Description=Prometheus
Documentation=https://prometheus.io/
After=network.target
[Service]
Type=simple
User=prometheus
ExecStart=/usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml --storage.tsdb.path=/var/lib/prometheus-data
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
# Register the pushgateway service
cat > /usr/lib/systemd/system/pushgateway.service << EOF
[Unit]
Description=pushgateway
After=local-fs.target network-online.target
Requires=local-fs.target network-online.target
[Service]
Type=simple
ExecStart=/usr/local/pushgateway/pushgateway
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
# Enable at boot and start
systemctl daemon-reload
systemctl enable prometheus
systemctl enable pushgateway
systemctl start prometheus
systemctl start pushgateway
Install haproxy and keepalived on pro01 and pro02
# Plain yum install; both are mainstream packages available in the stock yum repos
yum -y install haproxy keepalived
# Configure haproxy; note the inline comments
echo "
global
    log 127.0.0.1 local2
    chroot /var/lib/haproxy
    pidfile /var/run/haproxy.pid
    maxconn 4000
    user haproxy
    group haproxy
    daemon
    stats socket /var/lib/haproxy/stats
defaults
    mode http
    log global
    option httplog
    option dontlognull
    option http-server-close
    option forwardfor except 127.0.0.0/8
    option redispatch
    retries 3
    timeout http-request 10s
    timeout queue 1m
    timeout connect 10s
    timeout client 1m
    timeout server 1m
    timeout http-keep-alive 10s
    timeout check 10s
    maxconn 3000
listen prometheus-server
    bind 172.27.124.71:9090 # the virtual VIP's address; pick any port as long as it doesn't conflict
    balance roundrobin
    option tcpka
    option httpchk
    option httplog
    server pro01 172.27.124.69:9090 check inter 2000 rise 2 fall 5 # pro01's address
    server pro02 172.27.124.70:9090 check inter 2000 rise 2 fall 5 # pro02's address
" > /etc/haproxy/haproxy.cfg
# Configure keepalived; note the inline comments
echo "
vrrp_script chk_pro_server_state {
    script \"/etc/prometheus/check_pro.sh\" # path to the health-check script, introduced below
    interval 5
    fall 2
    rise 2
}
vrrp_instance haproxy_vip {
    state MASTER
    interface enp0s3 # NIC that carries the VIP; if unsure, run 'ip a' and use the NIC holding this machine's IP
    virtual_router_id 71
    priority 100
    accept
    garp_master_refresh 5
    garp_master_refresh_repeat 2
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 123456
    }
    unicast_src_ip 172.27.124.69 # the master's IP; pro01 here
    unicast_peer {
        172.27.124.70 # list of backup server IPs; I only have one, pro02
    }
    virtual_ipaddress {
        172.27.124.71 # the virtual IP
    }
    track_script {
        chk_pro_server_state
    }
}
" > /etc/keepalived/keepalived.conf
# Add the health-check script; checking for a prometheus process is enough
# (write it with a quoted heredoc so the command substitution isn't expanded while writing the file)
mkdir -p /etc/prometheus
cat > /etc/prometheus/check_pro.sh << 'EOF'
#!/bin/bash
count=$(ps aux | grep -v grep | grep prometheus | wc -l)
if [ "$count" -gt 0 ]; then
    exit 0
else
    exit 1
fi
EOF
# Make the health-check script executable
chmod +x /etc/prometheus/check_pro.sh
# Enable at boot and start
systemctl enable haproxy
systemctl enable keepalived
systemctl start haproxy
systemctl start keepalived
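Once both nodes are up, it's worth a quick sanity check that the VIP landed on the master and that haproxy forwards to Prometheus (illustrative commands, run on pro01; /-/healthy is Prometheus 2.x's standard health endpoint):

```shell
# The VIP should be attached to the master's NIC
ip addr show enp0s3 | grep 172.27.124.71
# Prometheus should answer through the VIP via haproxy
curl -s http://172.27.124.71:9090/-/healthy
```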
Install the exporters on the controller node
# Unpack all the exporters under /usr/local
tar -zxvf /home/pkg/node_exporter-0.18.1.linux-amd64.tar.gz -C /usr/local/
tar -zxvf /home/pkg/collectd_exporter-0.4.0.linux-amd64.tar.gz -C /usr/local/
tar -zxvf /home/pkg/mysqld_exporter-0.12.1.linux-amd64.tar.gz -C /usr/local/
tar -zxvf /home/pkg/haproxy_exporter-0.10.0.linux-amd64.tar.gz -C /usr/local/
tar -zxvf /home/pkg/rabbitmq_exporter-0.29.0.linux-amd64.tar.gz -C /usr/local/
tar -zxvf /home/pkg/memcached_exporter-0.6.0.linux-amd64.tar.gz -C /usr/local/
# Rename
mv /usr/local/node_exporter-0.18.1.linux-amd64/ /usr/local/node_exporter
mv /usr/local/collectd_exporter-0.4.0.linux-amd64/ /usr/local/collectd_exporter
mv /usr/local/mysqld_exporter-0.12.1.linux-amd64/ /usr/local/mysqld_exporter
mv /usr/local/haproxy_exporter-0.10.0.linux-amd64/ /usr/local/haproxy_exporter
mv /usr/local/rabbitmq_exporter-0.29.0.linux-amd64/ /usr/local/rabbitmq_exporter
mv /usr/local/memcached_exporter-0.6.0.linux-amd64/ /usr/local/memcached_exporter
# Install the rpm packages collectd needs
cd /home
rpm -hiv /home/pkg/yajl-2.0.4-4.el7.x86_64.rpm
rpm -hiv /home/pkg/collectd-5.8.1-1.el7.x86_64.rpm
rpm -hiv /home/pkg/collectd-virt-5.8.1-1.el7.x86_64.rpm
# Create the prometheus user and set permissions
groupadd prometheus
useradd -g prometheus -s /sbin/nologin prometheus
chown -R prometheus:prometheus /usr/local/node_exporter
chown -R prometheus:prometheus /usr/local/collectd_exporter
chown -R prometheus:prometheus /usr/local/mysqld_exporter
chown -R prometheus:prometheus /usr/local/haproxy_exporter/
chown -R prometheus:prometheus /usr/local/rabbitmq_exporter/
chown -R prometheus:prometheus /usr/local/memcached_exporter/
# Add a guest user to MySQL (no MySQL? then what are you installing a mysql exporter for — go install MySQL first)
mysql -uroot -p{your mysql password} -e "grant replication client,process on *.* to guest@'%' identified by 'guest';"
mysql -uroot -p{your mysql password} -e "grant select on performance_schema.* to guest@'%';"
# Add the collectd config (for OpenStack VM monitoring); note the inline comments
cat >> /etc/collectd.conf << EOF
LoadPlugin cpu
LoadPlugin memory
LoadPlugin interface
LoadPlugin write_http
LoadPlugin virt
<Plugin cpu>
  ReportByCpu true
  ReportByState true
  ValuesPercentage true
</Plugin>
<Plugin memory>
  ValuesAbsolute true
  ValuesPercentage false
</Plugin>
<Plugin interface>
  Interface "enp0s3" # name of the physical NIC carrying the openstack public network
  IgnoreSelected false
</Plugin>
<Plugin write_http>
  <Node "collectd_exporter">
    URL "http://172.27.124.72:9103/collectd-post" # IP of the monitored node, controller here
    Format "JSON"
    StoreRates false
  </Node>
</Plugin>
<Plugin virt>
  Connection "qemu:///system"
  RefreshInterval 10
  HostnameFormat name
  PluginInstanceFormat name
  BlockDevice "/:hd[a-z]/"
  IgnoreSelected true
</Plugin>
EOF
# Add the mysql exporter credentials file
mkdir -p /etc/mysqld_exporter/conf
cat > /etc/mysqld_exporter/conf/.my.cnf << EOF
[client]
user=guest
password=guest
host=172.27.124.72 # your mysql address
port=3306 # port
EOF
# Add the rabbitmq exporter environment file; the user and password must already exist in rabbitmq — I'll skip showing that
cat > /etc/sysconfig/rabbitmq_exporter << EOF
RABBIT_USER="guest"
RABBIT_PASSWORD="guest"
OUTPUT_FORMAT="JSON"
PUBLISH_PORT="9102"
RABBIT_URL="http://172.27.124.72:15672"
EOF
# Register the node exporter service
cat > /usr/lib/systemd/system/node_exporter.service << EOF
[Unit]
Description=node_exporter
Documentation=https://prometheus.io/
After=network.target
[Service]
Type=simple
User=prometheus
ExecStart=/usr/local/node_exporter/node_exporter
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
# Register the haproxy exporter service
cat > /usr/lib/systemd/system/haproxy_exporter.service << EOF
[Unit]
Description=Prometheus HAproxy Exporter
After=network.target
[Service]
User=prometheus
Group=prometheus
Type=simple
Restart=always
ExecStart=/usr/local/haproxy_exporter/haproxy_exporter --haproxy.scrape-uri=http://admin:[email protected]:8080/haproxy?openstack;csv --web.listen-address=:9101
# The address above is the monitored haproxy, not the one on the pro nodes.
# Of course you can monitor the haproxy on pro01/pro02 too: install haproxy exporter on the pro
# nodes and register it in the Prometheus config.
[Install]
WantedBy=multi-user.target
EOF
# Register the rabbitmq exporter service
cat > /usr/lib/systemd/system/rabbitmq_exporter.service << EOF
[Unit]
Description=Prometheus RabbitMQ Exporter
After=network.target
[Service]
User=prometheus
Group=prometheus
Type=simple
Restart=always
EnvironmentFile=/etc/sysconfig/rabbitmq_exporter
ExecStart=/usr/local/rabbitmq_exporter/rabbitmq_exporter
[Install]
WantedBy=multi-user.target
EOF
# Register the collectd exporter service
cat > /usr/lib/systemd/system/collectd_exporter.service << EOF
[Unit]
Description=Collectd_exporter
After=network-online.target
Requires=network-online.target
[Service]
Type=simple
User=prometheus
ExecStart=/bin/bash -l -c /usr/local/collectd_exporter/collectd_exporter --web.listen-address=:9103 --web.collectd-push-path=/collectd-post
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
# Register the collectd service
cat > /usr/lib/systemd/system/collectd.service << EOF
[Unit]
Description=Collectd statistics daemon
Documentation=man:collectd(1) man:collectd.conf(5)
After=local-fs.target network-online.target
Requires=local-fs.target network-online.target
[Service]
ExecStart=/usr/sbin/collectd
Restart=on-failure
Type=notify
[Install]
WantedBy=multi-user.target
EOF
# Register the mysql exporter service
cat > /usr/lib/systemd/system/mysqld_exporter.service << EOF
[Unit]
Description=Prometheus MySQL Exporter
After=network.target
[Service]
User=prometheus
Group=prometheus
Type=simple
Restart=always
ExecStart=/usr/local/mysqld_exporter/mysqld_exporter \
--config.my-cnf=/etc/mysqld_exporter/conf/.my.cnf \
--web.listen-address=:9104 \
--collect.global_status \
--collect.info_schema.innodb_metrics \
--collect.auto_increment.columns \
--collect.info_schema.processlist \
--collect.binlog_size \
--collect.info_schema.tablestats \
--collect.global_variables \
--collect.info_schema.query_response_time \
--collect.info_schema.userstats \
--collect.info_schema.tables \
--collect.slave_status
[Install]
WantedBy=multi-user.target
EOF
# Register the memcached exporter service
cat > /usr/lib/systemd/system/memcached_exporter.service << EOF
[Unit]
Description=Prometheus Memcached Exporter
After=network.target
[Service]
User=prometheus
Group=prometheus
Type=simple
Restart=always
ExecStart=/usr/local/memcached_exporter/memcached_exporter --memcached.address=127.0.0.1:11211 --web.listen-address=:9105
[Install]
WantedBy=multi-user.target
EOF
# Enable at boot and start
systemctl daemon-reload
systemctl enable node_exporter
systemctl enable haproxy_exporter
systemctl enable rabbitmq_exporter
systemctl enable collectd
systemctl enable collectd_exporter
systemctl enable mysqld_exporter
systemctl enable memcached_exporter
systemctl start node_exporter
systemctl start haproxy_exporter
systemctl start rabbitmq_exporter
systemctl start collectd
systemctl start collectd_exporter
systemctl start mysqld_exporter
systemctl start memcached_exporter
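With everything running, each exporter should be serving metrics on its port; a quick loop like this (ports as registered in the prometheus.yml above) shows whether they all respond:

```shell
# Probe each exporter's /metrics endpoint on the controller; expect HTTP 200 for each
for port in 9100 9101 9102 9103 9104 9105; do
    code=$(curl -s -o /dev/null -w '%{http_code}' http://172.27.124.72:${port}/metrics)
    echo "port ${port}: HTTP ${code}"
done
```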
Configure the kernel parameters m3db needs on m3db01, m3db02, m3db03
sysctl -w vm.max_map_count=3000000
sysctl -w vm.swappiness=1
sysctl -n fs.file-max
sysctl -n fs.nr_open
sysctl -w fs.file-max=3000000
sysctl -w fs.nr_open=3000000
ulimit -n 3000000
# I'll leave making these permanent to you.
# You can also skip this, but the flood of warnings will be unbearable.
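To make them permanent, files along these lines work (the file names here are my own choice, not mandated):

```
# /etc/sysctl.d/99-m3db.conf  -  applied at boot, or immediately with 'sysctl --system'
vm.max_map_count=3000000
vm.swappiness=1
fs.file-max=3000000
fs.nr_open=3000000

# /etc/security/limits.d/99-m3db.conf  -  raises the open-file limit for new sessions
* soft nofile 3000000
* hard nofile 3000000
```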
Install the M3DB cluster on m3db01, m3db02, m3db03 and start it
# Unpack the tarball
cd /home
tar -zxvf /home/pkg/m3_0.13.0_linux_amd64.tar.gz
# Rename
mv /home/m3_0.13.0_linux_amd64 /home/m3db
# Create the config file /home/m3db/m3dbnode.yml with the following content
cat > /home/m3db/m3dbnode.yml << EOF
coordinator:
  listenAddress:
    type: "config"
    value: "0.0.0.0:7201"
  local:
    namespaces:
      - namespace: default # unaggregated namespace name; required, you'll get errors without it
        type: unaggregated
        retention: 48h
      - namespace: agg # aggregated namespace name; I just call it agg, pick whatever you like
        type: aggregated
        retention: 48h
        resolution: 10s
  logging:
    level: info
  metrics:
    scope:
      prefix: "coordinator"
    prometheus:
      handlerPath: /metrics
      listenAddress: 0.0.0.0:7203 # until https://github.com/m3db/m3/issues/682 is resolved
    sanitization: prometheus
    samplingRate: 1.0
    extended: none
  tagOptions:
    # Configuration setting for generating metric IDs from tags.
    idScheme: quoted
db:
  logging:
    level: info
  metrics:
    prometheus:
      handlerPath: /metrics
    sanitization: prometheus
    samplingRate: 1.0
    extended: detailed
  hostID:
    resolver: config
    value: m3db01 # this machine's host name
  config:
    service:
      env: default_env
      zone: embedded
      service: m3db
      cacheDir: /var/lib/m3kv
      etcdClusters:
        - zone: embedded
          endpoints: # etcd address list; m3db embeds etcd, so just list the m3db addresses
            - 172.27.124.66:2379
            - 172.27.124.67:2379
            - 172.27.124.68:2379
    seedNodes:
      initialCluster:
        - hostID: m3db01 # etcd host name; co-located with m3db here, so also m3db01 (same for the IPs below)
          endpoint: http://172.27.124.66:2380
        - hostID: m3db02
          endpoint: http://172.27.124.67:2380
        - hostID: m3db03
          endpoint: http://172.27.124.68:2380
  listenAddress: 0.0.0.0:9000
  clusterListenAddress: 0.0.0.0:9001
  httpNodeListenAddress: 0.0.0.0:9002
  httpClusterListenAddress: 0.0.0.0:9003
  debugListenAddress: 0.0.0.0:9004
  client:
    writeConsistencyLevel: majority
    readConsistencyLevel: unstrict_majority
  gcPercentage: 100
  writeNewSeriesAsync: true
  writeNewSeriesLimitPerSecond: 1048576
  writeNewSeriesBackoffDuration: 2ms
  bootstrap:
    bootstrappers:
      - filesystem
      - commitlog
      - peers
      - uninitialized_topology
    commitlog:
      returnUnfulfilledForCorruptCommitLogFiles: false
  cache:
    series:
      policy: lru
    postingsList:
      size: 262144
  commitlog:
    flushMaxBytes: 524288
    flushEvery: 1s
    queue:
      calculationType: fixed
      size: 2097152
  fs:
    filePathPrefix: /var/lib/m3db
EOF
# Start m3db once all three m3db nodes are configured.
# I'm lazy and just start it with nohup instead of registering a service; do that yourselves, it's easy anyway.
# Run this on all three nodes
nohup /home/m3db/m3dbnode -f /home/m3db/m3dbnode.yml &
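If you'd rather register it properly instead of using nohup, a minimal unit along these lines should do (a sketch; paths match the layout used above, and LimitNOFILE mirrors the ulimit we set earlier):

```
# /usr/lib/systemd/system/m3dbnode.service
[Unit]
Description=M3DB node
After=network-online.target
[Service]
Type=simple
ExecStart=/home/m3db/m3dbnode -f /home/m3db/m3dbnode.yml
Restart=on-failure
LimitNOFILE=3000000
[Install]
WantedBy=multi-user.target
```

Then systemctl daemon-reload / enable / start as with the other services.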
Initialize m3db and create the namespaces from any one m3db node
(a namespace means the same thing as a database; this is just creating databases)
# Install curl, no explanation needed
yum -y install curl
# Initialize the unaggregated placement (a placement is like a domain — basically a description of what the cluster looks like)
# Ignore the big pile of returned JSON; it just proves the call succeeded.
curl -X POST localhost:7201/api/v1/services/m3db/placement/init -d '{
"num_shards": "512",
"replication_factor": "3",
"instances": [
{
"id": "m3db01",
"isolation_group": "us-east1-a",
"zone": "embedded",
"weight": 100,
"endpoint": "172.27.124.66:9000",
"hostname": "172.27.124.66",
"port": 9000
},
{
"id": "m3db02",
"isolation_group": "us-east1-b",
"zone": "embedded",
"weight": 100,
"endpoint": "172.27.124.67:9000",
"hostname": "172.27.124.67",
"port": 9000
},
{
"id": "m3db03",
"isolation_group": "us-east1-c",
"zone": "embedded",
"weight": 100,
"endpoint": "172.27.124.68:9000",
"hostname": "172.27.124.68",
"port": 9000
}
]
}'
# Initialize the aggregated placement
curl -X POST localhost:7201/api/v1/services/m3aggregator/placement/init -d '{
"num_shards": "512",
"replication_factor": "3",
"instances": [
{
"id": "m3db01",
"isolation_group": "us-east1-a",
"zone": "embedded",
"weight": 100,
"endpoint": "172.27.124.66:9000",
"hostname": "172.27.124.66",
"port": 9000
},
{
"id": "m3db02",
"isolation_group": "us-east1-b",
"zone": "embedded",
"weight": 100,
"endpoint": "172.27.124.67:9000",
"hostname": "172.27.124.67",
"port": 9000
},
{
"id": "m3db03",
"isolation_group": "us-east1-c",
"zone": "embedded",
"weight": 100,
"endpoint": "172.27.124.68:9000",
"hostname": "172.27.124.68",
"port": 9000
}
]
}'
# Create the default unaggregated namespace
curl -X POST http://localhost:7201/api/v1/database/create -d '{
"type": "cluster",
"namespaceName": "default",
"retentionTime": "48h",
"numShards": "512",
"replicationFactor": "3"
}'
# Create the agg aggregated namespace
curl -X POST http://localhost:7201/api/v1/database/create -d '{
"type": "cluster",
"namespaceName": "agg",
"retentionTime": "48h",
"numShards": "512",
"replicationFactor": "3"
}'
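To confirm the placement and namespaces actually exist, the coordinator also exposes read endpoints on the same port 7201; on this era of m3 something like the following should list them (verify the exact paths against your release):

```shell
# Show the m3db placement; all three instances should be listed with their shards
curl -s localhost:7201/api/v1/services/m3db/placement
# Show the registered namespaces; default and agg should both appear
curl -s localhost:7201/api/v1/namespace
```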
Install and start m3coordinator on pro01 and pro02
# Unpack the tarball
cd /home
tar -zxvf /home/pkg/m3_0.13.0_linux_amd64.tar.gz
# Rename
mv /home/m3_0.13.0_linux_amd64 /home/m3db
# Create the config file /home/m3db/m3coordinator.yml with the following content
cat > /home/m3db/m3coordinator.yml << EOF
listenAddress:
  type: "config"
  value: "0.0.0.0:8201"
  # m3coordinator can live on the same host as an m3db node, so don't use port 7201 here or they will conflict
logging:
  level: info
metrics:
  scope:
    prefix: "coordinator"
  prometheus:
    handlerPath: /metrics
    listenAddress: 0.0.0.0:8203 # likewise, don't use 7203
  sanitization: prometheus
  samplingRate: 1.0
  extended: none
tagOptions:
  idScheme: quoted
clusters:
  - namespaces:
      - namespace: default # unaggregated namespace name; required, you'll get errors without it
        type: unaggregated
        retention: 48h
      - namespace: agg # aggregated namespace name; I just call it agg, pick whatever you like
        type: aggregated
        retention: 48h
        resolution: 10s
    client:
      config:
        service:
          env: default_env
          zone: embedded
          service: m3db
          cacheDir: /var/lib/m3kv
          etcdClusters:
            - zone: embedded
              endpoints: # etcd address list; m3db embeds etcd, so just list the m3db addresses
                - 172.27.124.66:2379
                - 172.27.124.67:2379
                - 172.27.124.68:2379
      writeConsistencyLevel: majority
      readConsistencyLevel: unstrict_majority
EOF
# Lazy again — starting with nohup; don't copy me, go register a proper system service
nohup /home/m3db/m3coordinator -f /home/m3db/m3coordinator.yml &
Install and start m3query on pro01 and pro02
# Create the config file /home/m3db/m3query.yml with the following content
cat > /home/m3db/m3query.yml << EOF
listenAddress:
  type: "config"
  value: "0.0.0.0:5201"
  # m3query can live on the same host as an m3db node, so don't use port 7201 here or they will conflict
logging:
  level: info
metrics:
  scope:
    prefix: "coordinator"
  prometheus:
    handlerPath: /metrics
    listenAddress: 0.0.0.0:5203 # likewise, don't use 7203
  sanitization: prometheus
  samplingRate: 1.0
  extended: none
tagOptions:
  idScheme: quoted
clusters:
  - namespaces:
      - namespace: default # unaggregated namespace name; required, you'll get errors without it
        type: unaggregated
        retention: 48h
      - namespace: agg # aggregated namespace name; pick what you like, but it must match what was created earlier
        type: aggregated
        retention: 48h
        resolution: 10s
    client:
      config:
        service:
          env: default_env
          zone: embedded
          service: m3db
          cacheDir: /var/lib/m3kv
          etcdClusters:
            - zone: embedded
              endpoints: # etcd address list; m3db embeds etcd, so just list the m3db addresses
                - 172.27.124.66:2379
                - 172.27.124.67:2379
                - 172.27.124.68:2379
      writeConsistencyLevel: majority
      readConsistencyLevel: unstrict_majority
      writeTimeout: 10s
      fetchTimeout: 15s
      connectTimeout: 20s
      writeRetry:
        initialBackoff: 500ms
        backoffFactor: 3
        maxRetries: 2
        jitter: true
      fetchRetry:
        initialBackoff: 500ms
        backoffFactor: 2
        maxRetries: 3
        jitter: true
      backgroundHealthCheckFailLimit: 4
      backgroundHealthCheckFailThrottleFactor: 0.5
EOF
# Lazy again — starting with nohup; don't copy me, go register a proper system service
nohup /home/m3db/m3query -f /home/m3db/m3query.yml &
Point the pro nodes' data source at the m3db cluster
# It only takes appending a few lines to the config on pro01 and pro02
echo "
remote_read:
  - url: \"http://localhost:5201/api/v1/prom/remote/read\"
    # the read address — use m3query's port
    read_recent: true
remote_write:
  - url: \"http://localhost:8201/api/v1/prom/remote/write\"
    # the write address — use m3coordinator's port
" >> /usr/local/prometheus/prometheus.yml
# Restart the prometheus service and be patient — m3db takes quite a while to initialize the databases.
# You can watch the m3coordinator and m3query logs to see whether remote reads and writes have started.
# Run on both pro nodes
systemctl restart prometheus
Verify that everything works
Verification is simple: after the cluster has run for a while, query data through m3query. If you get data back, it was indeed written into the m3db cluster, and m3query has retrieved the aggregated data.
# Just open this URL in a browser
# Adjust the timestamps and step to your environment, and pick your own query key; this is only an example.
# The IP can be either pro01's or pro02's, but the port must be m3query's, otherwise PromQL isn't supported
http://172.27.124.69:5201/api/v1/query_range?query=mysql_up&start=1570084589.681&end=1570088189.681&step=14&_=1570087504176
Try the high-availability checks yourself: stop one m3 node and see what happens, then stop either prometheus node and check whether monitoring data can still be fetched through our virtual IP.
In my environment, checking http://172.27.124.71:9090/targets is all it takes.
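As a concrete failover drill, something like this (run the stop on whichever pro node currently holds the VIP; the keepalived check script then fails and the VIP should move):

```shell
# On the current master: stop prometheus so the health check fails
systemctl stop prometheus
# From anywhere: the VIP should keep serving targets, now from the other pro node
curl -s http://172.27.124.71:9090/targets
# Bring the stopped node back afterwards
systemctl start prometheus
```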