Monitoring Envoy with Prometheus Operator


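Overview

Prometheus Operator is arguably the best-practice way to run a monitoring stack. It builds the whole monitoring system in one step, lets you configure things such as scrape targets non-intrusively,
and provides automatic failure recovery, highly available alerting, and more.

It still has a small learning curve for newcomers, so this article uses Envoy monitoring as a worked example to show how to use Prometheus Operator properly.

How to write alerting rules and Prometheus queries is not the focus here and will be covered in later articles; this article concentrates on using Prometheus Operator itself.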

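Installing Prometheus Operator

The sealyun offline installation package already includes Prometheus Operator, so it is ready to use once the installation finishes.

Configuring scrape targets

How it works: the operator discovers the Services to scrape through its CRDs.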
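
The discovery chain can be sketched from the Prometheus side. The fragment below is an illustrative excerpt of a Prometheus custom resource, assuming kube-prometheus style defaults (a resource named k8s in the monitoring namespace, empty selectors that match every ServiceMonitor, and a ruleSelector keyed on the prometheus and role labels). The values actually shipped in the sealyun package may differ, so check them with kubectl get prometheus -n monitoring -o yaml.

```yaml
# Excerpt only, values are assumptions based on kube-prometheus defaults
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s
  namespace: monitoring
spec:
  # {} selects ServiceMonitors in all namespaces the operator can see
  serviceMonitorNamespaceSelector: {}
  # {} selects every ServiceMonitor regardless of its labels
  serviceMonitorSelector: {}
  # PrometheusRule objects must carry these labels to be loaded
  ruleSelector:
    matchLabels:
      prometheus: k8s
      role: alert-rules
```

Prometheus picks up ServiceMonitors through these selectors, and each ServiceMonitor in turn selects Services by label and scrapes the named port on the endpoints behind them.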

Starting Envoy

apiVersion: apps/v1
kind: Deployment
metadata:
  name: envoy
  labels:
    app: envoy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: envoy
  template:
    metadata:
      labels:
        app: envoy
    spec:
      volumes:
      - hostPath:   # mount the Envoy config directory from the host for easy editing
          path: /root/envoy
          type: DirectoryOrCreate
        name: envoy
      containers:
      - name: envoy
        volumeMounts:
        - mountPath: /etc/envoy
          name: envoy
          readOnly: true
        image: envoyproxy/envoy:latest
        ports:
        - containerPort: 10000 # data port
        - containerPort: 9901  # admin port; metrics are exposed on this port

---
kind: Service
apiVersion: v1
metadata:
  name: envoy
  labels:
    app: envoy  # label the Service; the operator looks up the Service by this label
spec:
  selector:
    app: envoy
  ports:
  - protocol: TCP
    port: 80
    targetPort: 10000
    name: user
  - protocol: TCP   # Service port that exposes the metrics
    port: 81
    targetPort: 9901
    name: metrics   # the name matters: the ServiceMonitor refers to the port by name

Envoy configuration file:
The admin listen address must be changed to 0.0.0.0, otherwise the metrics cannot be fetched through the Service.
/root/envoy/envoy.yaml

admin:
  access_log_path: /tmp/admin_access.log
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0   # must be 0.0.0.0, not 127.0.0.1
      port_value: 9901
static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address:
        protocol: TCP
        address: 0.0.0.0
        port_value: 10000
    filter_chains:
    - filters:
      - name: envoy.http_connection_manager
        config:
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              domains: ["*"]
              routes:
              - match:
                  prefix: "/"
                route:
                  host_rewrite: sealyun.com
                  cluster: service_sealyun
          http_filters:
          - name: envoy.router
  clusters:
  - name: service_sealyun
    connect_timeout: 0.25s
    type: LOGICAL_DNS
    # Comment out the following line to test on v6 networks
    dns_lookup_family: V4_ONLY
    lb_policy: ROUND_ROBIN
    hosts:
      - socket_address:
          address: sealyun.com
          port_value: 443
    tls_context: { sni: sealyun.com }
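
The configuration above uses older Envoy field names (config, hosts, tls_context), while the Deployment pulls envoyproxy/envoy:latest. Recent Envoy releases only accept the v3 API, so a rough v3 equivalent is sketched below. It is an adaptation, not the author's original file, and assumes a current Envoy image; verify it against the Envoy version you actually run.

```yaml
# Sketch of the same setup in Envoy v3 syntax (an assumption, not from the original article)
admin:
  address:
    socket_address:
      address: 0.0.0.0   # must be reachable through the Service
      port_value: 9901
static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 10000
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              domains: ["*"]
              routes:
              - match:
                  prefix: "/"
                route:
                  host_rewrite_literal: sealyun.com
                  cluster: service_sealyun
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
  clusters:
  - name: service_sealyun
    connect_timeout: 0.25s
    type: LOGICAL_DNS
    dns_lookup_family: V4_ONLY
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: service_sealyun
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: sealyun.com
                port_value: 443
    transport_socket:
      name: envoy.transport_sockets.tls
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
        sni: sealyun.com
```

The monitoring-relevant part is unchanged: the admin interface still listens on 0.0.0.0:9901 and serves /stats/prometheus.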

Using a ServiceMonitor

envoyServiceMonitor.yaml:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app: envoy
  name: envoy
  namespace: monitoring  # may differ from the namespace of the Service
spec:
  endpoints:
  - interval: 15s
    port: metrics        # port name on the envoy Service
    path: /stats/prometheus # metrics path
  namespaceSelector:
    matchNames:        # namespace where the envoy Service lives
    - default
  selector:
    matchLabels:
      app: envoy       # selects the envoy Service

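Once the ServiceMonitor has been created, the Envoy target shows up in Prometheus, and its metrics can then be queried.
From there you can set up dashboards in Grafana; using Prometheus itself is outside the scope of this article.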

Alerting configuration

Alertmanager configuration

[root@dev-86-201 envoy]# kubectl get secret -n monitoring
NAME                              TYPE                                  DATA   AGE
alertmanager-main                 Opaque                                1      27d

The alertmanager-main Secret is listed here. Take a look at its contents:

[root@dev-86-201 envoy]# kubectl get secret  alertmanager-main -o yaml -n monitoring
apiVersion: v1
data:
  alertmanager.yaml: Imdsb2JhbCI6IAogICJyZXNvbHZlX3RpbWVvdXQiOiAiNW0iCiJyZWNlaXZlcnMiOiAKLSAibmFtZSI6ICJudWxsIgoicm91dGUiOiAKICAiZ3JvdXBfYnkiOiAKICAtICJqb2IiCiAgImdyb3VwX2ludGVydmFsIjogIjVtIgogICJncm91cF93YWl0IjogIjMwcyIKICAicmVjZWl2ZXIiOiAibnVsbCIKICAicmVwZWF0X2ludGVydmFsIjogIjEyaCIKICAicm91dGVzIjogCiAgLSAibWF0Y2giOiAKICAgICAgImFsZXJ0bmFtZSI6ICJEZWFkTWFuc1N3aXRjaCIKICAgICJyZWNlaXZlciI6ICJudWxsIg==
kind: Secret

Base64-decode it:

"global":
  "resolve_timeout": "5m"
"receivers":
- "name": "null"
"route":
  "group_by":
  - "job"
  "group_interval": "5m"
  "group_wait": "30s"
  "receiver": "null"
  "repeat_interval": "12h"
  "routes":
  - "match":
      "alertname": "DeadMansSwitch"
    "receiver": "null"

So configuring Alertmanager is very simple: just create a Secret.
For example, alertmanager.yaml:

global:
  smtp_smarthost: 'smtp.qq.com:465'
  smtp_from: '[email protected]'
  smtp_auth_username: '[email protected]'
  smtp_auth_password: 'xxx'       # authorization code generated after enabling SMTP; see the mailbox section below
  smtp_require_tls: false
route:
  group_by: ['alertmanager','cluster','service']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 3h
  receiver: 'fanux'
  routes:
  - receiver: 'fanux'
receivers:
- name: 'fanux'
  email_configs:
  - to: '[email protected]'
    send_resolved: true

Delete the old Secret and recreate it from your own configuration:

kubectl delete secret alertmanager-main -n monitoring
kubectl create secret generic alertmanager-main --from-file=alertmanager.yaml -n monitoring
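
As an alternative to deleting and recreating the Secret, the same configuration can be kept in a manifest and applied with kubectl apply -f. The sketch below uses the stringData field so the file does not have to be base64-encoded by hand. The mail addresses are hypothetical placeholders (the originals are not shown in this article), and the file name alertmanager-secret.yaml is only an illustration.

```yaml
# alertmanager-secret.yaml (hypothetical file name)
apiVersion: v1
kind: Secret
metadata:
  name: alertmanager-main
  namespace: monitoring
type: Opaque
stringData:
  # The key must be alertmanager.yaml, matching the existing Secret
  alertmanager.yaml: |
    global:
      smtp_smarthost: 'smtp.qq.com:465'
      smtp_from: 'sender@example.com'          # placeholder address
      smtp_auth_username: 'sender@example.com' # placeholder address
      smtp_auth_password: 'xxx'                # SMTP authorization code
      smtp_require_tls: false
    route:
      group_by: ['alertmanager','cluster','service']
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 3h
      receiver: 'fanux'
    receivers:
    - name: 'fanux'
      email_configs:
      - to: 'receiver@example.com'             # placeholder address
        send_resolved: true
```

Keeping the manifest under version control makes later changes easier to review, but since it contains the SMTP authorization code, treat the file as sensitive.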

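Mailbox configuration, using QQ Mail as an example

Enable the SMTP/POP3 service in the mailbox settings.
Follow the prompts; at the end a dialog shows an authorization code. Put it into smtp_auth_password in the configuration file above.
Alerts will then arrive by email.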

Alerting rule configuration

Prometheus Operator defines the PrometheusRule CRD to describe alerting rules:

[root@dev-86-202 shell]# kubectl get PrometheusRule -n monitoring
NAME                   AGE
prometheus-k8s-rules   6m

You can edit this rule directly, or create a PrometheusRule of your own:

kubectl edit PrometheusRule prometheus-k8s-rules -n monitoring

For example, add an alert to a group:

spec:
  groups:
  - name: ./example.rules
    rules:
    - alert: ExampleAlert
      expr: vector(1)
  - name: k8s.rules
    rules:

Restart the Prometheus pods:

kubectl delete pod prometheus-k8s-0 prometheus-k8s-1 -n monitoring

The newly added rule then appears in the Prometheus UI.
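
Instead of editing prometheus-k8s-rules, a separate PrometheusRule can hold the Envoy alerts. The sketch below makes several assumptions: the object name envoy-rules and the alert name are made up for illustration, the prometheus and role labels follow the kube-prometheus convention for ruleSelector (check your Prometheus resource for the labels it actually requires), and the job label is assumed to be envoy, which is the ServiceMonitor default of using the Service name.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: envoy-rules          # hypothetical name
  namespace: monitoring
  labels:
    prometheus: k8s          # assumed to match the Prometheus ruleSelector
    role: alert-rules
spec:
  groups:
  - name: envoy.rules
    rules:
    - alert: EnvoyTargetDown # hypothetical alert name
      # "up" is set by Prometheus for every scrape target; 0 means the scrape failed
      expr: up{job="envoy"} == 0
      for: 5m
      labels:
        severity: warning
      annotations:
        message: Prometheus cannot scrape the envoy metrics endpoint.
```

A standalone object like this survives upgrades of the bundled rules, since nothing in prometheus-k8s-rules has to be modified.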

For discussion, join QQ group 98488045.

WeChat official account:

sealyun
