For installing and deploying Prometheus Operator, see "Deploying Prometheus Operator with Helm and Custom Monitoring".
For installing and deploying Ambassador Edge Stack, see "Ambassador Series 11: Installing Ambassador Edge Stack 1.1.0 with Helm".
An overview of the cluster after Ambassador is installed:
kubectl get all -nambassador
NAME READY STATUS RESTARTS AGE
pod/ambassador-75b5688649-9rsdf 1/1 Running 2 13h
pod/ambassador-75b5688649-w489t 1/1 Running 2 13h
pod/ambassador-75b5688649-zmh7k 1/1 Running 2 13h
pod/ambassador-redis-8556cbb4c6-7kqsg 1/1 Running 0 13h
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/ambassador NodePort 10.1.165.172 <none> 80:17555/TCP,443:38895/TCP 13h
service/ambassador-admin ClusterIP 10.1.104.26 <none> 8877/TCP 13h
service/ambassador-redis ClusterIP 10.1.46.44 <none> 6379/TCP 13h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/ambassador 3/3 3 3 13h
deployment.apps/ambassador-redis 1/1 1 1 13h
NAME DESIRED CURRENT READY AGE
replicaset.apps/ambassador-75b5688649 3 3 3 13h
replicaset.apps/ambassador-redis-8556cbb4c6 1 1 1 13h
Fetch the metrics through the admin Service port 8877:
curl http://10.1.166.205:8877/metrics
# TYPE envoy_cluster_upstream_cx_connect_timeout counter
envoy_cluster_upstream_cx_connect_timeout{envoy_cluster_name="cluster_127_0_0_1_8877_ambassador"} 0
# TYPE envoy_cluster_upstream_flow_control_paused_reading_total counter
envoy_cluster_upstream_flow_control_paused_reading_total{envoy_cluster_name="cluster_127_0_0_1_8877_ambassador"} 0
# TYPE envoy_cluster_upstream_cx_close_notify counter
envoy_cluster_upstream_cx_close_notify{envoy_cluster_name="cluster_127_0_0_1_8877_ambassador"} 0
# TYPE envoy_cluster_lb_recalculate_zone_structures counter
envoy_cluster_lb_recalculate_zone_structures{envoy_cluster_name="cluster_127_0_0_1_8877_ambassador"} 0
# TYPE envoy_cluster_upstream_flow_control_resumed_reading_total counter
envoy_cluster_upstream_flow_control_resumed_reading_total{envoy_cluster_name="cluster_127_0_0_1_8877_ambassador"} 0
# TYPE envoy_cluster_upstream_rq_timeout counter
envoy_cluster_upstream_rq_timeout{envoy_cluster_name="cluster_127_0_0_1_8877_ambassador"} 0
# TYPE envoy_cluster_upstream_cx_connect_fail counter
envoy_cluster_upstream_cx_connect_fail{envoy_cluster_name="cluster_127_0_0_1_8877_ambassador"} 0
# TYPE envoy_cluster_upstream_rq_cancelled counter
envoy_cluster_upstream_rq_cancelled{envoy_cluster_name="cluster_127_0_0_1_8877_ambassador"} 0
# TYPE envoy_cluster_upstream_cx_rx_bytes_total counter
envoy_cluster_upstream_cx_rx_bytes_total{envoy_cluster_name="cluster_127_0_0_1_8877_ambassador"} 0
# TYPE envoy_cluster_upstream_cx_overflow counter
envoy_cluster_upstream_cx_overflow{envoy_cluster_name="cluster_127_0_0_1_8877_ambassador"} 0
# TYPE envoy_cluster_upstream_cx_destroy_remote counter
envoy_cluster_upstream_cx_destroy_remote{envoy_cluster_name="cluster_127_0_0_1_8877_ambassador"} 0
# TYPE envoy_cluster_upstream_cx_http2_total counter
envoy_cluster_upstream_cx_http2_total{envoy_cluster_name="cluster_127_0_0_1_8877_ambassador"} 0
......
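The scrape output above is in the Prometheus text exposition format: a # TYPE comment line per metric followed by sample lines of the form name{labels} value, where the value is the last whitespace-separated field. A quick sketch of pulling names and values out with awk (the sample lines are copied from the output above; this simple split assumes label values contain no spaces):

```shell
# Sample lines follow the pattern: metric_name{labels} value.
# Skip "#" comment lines, strip the label block, print name and value.
printf '%s\n' \
  '# TYPE envoy_cluster_upstream_rq_timeout counter' \
  'envoy_cluster_upstream_rq_timeout{envoy_cluster_name="cluster_127_0_0_1_8877_ambassador"} 0' \
  '# TYPE envoy_cluster_upstream_cx_connect_fail counter' \
  'envoy_cluster_upstream_cx_connect_fail{envoy_cluster_name="cluster_127_0_0_1_8877_ambassador"} 0' \
| awk '!/^#/ { split($1, m, "{"); print m[1], $NF }'
```

The same filter can be piped directly onto the curl command above to spot-check a counter without opening Prometheus.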
Configure Mappings for Prometheus, Grafana, and Alertmanager:
vi prometheus-mapping.yaml
---
apiVersion: getambassador.io/v2
kind: Mapping
metadata:
  name: prometheus-mapping
  namespace: ambassador
spec:
  host: prom.twingao.com:38895
  prefix: /
  service: prometheus-prometheus-oper-prometheus.monitoring:9090
---
apiVersion: getambassador.io/v2
kind: Mapping
metadata:
  name: grafana-mapping
  namespace: ambassador
spec:
  host: grafana.twingao.com:38895
  prefix: /
  service: prometheus-grafana.monitoring:80
---
apiVersion: getambassador.io/v2
kind: Mapping
metadata:
  name: alert-mapping
  namespace: ambassador
spec:
  host: alert.twingao.com:38895
  prefix: /
  service: prometheus-prometheus-oper-alertmanager.monitoring:9093
kubectl apply -f prometheus-mapping.yaml
Add hosts entries on the machine running the browser. On Windows, the file is C:\Windows\System32\drivers\etc\hosts:
# Prometheus Start
192.168.1.55 prom.twingao.com
192.168.1.55 grafana.twingao.com
192.168.1.55 alert.twingao.com
# Prometheus End
Access Prometheus at https://prom.twingao.com:38895/graph.
Access Alertmanager at https://alert.twingao.com:38895/#/alerts.
Access Grafana at https://grafana.twingao.com:38895/. The default password is prom-operator, which can be retrieved with:
helm show values stable/prometheus-operator | grep adminPassword
adminPassword: prom-operator
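The running deployment also stores this password in a Kubernetes Secret. Assuming the Secret is named prometheus-grafana with key admin-password (the exact name depends on your release; check with kubectl get secrets -n monitoring), it can be read as sketched below. Secret data is base64-encoded, so the final decode step, shown with the value above, looks like:

```shell
# Hypothetical secret name "prometheus-grafana"; adjust for your release:
#   kubectl get secret prometheus-grafana -n monitoring \
#     -o jsonpath='{.data.admin-password}' | base64 -d
# The jsonpath output is base64-encoded, so the decode step is:
echo 'cHJvbS1vcGVyYXRvcg==' | base64 -d   # prints prom-operator
```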
Grafana ships with a number of built-in dashboards.
Grafana automatically discovers Pods and monitors their CPU, memory, and network usage.
Grafana also monitors each Kubernetes node's CPU, memory, load, disk I/O, disk usage, and network.
We will scrape metrics from service/ambassador-admin, so let's look at its ports:
kubectl get service/ambassador-admin -oyaml -nambassador
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2020-04-01T13:34:28Z"
  labels:
    app.kubernetes.io/instance: ambassador
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: ambassador
    app.kubernetes.io/part-of: ambassador
    helm.sh/chart: ambassador-6.2.2
    product: aes
    service: ambassador-admin
  name: ambassador-admin
  namespace: ambassador
  resourceVersion: "74460"
  selfLink: /api/v1/namespaces/ambassador/services/ambassador-admin
  uid: 85a6c036-0e5e-4abb-bfd8-70c8956d305e
spec:
  clusterIP: 10.1.104.26
  ports:
  - name: ambassador-admin
    port: 8877
    protocol: TCP
    targetPort: admin
  selector:
    app.kubernetes.io/instance: ambassador
    app.kubernetes.io/name: ambassador
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
Inspect the Prometheus custom resource. It defines serviceMonitorSelector.matchLabels = release: prometheus, which Prometheus uses to select its ServiceMonitors:
kubectl get prometheuses.monitoring.coreos.com/prometheus-prometheus-oper-prometheus -nmonitoring -oyaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  creationTimestamp: "2020-04-01T01:43:35Z"
  generation: 1
  labels:
    app: prometheus-operator-prometheus
    chart: prometheus-operator-8.5.0
    heritage: Helm
    release: prometheus
  name: prometheus-prometheus-oper-prometheus
  namespace: monitoring
  resourceVersion: "14492"
  selfLink: /apis/monitoring.coreos.com/v1/namespaces/monitoring/prometheuses/prometheus-prometheus-oper-prometheus
  uid: 2b97c71a-5ec6-41bf-a42a-565136821ae5
spec:
  alerting:
    alertmanagers:
    - name: prometheus-prometheus-oper-alertmanager
      namespace: monitoring
      pathPrefix: /
      port: web
  baseImage: quay.io/prometheus/prometheus
  enableAdminAPI: false
  externalUrl: http://prometheus-prometheus-oper-prometheus.monitoring:9090
  listenLocal: false
  logFormat: logfmt
  logLevel: info
  paused: false
  podMonitorNamespaceSelector: {}
  podMonitorSelector:
    matchLabels:
      release: prometheus
  portName: web
  replicas: 1
  retention: 10d
  routePrefix: /
  ruleNamespaceSelector: {}
  ruleSelector:
    matchLabels:
      app: prometheus-operator
      release: prometheus
  securityContext:
    fsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  serviceAccountName: prometheus-prometheus-oper-prometheus
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector:
    matchLabels:
      release: prometheus
  version: v2.13.1
Create a ServiceMonitor. A few key points to note:
- The ServiceMonitor's name ends up in the Prometheus configuration as the job_name.
- Because the Prometheus custom resource defines serviceMonitorSelector.matchLabels = release: prometheus, the ServiceMonitor must carry the label release: prometheus for Prometheus to select it.
- The ServiceMonitor must be in the same namespace as Prometheus, here monitoring.
- endpoints.port must match the ports.name of the Service port that exposes the metrics, here ambassador-admin.
- namespaceSelector.matchNames must name the namespace of the monitored Service, here ambassador.
- selector.matchLabels must match labels that uniquely identify the monitored Service.
Create the ServiceMonitor for the ambassador-admin Service:
vi prometheus-serviceMonitorAmbassador.yaml
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: ambassador-monitor
  labels:
    release: prometheus
  namespace: monitoring
spec:
  endpoints:
  - port: ambassador-admin
  namespaceSelector:
    matchNames:
    - ambassador
  selector:
    matchLabels:
      service: ambassador-admin
kubectl apply -f prometheus-serviceMonitorAmbassador.yaml
The new target now appears on the Prometheus Targets page.
In Prometheus, the request rate (per second, averaged over a one-minute window) can be queried with rate(envoy_http_rq_total[1m]).
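Note that rate() returns a per-second rate averaged over the window, not a per-minute count. The arithmetic can be sanity-checked with illustrative numbers (not taken from this cluster):

```shell
# rate(counter[1m]) ~= (last_sample - first_sample) / window_seconds.
# If envoy_http_rq_total rose from 1200 to 1500 over a 60s window:
awk 'BEGIN { printf "%.1f req/s\n", (1500 - 1200) / 60 }'   # prints 5.0 req/s
```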
Ambassador-related dashboards can be downloaded from the Grafana dashboard catalog at https://grafana.com/grafana/dashboards; the first Ambassador template there is a good choice.
Import the dashboard into Grafana to monitor Ambassador from several angles.
Since Ambassador is built on Envoy and its metrics are pulled directly from Envoy, the Envoy dashboards can be used as well. These templates, combined with the available metrics, are a good starting point for building dashboards tailored to your own needs.