A First Look at Jaeger, a Distributed Tracing Tool

Official documentation

Jaegertracing

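Introduction to Jaeger

Jaeger is an open-source, end-to-end distributed tracing system for monitoring and troubleshooting transactions in complex distributed systems.
The figure below compares the common open-source full-link tracing solutions. SkyWalking and Pinpoint are currently the more widely used, but Jaeger offers client libraries for more languages, C++ in particular, so I chose it for this test.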


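Problems Jaeger addresses

  • Distributed transaction monitoring
  • Performance and latency optimization
  • Root cause analysis
  • Service dependency analysis
  • Distributed context propagation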

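Jaeger architecture diagram

Jaeger components

  • Jaeger Agent: communicates with the clients and forwards the trace data it collects to the Jaeger Collector
  • Jaeger Collector: writes the data it receives to a database or another storage backend
  • Jaeger Query: serves queries against the trace data
  • Jaeger Ingester: a service that reads from a Kafka topic and writes to another storage backend (Cassandra, Elasticsearch)
  • Jaeger UI: handles user interaction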

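Jaeger ports

Agent
5775 UDP: accepts zipkin.thrift over the compact Thrift protocol (Zipkin-compatible)
6831 UDP: accepts jaeger.thrift over the compact Thrift protocol
6832 UDP: accepts jaeger.thrift over the binary Thrift protocol
5778 HTTP: serves configuration such as sampling strategies

Collector
14267 TCP: used by the agent to send data in jaeger.thrift format
14250 TCP: used by the agent to send data in proto format (gRPC underneath)
14268 HTTP: accepts data directly from clients
14269 HTTP: health check

Query
16686 HTTP: the Jaeger front end, the interface exposed to users
16687 HTTP: health check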
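
As a quick sanity check on the health-check ports listed above, the sketch below builds the health-check URL for the Collector and Query and probes it with curl. The function names and the `localhost` default are illustrative assumptions; the `/` route is the one the admin server reports in the startup logs shown later in this post.

```shell
# Map a component name to its health-check port, per the port list above.
jaeger_health_port() {
  case "$1" in
    collector) echo 14269 ;;
    query)     echo 16687 ;;
    *)         echo "unknown component: $1" >&2; return 1 ;;
  esac
}

# Build the health-check URL; the host defaults to localhost (an assumption).
jaeger_health_url() {
  port=$(jaeger_health_port "$1") || return 1
  echo "http://${2:-localhost}:${port}/"
}

# Probe the endpoint; curl -f turns HTTP error statuses into a non-zero exit.
jaeger_health_check() {
  url=$(jaeger_health_url "$1" "${2:-localhost}") || return 1
  curl -fsS "$url"
}
```

For example, `jaeger_health_check collector 127.0.0.1` could be run once the port has been made reachable from your machine.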

Deploying Jaeger

1. Create the namespace

[root@VM-0-123-centos jaeger]# kubectl create namespace jaeger 

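2. Deploy the Jaeger Operator
The Jaeger Operator for Kubernetes simplifies deploying and running Jaeger on Kubernetes.
The Jaeger Operator is an implementation of a Kubernetes operator. An operator is a piece of software that eases the operational complexity of running another piece of software. More technically, an operator is a method of packaging, deploying, and managing a Kubernetes application.
Each Jaeger Operator version tracks one version of the Jaeger components (Query, Collector, Agent). When a new version of the Jaeger components is released, a new version of the operator is released that knows how to upgrade running instances of the previous version to the new one.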

[root@VM-0-123-centos jaeger]# kubectl create -n jaeger -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/crds/jaegertracing.io_jaegers_crd.yaml 
[root@VM-0-123-centos jaeger]# kubectl create -n jaeger -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/service_account.yaml
[root@VM-0-123-centos jaeger]# kubectl create -n jaeger -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/role.yaml
[root@VM-0-123-centos jaeger]# kubectl create -n jaeger -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/role_binding.yaml
[root@VM-0-123-centos jaeger]# kubectl create -n jaeger -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/operator.yaml
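
After applying the manifests above, it can help to wait until the operator Deployment is ready before creating any Jaeger instance. A minimal sketch using `kubectl rollout status`; the function name and the 120s default timeout are arbitrary choices for illustration.

```shell
# Block until the jaeger-operator Deployment has rolled out, or time out.
# $1 = timeout (default: 120s, an arbitrary value)
jaeger_operator_wait() {
  kubectl rollout status deployment/jaeger-operator -n jaeger \
    --timeout="${1:-120s}"
}
```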

Check the status

[root@VM-0-123-centos jaeger]# kubectl get all -n jaeger
NAME                                         READY   STATUS        RESTARTS   AGE
pod/jaeger-operator-6ff67bdd4b-4nffk         1/1     Running       0          14d
pod/simple-prod-collector-59fc47bf5c-h26mq   0/1     Terminating   0          9d

NAME                              TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
service/jaeger-operator-metrics   ClusterIP   172.20.253.138   <none>        8383/TCP,8686/TCP   14d

NAME                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/jaeger-operator   1/1     1            1           14d

NAME                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/jaeger-operator-6ff67bdd4b   1         1         1       14d

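3. Create the Jaeger instance
Create a jaeger.yaml file that configures the ES cluster and limits the CPU and memory available to the containers of Deployment/simple-prod-collector. The collector can scale out to at most 10 pods (maxReplicas: 10).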

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: simple-prod
spec:
  strategy: production
  storage:
    type: elasticsearch
    options:
      es:
        server-urls: http://10.0.16.3:9200
        index-prefix: zhjt
  collector:
    maxReplicas: 10
    resources:
      limits:
        cpu: 500m
        memory: 512Mi

[root@VM-0-123-centos jaeger]# kubectl apply -f jaeger.yaml -n jaeger
jaeger.jaegertracing.io/simple-prod created

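List the Jaeger objects
Note: with the all-in-one example from the official site, the status appears to be a normal Running. Here the status shows Failed, but this did not affect usage.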

[root@VM-0-123-centos jaeger]# kubectl get jaegers -n jaeger
NAME          STATUS   VERSION   STRATEGY     STORAGE         AGE
simple-prod   Failed   1.22.0    production   elasticsearch   9d
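
To find out why an instance reports Failed, one option is to look at the Jaeger resource itself and at the operator's own log. This is only a sketch that wraps two standard kubectl commands; the function name is made up for illustration, and the defaults are the instance name and namespace used in this post.

```shell
# Describe a Jaeger custom resource and show the tail of the operator log.
# $1 = instance name (default: simple-prod), $2 = namespace (default: jaeger)
jaeger_diagnose() {
  name="${1:-simple-prod}"
  ns="${2:-jaeger}"
  kubectl describe jaeger "$name" -n "$ns"
  kubectl logs deployment/jaeger-operator -n "$ns" --tail=50
}
```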

Get the pod names

[root@VM-0-123-centos jaeger]# kubectl get pods -l app.kubernetes.io/instance=simple-prod -n jaeger
NAME                                              READY   STATUS      RESTARTS   AGE
simple-prod-collector-59fc47bf5c-h26mq            1/1     Running     0          9d
simple-prod-query-85689b7bbd-g5jw9                2/2     Running     0          9d

Get the pod logs

[root@VM-0-123-centos jaeger]# kubectl  logs simple-prod-query-85689b7bbd-g5jw9 jaeger-agent  -n jaeger
2021/04/28 04:55:34 maxprocs: Leaving GOMAXPROCS=4: CPU quota undefined
{"level":"info","ts":1619585734.2081811,"caller":"flags/service.go:117","msg":"Mounting metrics handler on admin server","route":"/metrics"}
{"level":"info","ts":1619585734.2082183,"caller":"flags/service.go:123","msg":"Mounting expvar handler on admin server","route":"/debug/vars"}
{"level":"info","ts":1619585734.2083232,"caller":"flags/admin.go:105","msg":"Mounting health check on admin server","route":"/"}
{"level":"info","ts":1619585734.2083883,"caller":"flags/admin.go:111","msg":"Starting admin HTTP server","http-addr":":14271"}
{"level":"info","ts":1619585734.2084124,"caller":"flags/admin.go:97","msg":"Admin server started","http.host-port":"[::]:14271","health-status":"unavailable"}
{"level":"info","ts":1619585734.2089527,"caller":"grpc/builder.go:70","msg":"Agent requested insecure grpc connection to collector(s)"}
{"level":"info","ts":1619585734.2089992,"caller":"[email protected]/clientconn.go:243","msg":"parsed scheme: \"dns\"","system":"grpc","grpc_log":true}
{"level":"info","ts":1619585734.21038,"caller":"command-line-arguments/main.go:84","msg":"Starting agent"}
{"level":"info","ts":1619585734.2104166,"caller":"healthcheck/handler.go:128","msg":"Health Check state change","status":"ready"}
{"level":"info","ts":1619585734.2108943,"caller":"grpc/builder.go:108","msg":"Checking connection to collector"}
{"level":"info","ts":1619585734.210908,"caller":"grpc/builder.go:119","msg":"Agent collector connection state change","dialTarget":"dns:///simple-prod-collector-headless.jaeger.svc:14250","status":"IDLE"}
{"level":"info","ts":1619585734.211061,"caller":"app/agent.go:69","msg":"Starting jaeger-agent HTTP server","http-port":5778}
{"level":"info","ts":1619585734.3344934,"caller":"[email protected]/resolver_conn_wrapper.go:143","msg":"ccResolverWrapper: sending update to cc: {[{172.20.0.88:14250  <nil> 0 <nil>}] <nil> <nil>}","system":"grpc","grpc_log":true}
{"level":"info","ts":1619585734.3345578,"caller":"[email protected]/clientconn.go:667","msg":"ClientConn switching balancer to \"round_robin\"","system":"grpc","grpc_log":true}
{"level":"info","ts":1619585734.3345697,"caller":"[email protected]/clientconn.go:682","msg":"Channel switches to new LB policy \"round_robin\"","system":"grpc","grpc_log":true}
{"level":"info","ts":1619585734.3346283,"caller":"[email protected]/clientconn.go:1056","msg":"Subchannel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}
{"level":"info","ts":1619585734.33467,"caller":"[email protected]/clientconn.go:1193","msg":"Subchannel picks a new address \"172.20.0.88:14250\" to connect","system":"grpc","grpc_log":true}
{"level":"info","ts":1619585734.334736,"caller":"[email protected]/clientconn.go:417","msg":"Channel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}
{"level":"info","ts":1619585734.3347983,"caller":"grpc/builder.go:119","msg":"Agent collector connection state change","dialTarget":"dns:///simple-prod-collector-headless.jaeger.svc:14250","status":"CONNECTING"}
{"level":"info","ts":1619585734.335669,"caller":"[email protected]/clientconn.go:1056","msg":"Subchannel Connectivity change to READY","system":"grpc","grpc_log":true}
{"level":"info","ts":1619585734.3357751,"caller":"base/balancer.go:200","msg":"roundrobinPicker: newPicker called with info: {map[0xc0002f5ea0:{{172.20.0.88:14250  <nil> 0 <nil>}}]}","system":"grpc","grpc_log":true}
{"level":"info","ts":1619585734.3357947,"caller":"[email protected]/clientconn.go:417","msg":"Channel Connectivity change to READY","system":"grpc","grpc_log":true}
{"level":"info","ts":1619585734.335807,"caller":"grpc/builder.go:119","msg":"Agent collector connection state change","dialTarget":"dns:///simple-prod-collector-headless.jaeger.svc:14250","status":"READY"}
{"level":"info","ts":1619592172.4516647,"caller":"[email protected]/clientconn.go:1056","msg":"Subchannel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.4517512,"caller":"[email protected]/clientconn.go:1193","msg":"Subchannel picks a new address \"172.20.0.88:14250\" to connect","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.4517596,"caller":"base/balancer.go:200","msg":"roundrobinPicker: newPicker called with info: {map[]}","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.4517772,"caller":"[email protected]/clientconn.go:417","msg":"Channel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.4517884,"caller":"grpc/builder.go:119","msg":"Agent collector connection state change","dialTarget":"dns:///simple-prod-collector-headless.jaeger.svc:14250","status":"CONNECTING"}
{"level":"warn","ts":1619592172.4523218,"caller":"[email protected]/clientconn.go:1275","msg":"grpc: addrConn.createTransport failed to connect to {172.20.0.88:14250  <nil> 0 <nil>}. Err: connection error: desc = \"transport: Error while dialing dial tcp 172.20.0.88:14250: connect: connection refused\". Reconnecting...","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.4523551,"caller":"[email protected]/clientconn.go:1056","msg":"Subchannel Connectivity change to TRANSIENT_FAILURE","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.452386,"caller":"[email protected]/clientconn.go:417","msg":"Channel Connectivity change to TRANSIENT_FAILURE","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.4523947,"caller":"grpc/builder.go:119","msg":"Agent collector connection state change","dialTarget":"dns:///simple-prod-collector-headless.jaeger.svc:14250","status":"TRANSIENT_FAILURE"}
{"level":"info","ts":1619592172.6118224,"caller":"[email protected]/resolver_conn_wrapper.go:143","msg":"ccResolverWrapper: sending update to cc: {[{172.20.0.178:14250  <nil> 0 <nil>}] <nil> <nil>}","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.6118581,"caller":"[email protected]/clientconn.go:1056","msg":"Subchannel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.6118758,"caller":"[email protected]/clientconn.go:1056","msg":"Subchannel Connectivity change to SHUTDOWN","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.611892,"caller":"[email protected]/clientconn.go:417","msg":"Channel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.6119003,"caller":"grpc/builder.go:119","msg":"Agent collector connection state change","dialTarget":"dns:///simple-prod-collector-headless.jaeger.svc:14250","status":"CONNECTING"}
{"level":"info","ts":1619592172.6119049,"caller":"[email protected]/clientconn.go:1193","msg":"Subchannel picks a new address \"172.20.0.178:14250\" to connect","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.612726,"caller":"[email protected]/clientconn.go:1056","msg":"Subchannel Connectivity change to READY","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.6127572,"caller":"base/balancer.go:200","msg":"roundrobinPicker: newPicker called with info: {map[0xc0003df970:{{172.20.0.178:14250  <nil> 0 <nil>}}]}","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.6127682,"caller":"[email protected]/clientconn.go:417","msg":"Channel Connectivity change to READY","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.6127849,"caller":"grpc/builder.go:119","msg":"Agent collector connection state change","dialTarget":"dns:///simple-prod-collector-headless.jaeger.svc:14250","status":"READY"}
[root@VM-0-123-centos jaeger]# kubectl  logs simple-prod-query-85689b7bbd-g5jw9 jaeger-query   -n jaeger
2021/04/28 04:55:29 maxprocs: Leaving GOMAXPROCS=4: CPU quota undefined
{"level":"info","ts":1619585729.8951077,"caller":"flags/service.go:117","msg":"Mounting metrics handler on admin server","route":"/metrics"}
{"level":"info","ts":1619585729.8951416,"caller":"flags/service.go:123","msg":"Mounting expvar handler on admin server","route":"/debug/vars"}
{"level":"info","ts":1619585729.8952546,"caller":"flags/admin.go:105","msg":"Mounting health check on admin server","route":"/"}
{"level":"info","ts":1619585729.8953054,"caller":"flags/admin.go:111","msg":"Starting admin HTTP server","http-addr":":16687"}
{"level":"info","ts":1619585729.8953238,"caller":"flags/admin.go:97","msg":"Admin server started","http.host-port":"[::]:16687","health-status":"unavailable"}
{"level":"info","ts":1619585729.9169888,"caller":"config/config.go:183","msg":"Elasticsearch detected","version":7}
{"level":"info","ts":1619585729.9174955,"caller":"app/static_handler.go:181","msg":"UI config path not provided, config file will not be watched"}
{"level":"info","ts":1619585729.9175768,"caller":"app/server.go:170","msg":"Query server started"}
{"level":"info","ts":1619585729.9175944,"caller":"healthcheck/handler.go:128","msg":"Health Check state change","status":"ready"}
{"level":"info","ts":1619585729.9176183,"caller":"app/server.go:249","msg":"Starting GRPC server","port":16685,"addr":":16685"}
{"level":"info","ts":1619585729.9176335,"caller":"app/server.go:230","msg":"Starting HTTP server","port":16686,"addr":":16686"}
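
The jaeger-agent log earlier in this step is easier to follow when reduced to the collector connection state changes. A small sketch using grep and sed on the JSON log lines; the function name is hypothetical, and the patterns assume the message text and the `ts` and `status` fields exactly as they appear in the log above.

```shell
# Read agent log lines on stdin and print "timestamp status" for each
# "Agent collector connection state change" entry.
agent_collector_states() {
  grep '"msg":"Agent collector connection state change"' |
    sed -n 's/.*"ts":\([0-9.]*\),.*"status":"\([A-Z_]*\)".*/\1 \2/p'
}
```

Usage: `kubectl logs simple-prod-query-85689b7bbd-g5jw9 jaeger-agent -n jaeger | agent_collector_states`. For the log shown above, the states run IDLE, CONNECTING, READY, then CONNECTING, TRANSIENT_FAILURE, CONNECTING, READY, which is consistent with the agent reconnecting after the collector address changed from 172.20.0.88 to 172.20.0.178.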

4. View the Jaeger resources

[root@VM-0-123-centos jaeger]# kubectl get all -n jaeger
NAME                                                  READY   STATUS      RESTARTS   AGE
pod/jaeger-operator-6ff67bdd4b-4nffk                  1/1     Running     0          14d
pod/simple-prod-collector-59fc47bf5c-h26mq            1/1     Running     0          8d
pod/simple-prod-query-85689b7bbd-g5jw9                2/2     Running     0          8d
 
NAME                                     TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                                  AGE
service/jaeger-operator-metrics          ClusterIP   172.20.253.138   <none>        8383/TCP,8686/TCP                        14d
service/simple-prod-collector            ClusterIP   172.20.255.184   <none>        9411/TCP,14250/TCP,14267/TCP,14268/TCP   8d
service/simple-prod-collector-headless   ClusterIP   None             <none>        9411/TCP,14250/TCP,14267/TCP,14268/TCP   8d
service/simple-prod-query                ClusterIP   172.20.254.102   <none>        16686/TCP                                8d
 
NAME                                    READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/jaeger-operator         1/1     1            1           14d
deployment.apps/simple-prod-collector   1/1     1            1           8d
deployment.apps/simple-prod-query       1/1     1            1           8d
 
NAME                                               DESIRED   CURRENT   READY   AGE
replicaset.apps/jaeger-operator-6ff67bdd4b         1         1         1       14d
replicaset.apps/simple-prod-collector-59fc47bf5c   1         1         1       8d
replicaset.apps/simple-prod-query-85689b7bbd       1         1         1       8d
 
NAME                                                        REFERENCE                          TARGETS             MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/simple-prod-collector   Deployment/simple-prod-collector   1457m/90, 137m/90   1         10        1          8d
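
The simple-prod-query Service in the listing above is of type ClusterIP with no external IP, so the UI on port 16686 is only reachable inside the cluster. One way to open it for a quick look is `kubectl port-forward`, sketched below; the function name and the local port default are illustrative choices.

```shell
# Forward a local port to the Jaeger Query service so the UI can be opened
# at http://localhost:<local-port>.
# $1 = local port (default: 16686)
jaeger_ui_forward() {
  kubectl port-forward -n jaeger service/simple-prod-query "${1:-16686}:16686"
}
```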

If traffic is heavy and you need to reduce the load on ES, you can put a Kafka cluster in front of it by modifying jaeger.yaml:

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: simple-streaming
spec:
  strategy: streaming
  collector:
    options:
      kafka:
        producer:
          topic: jaeger-spans
          brokers: my-cluster-kafka-brokers.kafka:9092   # change to your Kafka address
  ingester:
    options:
      kafka:
        consumer:
          topic: jaeger-spans
          brokers: my-cluster-kafka-brokers.kafka:9092  # change to your Kafka address
      ingester:
        deadlockInterval: 5s
  storage:
    type: elasticsearch
    options:
      es:
        server-urls: http://elasticsearch:9200   # change to your ES address
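
Note that the manifest above uses the name simple-streaming, so applying it creates a second instance alongside simple-prod rather than updating it. A sketch of applying the file and listing the resulting pods, reusing the label selector from step 3; the function name is hypothetical and the defaults are the file and instance names used in this post.

```shell
# Apply a Jaeger manifest and list the pods that belong to the instance.
# $1 = manifest file (default: jaeger.yaml)
# $2 = instance name (default: simple-streaming)
jaeger_apply_and_list() {
  file="${1:-jaeger.yaml}"
  instance="${2:-simple-streaming}"
  kubectl apply -f "$file" -n jaeger &&
    kubectl get pods -l "app.kubernetes.io/instance=${instance}" -n jaeger
}
```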

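5. Deploying the agent

The agent is a proxy for the Jaeger client: the client sends the trace data it collects to the agent, and the agent forwards it to the collector. Because the client talks to the agent over UDP, the agent is usually deployed close to the client.

The agent can be installed in several ways.

1). Docker installation

Download: jaegertracing/jaeger-agent Tags (docker.com)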

docker run -d -p 6831:6831/udp -p 6832:6832/udp -p 5778:5778/tcp jaegertracing/jaeger-agent:1.12 --reporter.grpc.host-port=xx.xx.xx.xx:14250

2). Kubernetes installation, which comes in two flavors

Sidecar mode

DaemonSet mode

Reference: Operator for Kubernetes, Jaeger documentation (jaegertracing.io)
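
For the sidecar mode, the operator injects an agent container into Deployments that carry the `sidecar.jaegertracing.io/inject` annotation, whose value is either "true" or the name of a Jaeger instance. A minimal sketch of setting it with kubectl; the function name is made up, and the deployment name and namespace are whatever your application uses.

```shell
# Ask the Jaeger Operator to inject an agent sidecar into a Deployment.
# $1 = deployment name (hypothetical, e.g. myapp), $2 = its namespace,
# $3 = annotation value: "true" or a Jaeger instance name (default: true)
jaeger_inject_sidecar() {
  kubectl annotate deployment "$1" -n "$2" \
    "sidecar.jaegertracing.io/inject=${3:-true}"
}
```

For example, `jaeger_inject_sidecar myapp default simple-prod` would request a sidecar that reports to the simple-prod instance created earlier.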

3). Binary installation

Download: Jaeger – Download Jaeger (jaegertracing.io)

nohup ./jaeger-agent --collector.host-port=xxxx:14267 1>1.log 2>2.log &
