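This article covers basic Kubernetes (k8s) troubleshooting. It stays fairly shallow, since I only recently started with k8s. It covers three things: viewing the current runtime information of k8s objects, diagnosing problems with services and containers, and tracking down more complex problems such as pod scheduling failures.
1. Check the system Events
When a resource object (pod, service, RC, node, namespace, deployment, and so on) misbehaves, for example a pod that never starts running after it is created, the first thing to look at is the object's current runtime information, especially the Events associated with it. Each event records the related subject, when it first occurred, when it last occurred, how many times it occurred, and the reason.
k8s provides the following commands for viewing an object's runtime state: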
kubectl describe pod xxxx
kubectl describe node xxxx
The output looks like this:
[root@centos ~]# kubectl get pod
NAME READY STATUS RESTARTS AGE
curl-5f8bff6547-rb4qk 1/1 Running 2 3d14h
redis-master-7j8cm 1/1 Running 2 3d14h
webapp-j7gd2 1/1 Running 3 3d21h
webapp-kzrn7 1/1 Running 3 3d14h
[root@centos ~]# kubectl describe pod webapp-j7gd2
Name: webapp-j7gd2
Namespace: default
Priority: 0
PriorityClassName: <none>
Node: node3/192.168.195.138
Start Time: Mon, 08 Apr 2019 13:19:25 +0800
Labels: app=webapp
Annotations: <none>
Status: Running
IP: 10.244.1.35
Controlled By: ReplicationController/webapp
Containers:
webapp:
Container ID: docker://e4dd5ec51e4d05456bd1605459a252085ad092c6be26e2becd5301114a470a33
Image: tomcat:9-jre8-alpine
Image ID: docker-pullable://tomcat@sha256:67fc2a0a54f9dfa7abda85a2900d721a55115dcae8ca7da560e65d15ca4c8aa7
Port: 8080/TCP
Host Port: 0/TCP
State: Running
Started: Thu, 11 Apr 2019 09:26:42 +0800
Last State: Terminated
Reason: Error
Exit Code: 255
Started: Mon, 08 Apr 2019 21:52:27 +0800
Finished: Thu, 11 Apr 2019 09:25:55 +0800
Ready: True
Restart Count: 3
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-nx72w (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
default-token-nx72w:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-nx72w
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events: <none>
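The last line, Events, is the important part. This pod is healthy, so there is nothing to show. If your pod has a problem, the error messages appear here. They are in English and usually make the cause obvious: typical cases are an image that cannot be pulled or no node being available for scheduling. If your pod lives in a namespace other than default, specify the namespace with the following command:
kubectl describe pod xxx -n <your-namespace>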
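If you only want the Events and not the rest of the describe output, kubectl can also list them directly. The following is a minimal sketch, not the only way to do it: the function name pod_events is my own invention, and it assumes the events still exist in the cluster (they are kept for a limited time only).

```shell
# Show the Events recorded for one pod, sorted by last occurrence.
# Usage: pod_events <pod-name> [namespace]
# pod_events is a hypothetical helper name; the kubectl flags are standard.
pod_events() {
  kubectl get events -n "${2:-default}" \
    --field-selector "involvedObject.kind=Pod,involvedObject.name=$1" \
    --sort-by=.lastTimestamp
}

# Example (pod name taken from the output above):
#   pod_events webapp-j7gd2
#   pod_events webapp-j7gd2 my-namespace
```

The namespace defaults to default when the second argument is omitted, which matches what kubectl describe does without -n.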
2. Check the container logs
To inspect the logs written by the application inside a container, use the kubectl logs <pod-name> command, for example:
[root@centos ~]# kubectl get pod
NAME READY STATUS RESTARTS AGE
curl-5f8bff6547-rb4qk 1/1 Running 2 3d14h
redis-master-7j8cm 1/1 Running 2 3d14h
webapp-j7gd2 1/1 Running 3 3d21h
webapp-kzrn7 1/1 Running 3 3d14h
[root@centos ~]# kubectl logs webapp-j7gd2
11-Apr-2019 01:26:45.108 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Server version name: Apache Tomcat/9.0.17
11-Apr-2019 01:26:45.145 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Server built: Mar 13 2019 15:55:27 UTC
11-Apr-2019 01:26:45.146 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Server version number: 9.0.17.0
11-Apr-2019 01:26:45.146 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log OS Name: Linux
11-Apr-2019 01:26:45.146 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log OS Version: 3.10.0-957.el7.x86_64
11-Apr-2019 01:26:45.146 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Architecture: amd64
11-Apr-2019 01:26:45.146 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Java Home: /usr/lib/jvm/java-1.8-openjdk/jre
11-Apr-2019 01:26:45.147 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log JVM Version: 1.8.0_201-b08
11-Apr-2019 01:26:45.147 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log JVM Vendor: Oracle Corporation
11-Apr-2019 01:26:45.147 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log CATALINA_BASE: /usr/local/tomcat
11-Apr-2019 01:26:45.147 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log CATALINA_HOME: /usr/local/tomcat
11-Apr-2019 01:26:45.148 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.util.logging.config.file=/usr/local/tomcat/conf/logging.properties
11-Apr-2019 01:26:45.148 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
11-Apr-2019 01:26:45.148 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djdk.tls.ephemeralDHKeySize=2048
11-Apr-2019 01:26:45.149 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.protocol.handler.pkgs=org.apache.catalina.webresources
11-Apr-2019 01:26:45.149 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dorg.apache.catalina.security.SecurityListener.UMASK=0027
11-Apr-2019 01:26:45.150 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dignore.endorsed.dirs=
11-Apr-2019 01:26:45.150 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dcatalina.base=/usr/local/tomcat
11-Apr-2019 01:26:45.150 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dcatalina.home=/usr/local/tomcat
11-Apr-2019 01:26:45.150 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.io.tmpdir=/usr/local/tomcat/temp
11-Apr-2019 01:26:45.151 INFO [main] org.apache.catalina.core.AprLifecycleListener.lifecycleEvent Loaded APR based Apache Tomcat Native library [1.2.21] using APR version [1.6.5].
11-Apr-2019 01:26:45.151 INFO [main] org.apache.catalina.core.AprLifecycleListener.lifecycleEvent APR capabilities: IPv6 [true], sendfile [true], accept filters [false], random [true].
11-Apr-2019 01:26:45.151 INFO [main] org.apache.catalina.core.AprLifecycleListener.lifecycleEvent APR/OpenSSL configuration: useAprConnector [false], useOpenSSL [true]
11-Apr-2019 01:26:45.160 INFO [main] org.apache.catalina.core.AprLifecycleListener.initializeSSL OpenSSL successfully initialized [OpenSSL 1.1.1b 26 Feb 2019]
11-Apr-2019 01:26:45.606 INFO [main] org.apache.coyote.AbstractProtocol.init Initializing ProtocolHandler ["http-nio-8080"]
11-Apr-2019 01:26:45.678 INFO [main] org.apache.coyote.AbstractProtocol.init Initializing ProtocolHandler ["ajp-nio-8009"]
11-Apr-2019 01:26:45.689 INFO [main] org.apache.catalina.startup.Catalina.load Server initialization in [2,071] milliseconds
11-Apr-2019 01:26:45.755 INFO [main] org.apache.catalina.core.StandardService.startInternal Starting service [Catalina]
11-Apr-2019 01:26:45.755 INFO [main] org.apache.catalina.core.StandardEngine.startInternal Starting Servlet engine: [Apache Tomcat/9.0.17]
11-Apr-2019 01:26:45.777 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/usr/local/tomcat/webapps/ROOT]
11-Apr-2019 01:26:46.985 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/usr/local/tomcat/webapps/ROOT] has finished in [1,202] ms
11-Apr-2019 01:26:46.986 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/usr/local/tomcat/webapps/docs]
11-Apr-2019 01:26:47.071 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/usr/local/tomcat/webapps/docs] has finished in [86] ms
11-Apr-2019 01:26:47.080 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/usr/local/tomcat/webapps/examples]
11-Apr-2019 01:26:48.100 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/usr/local/tomcat/webapps/examples] has finished in [1,020] ms
11-Apr-2019 01:26:48.104 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/usr/local/tomcat/webapps/host-manager]
11-Apr-2019 01:26:48.169 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/usr/local/tomcat/webapps/host-manager] has finished in [65] ms
11-Apr-2019 01:26:48.169 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/usr/local/tomcat/webapps/manager]
11-Apr-2019 01:26:48.227 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/usr/local/tomcat/webapps/manager] has finished in [58] ms
11-Apr-2019 01:26:48.235 INFO [main] org.apache.coyote.AbstractProtocol.start Starting ProtocolHandler ["http-nio-8080"]
11-Apr-2019 01:26:48.302 INFO [main] org.apache.coyote.AbstractProtocol.start Starting ProtocolHandler ["ajp-nio-8009"]
11-Apr-2019 01:26:48.323 INFO [main] org.apache.catalina.startup.Catalina.start Server startup in [2,633] milliseconds
If a pod contains more than one container, use the -c flag to name the container whose logs you want, for example:
kubectl logs <pod_name> -c <container_name>
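The webapp-j7gd2 pod above shows Restart Count 3 and a Last State of Terminated with exit code 255. In a case like that, the log of the crashed instance is often more useful than the log of the current one. Here is a small sketch that wraps both cases. The function name container_logs and the 100-line limit are my own choices; --previous and --tail are regular kubectl logs flags.

```shell
# Print the last 100 log lines of one container in a pod.
# Pass "previous" as the third argument to read the log of the
# previous (terminated) instance of the container instead.
# Usage: container_logs <pod> <container> [previous]
# container_logs is a hypothetical helper name.
container_logs() {
  if [ "${3:-}" = "previous" ]; then
    kubectl logs "$1" -c "$2" --tail=100 --previous
  else
    kubectl logs "$1" -c "$2" --tail=100
  fi
}

# Example:
#   container_logs webapp-j7gd2 webapp
#   container_logs webapp-j7gd2 webapp previous
```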
You can also run docker logs <container_id> directly on the node where the container is running:
[root@node2 ~]# docker ps | grep web
6041a63c30ea 6097ab3c4283 "catalina.sh run" 25 hours ago Up 25 hours k8s_webapp_webapp-kzrn7_default_7c476613-59f4-11e9-9a41-000c29f1f0e4_3
974390ced06b k8s.gcr.io/pause:3.1 "/pause" 25 hours ago Up 25 hours k8s_POD_webapp-kzrn7_default_7c476613-59f4-11e9-9a41-000c29f1f0e4_7
[root@node2 ~]# docker logs 6041a63c30ea
11-Apr-2019 01:26:33.432 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Server version name: Apache Tomcat/9.0.17
11-Apr-2019 01:26:33.526 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Server built: Mar 13 2019 15:55:27 UTC
11-Apr-2019 01:26:33.526 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Server version number: 9.0.17.0
11-Apr-2019 01:26:33.526 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log OS Name: Linux
11-Apr-2019 01:26:33.527 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log OS Version: 3.10.0-957.el7.x86_64
11-Apr-2019 01:26:33.527 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Architecture: amd64
11-Apr-2019 01:26:33.527 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Java Home: /usr/lib/jvm/java-1.8-openjdk/jre
11-Apr-2019 01:26:33.527 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log JVM Version: 1.8.0_201-b08
11-Apr-2019 01:26:33.528 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log JVM Vendor: Oracle Corporation
11-Apr-2019 01:26:33.528 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log CATALINA_BASE: /usr/local/tomcat
11-Apr-2019 01:26:33.528 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log CATALINA_HOME: /usr/local/tomcat
11-Apr-2019 01:26:33.529 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.util.logging.config.file=/usr/local/tomcat/conf/logging.properties
11-Apr-2019 01:26:33.529 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
11-Apr-2019 01:26:33.529 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djdk.tls.ephemeralDHKeySize=2048
11-Apr-2019 01:26:33.529 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.protocol.handler.pkgs=org.apache.catalina.webresources
11-Apr-2019 01:26:33.530 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dorg.apache.catalina.security.SecurityListener.UMASK=0027
11-Apr-2019 01:26:33.530 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dignore.endorsed.dirs=
11-Apr-2019 01:26:33.530 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dcatalina.base=/usr/local/tomcat
11-Apr-2019 01:26:33.530 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dcatalina.home=/usr/local/tomcat
11-Apr-2019 01:26:33.530 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.io.tmpdir=/usr/local/tomcat/temp
11-Apr-2019 01:26:33.531 INFO [main] org.apache.catalina.core.AprLifecycleListener.lifecycleEvent Loaded APR based Apache Tomcat Native library [1.2.21] using APR version [1.6.5].
11-Apr-2019 01:26:33.539 INFO [main] org.apache.catalina.core.AprLifecycleListener.lifecycleEvent APR capabilities: IPv6 [true], sendfile [true], accept filters [false], random [true].
11-Apr-2019 01:26:33.540 INFO [main] org.apache.catalina.core.AprLifecycleListener.lifecycleEvent APR/OpenSSL configuration: useAprConnector [false], useOpenSSL [true]
11-Apr-2019 01:26:33.565 INFO [main] org.apache.catalina.core.AprLifecycleListener.initializeSSL OpenSSL successfully initialized [OpenSSL 1.1.1b 26 Feb 2019]
11-Apr-2019 01:26:34.291 INFO [main] org.apache.coyote.AbstractProtocol.init Initializing ProtocolHandler ["http-nio-8080"]
11-Apr-2019 01:26:34.374 INFO [main] org.apache.coyote.AbstractProtocol.init Initializing ProtocolHandler ["ajp-nio-8009"]
11-Apr-2019 01:26:34.378 INFO [main] org.apache.catalina.startup.Catalina.load Server initialization in [3,215] milliseconds
11-Apr-2019 01:26:34.467 INFO [main] org.apache.catalina.core.StandardService.startInternal Starting service [Catalina]
11-Apr-2019 01:26:34.468 INFO [main] org.apache.catalina.core.StandardEngine.startInternal Starting Servlet engine: [Apache Tomcat/9.0.17]
11-Apr-2019 01:26:34.507 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/usr/local/tomcat/webapps/ROOT]
11-Apr-2019 01:26:36.293 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/usr/local/tomcat/webapps/ROOT] has finished in [1,786] ms
11-Apr-2019 01:26:36.294 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/usr/local/tomcat/webapps/docs]
11-Apr-2019 01:26:36.368 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/usr/local/tomcat/webapps/docs] has finished in [73] ms
11-Apr-2019 01:26:36.377 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/usr/local/tomcat/webapps/examples]
11-Apr-2019 01:26:37.797 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/usr/local/tomcat/webapps/examples] has finished in [1,420] ms
11-Apr-2019 01:26:37.802 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/usr/local/tomcat/webapps/host-manager]
11-Apr-2019 01:26:38.031 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/usr/local/tomcat/webapps/host-manager] has finished in [228] ms
11-Apr-2019 01:26:38.032 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/usr/local/tomcat/webapps/manager]
11-Apr-2019 01:26:38.161 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/usr/local/tomcat/webapps/manager] has finished in [128] ms
11-Apr-2019 01:26:38.183 INFO [main] org.apache.coyote.AbstractProtocol.start Starting ProtocolHandler ["http-nio-8080"]
11-Apr-2019 01:26:38.244 INFO [main] org.apache.coyote.AbstractProtocol.start Starting ProtocolHandler ["ajp-nio-8009"]
11-Apr-2019 01:26:38.290 INFO [main] org.apache.catalina.startup.Catalina.start Server startup in [3,911] milliseconds
3. Check the k8s service logs
If k8s is installed on Linux and its services are managed by systemd, the systemd journal captures the services' output. You can view the service logs with systemctl status or journalctl:
[root@node2 ~]# systemctl status kubelet.service
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Thu 2019-04-11 09:25:36 CST; 1 day 1h ago
Docs: https://kubernetes.io/docs/
Main PID: 7793 (kubelet)
Tasks: 19
Memory: 112.4M
CGroup: /system.slice/kubelet.service
└─7793 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/v...
Apr 12 09:56:44 node2 kubelet[7793]: W0412 09:56:44.886746 7793 reflector.go:270] object-"kube-system"/"kube-proxy": watch of *v1.ConfigMa... (562273)
Apr 12 09:57:46 node2 kubelet[7793]: W0412 09:57:46.933029 7793 reflector.go:270] object-"kube-system"/"kube-flannel-cfg": watch of *v1.Co... (562359)
Apr 12 10:04:45 node2 kubelet[7793]: W0412 10:04:45.828641 7793 reflector.go:270] object-"kube-system"/"coredns": watch of *v1.ConfigMap e... (562964)
Apr 12 10:11:04 node2 kubelet[7793]: W0412 10:11:04.635497 7793 reflector.go:270] object-"kube-system"/"kube-flannel-cfg": watch of *v1.Co... (563510)
Apr 12 10:12:23 node2 kubelet[7793]: W0412 10:12:23.593624 7793 reflector.go:270] object-"kube-system"/"kube-proxy": watch of *v1.ConfigMa... (563619)
Apr 12 10:24:09 node2 kubelet[7793]: W0412 10:24:09.875061 7793 reflector.go:270] object-"kube-system"/"coredns": watch of *v1.ConfigMap e... (564637)
Apr 12 10:26:55 node2 kubelet[7793]: W0412 10:26:55.642788 7793 reflector.go:270] object-"kube-system"/"kube-proxy": watch of *v1.ConfigMa... (564886)
Apr 12 10:28:14 node2 kubelet[7793]: W0412 10:28:14.693489 7793 reflector.go:270] object-"kube-system"/"kube-flannel-cfg": watch of *v1.Co... (564992)
Apr 12 10:43:12 node2 kubelet[7793]: W0412 10:43:12.893306 7793 reflector.go:270] object-"kube-system"/"coredns": watch of *v1.ConfigMap e... (566287)
Apr 12 10:43:37 node2 kubelet[7793]: W0412 10:43:37.662130 7793 reflector.go:270] object-"kube-system"/"kube-proxy": watch of *v1.ConfigMa... (566320)
Hint: Some lines were ellipsized, use -l to show in full.
Or:
[root@centos ~]# journalctl -xeu kubelet
Apr 12 10:46:53 centos.master kubelet[9787]: E0412 10:46:53.510165 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:53 centos.master kubelet[9787]: E0412 10:46:53.610691 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:53 centos.master kubelet[9787]: E0412 10:46:53.711008 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:53 centos.master kubelet[9787]: E0412 10:46:53.811468 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:53 centos.master kubelet[9787]: I0412 10:46:53.883382 9787 kubelet_node_status.go:278] Setting node annotation to enable volume controlle
Apr 12 10:46:53 centos.master kubelet[9787]: E0412 10:46:53.912065 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:53 centos.master kubelet[9787]: I0412 10:46:53.914043 9787 kubelet_node_status.go:72] Attempting to register node centos.master
Apr 12 10:46:53 centos.master kubelet[9787]: E0412 10:46:53.916659 9787 kubelet_node_status.go:94] Unable to register node "centos.master" with API se
Apr 12 10:46:54 centos.master kubelet[9787]: E0412 10:46:54.012363 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:54 centos.master kubelet[9787]: E0412 10:46:54.113003 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:54 centos.master kubelet[9787]: I0412 10:46:54.147210 9787 kubelet_node_status.go:278] Setting node annotation to enable volume controlle
Apr 12 10:46:54 centos.master kubelet[9787]: E0412 10:46:54.213291 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:54 centos.master kubelet[9787]: E0412 10:46:54.313616 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:54 centos.master kubelet[9787]: E0412 10:46:54.413970 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:54 centos.master kubelet[9787]: E0412 10:46:54.514292 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:54 centos.master kubelet[9787]: E0412 10:46:54.615167 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:54 centos.master kubelet[9787]: E0412 10:46:54.715863 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:54 centos.master kubelet[9787]: E0412 10:46:54.816154 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:54 centos.master kubelet[9787]: E0412 10:46:54.916432 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:55 centos.master kubelet[9787]: E0412 10:46:55.017040 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:55 centos.master kubelet[9787]: E0412 10:46:55.117863 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:55 centos.master kubelet[9787]: E0412 10:46:55.218694 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:55 centos.master kubelet[9787]: E0412 10:46:55.319663 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:55 centos.master kubelet[9787]: E0412 10:46:55.420254 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:55 centos.master kubelet[9787]: E0412 10:46:55.521053 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:55 centos.master kubelet[9787]: E0412 10:46:55.621575 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:55 centos.master kubelet[9787]: E0412 10:46:55.722435 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:55 centos.master kubelet[9787]: E0412 10:46:55.823464 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:55 centos.master kubelet[9787]: E0412 10:46:55.924273 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:56 centos.master kubelet[9787]: E0412 10:46:56.024392 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:56 centos.master kubelet[9787]: E0412 10:46:56.125129 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:56 centos.master kubelet[9787]: I0412 10:46:56.146767 9787 kubelet_node_status.go:278] Setting node annotation to enable volume controlle
Apr 12 10:46:56 centos.master kubelet[9787]: E0412 10:46:56.225839 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:56 centos.master kubelet[9787]: E0412 10:46:56.326354 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:56 centos.master kubelet[9787]: E0412 10:46:56.427552 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:56 centos.master kubelet[9787]: E0412 10:46:56.528289 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:56 centos.master kubelet[9787]: E0412 10:46:56.628843 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:56 centos.master kubelet[9787]: E0412 10:46:56.729056 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:56 centos.master kubelet[9787]: E0412 10:46:56.829340 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:56 centos.master kubelet[9787]: E0412 10:46:56.929690 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:57 centos.master kubelet[9787]: E0412 10:46:57.030373 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:57 centos.master kubelet[9787]: E0412 10:46:57.131158 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:57 centos.master kubelet[9787]: E0412 10:46:57.232373 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:57 centos.master kubelet[9787]: E0412 10:46:57.333084 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:57 centos.master kubelet[9787]: E0412 10:46:57.433269 9787 kubelet.go:2266] node "centos.master" not found
The kubelet service log above tells me that the node centos.master cannot be found.
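When the same error repeats every 100 ms as it does above, it helps to collapse the output before reading it. The sketch below counts identical kubelet error messages. It assumes the log lines look like the ones shown above (a journal prefix ending in kubelet[PID]: followed by a klog header such as E0412 ... kubelet.go:2266]); the function name summarize_kubelet_errors is made up for this example.

```shell
# Read journalctl-style kubelet lines on stdin, keep only the
# error-level ones (klog severity "E"), strip the timestamp and
# source-location prefix, and print each distinct message with a count.
# summarize_kubelet_errors is a hypothetical helper name.
summarize_kubelet_errors() {
  grep -E 'kubelet\[[0-9]+\]: E[0-9]{4} ' \
    | sed -E 's/^[^]]*\]: [^]]*\] //' \
    | sort | uniq -c | sort -rn
}

# Example, run on the node in question:
#   journalctl -u kubelet --since "10 minutes ago" --no-pager \
#     | summarize_kubelet_errors
```

The most frequent message ends up on the first line, so a flood like the one above shrinks to a single entry with its count.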
That covers the three basic techniques. They are simple and only good for basic troubleshooting.
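When you turn to the system service logs because a particular k8s object has a problem, you can search the logs using the object's name as the keyword. Most of the problems we run into in practice involve pods: a pod cannot be created, a pod stops right after starting, or the number of pod replicas cannot be increased. In these cases, first work out which node the pod is on, then log in to that node and pull the pod's complete log entries out of the kubelet log. For problems related to pod scaling or to an RC, the key clue is more likely to be in the logs of kube-controller-manager and kube-scheduler.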
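The "find the node, then search the kubelet log" routine can be sketched roughly like this. Both function names are hypothetical, and the sketch assumes the kubelet on that node runs under systemd as in section 3.

```shell
# Print the name of the node a pod is scheduled on.
# Usage: pod_node <pod-name> [namespace]
# pod_node is a hypothetical helper name.
pod_node() {
  kubectl get pod "$1" -n "${2:-default}" -o jsonpath='{.spec.nodeName}'
}

# Keep only the log lines on stdin that mention the given object name.
# -F treats the name as a fixed string, not a regular expression.
# lines_mentioning is a hypothetical helper name.
lines_mentioning() {
  grep -F -- "$1"
}

# Example:
#   pod_node webapp-j7gd2            # run where kubectl is configured
#   # then, on the node it prints:
#   journalctl -u kubelet --no-pager | lines_mentioning webapp-j7gd2
```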
kube-proxy is also easy to overlook: even when it has stopped, pods still report a normal status, but access to some services will fail.
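A quick way to rule kube-proxy in or out is to check that its pods are running on every node and that the Service in question actually has endpoints. This is only a sketch under assumptions: the label k8s-app=kube-proxy is what a kubeadm-installed cluster uses for the kube-proxy DaemonSet, so other installers may label it differently or run kube-proxy as a systemd service instead; the function names are my own.

```shell
# List the kube-proxy pods together with the node each one runs on.
# Assumes a kubeadm-style cluster where kube-proxy pods carry the
# label k8s-app=kube-proxy in the kube-system namespace.
# kube_proxy_pods is a hypothetical helper name.
kube_proxy_pods() {
  kubectl get pods -n kube-system -l k8s-app=kube-proxy -o wide
}

# Show the endpoints (pod IP:port pairs) behind a Service.
# Usage: service_endpoints <service-name> [namespace]
# service_endpoints is a hypothetical helper name.
service_endpoints() {
  kubectl get endpoints "$1" -n "${2:-default}"
}

# Example (<service-name> is a placeholder for your own Service):
#   kube_proxy_pods
#   service_endpoints <service-name>
```

If the endpoints look right but the Service is still unreachable from one node, the kube-proxy on that node is a good next suspect.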