Problem: when user training jobs on a node consume too much CPU or memory, they affect the node's system processes and kube processes.
Solution:
Kubelet Node Allocatable
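- Kubelet Node Allocatable reserves resources for the Kube components and System processes, so that both still have enough resources when the node is fully loaded.
- Three resource types can currently be reserved: cpu, memory, and ephemeral-storage.
- Node Capacity is the total hardware resources of the Node; kube-reserved is the amount set aside for the kube components; system-reserved is the amount set aside for System processes; eviction-threshold is the threshold configured for kubelet eviction; Allocatable is the value the scheduler actually uses when placing Pods (it guarantees that the resource requests of all Pods on the Node do not exceed Allocatable).
- Node Allocatable Resource = Node Capacity - kube-reserved - system-reserved - eviction-threshold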
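As a worked example of the formula above, the following sketch computes Allocatable for a hypothetical node. The node size (32 CPUs, 128Gi of memory) and the 100Mi memory eviction threshold are assumptions chosen for illustration; the reservations are cpu=2,memory=8Gi for kube-reserved and cpu=6,memory=24Gi for system-reserved. The function name node_allocatable is made up for this sketch and is not part of the kubelet.

```python
GI = 1024 ** 3  # bytes in one Gi
MI = 1024 ** 2  # bytes in one Mi


def node_allocatable(capacity, kube_reserved, system_reserved, eviction_threshold):
    """Apply Capacity - kube-reserved - system-reserved - eviction-threshold per resource."""
    allocatable = {}
    for resource, total in capacity.items():
        reserved = (
            kube_reserved.get(resource, 0)
            + system_reserved.get(resource, 0)
            + eviction_threshold.get(resource, 0)
        )
        # Never report a negative value if the reservations exceed capacity.
        allocatable[resource] = max(total - reserved, 0)
    return allocatable


# Hypothetical node: CPU in millicores, memory in bytes (assumed values).
capacity = {"cpu": 32000, "memory": 128 * GI}
kube_reserved = {"cpu": 2000, "memory": 8 * GI}
system_reserved = {"cpu": 6000, "memory": 24 * GI}
eviction_threshold = {"memory": 100 * MI}  # assumed hard eviction threshold

result = node_allocatable(capacity, kube_reserved, system_reserved, eviction_threshold)
print(result["cpu"], "millicores")   # 24000 millicores
print(result["memory"] // MI, "Mi")  # 98204 Mi
```

With these numbers, a quarter of the node's CPU and memory is withheld from Pods, which is the price paid for keeping the system and kube processes responsive under full load.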
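How to configure
- --enforce-node-allocatable: defaults to pods. To also enforce the reservations for the kube components and System processes, set it to pods,kube-reserved,system-reserved.
- --cgroups-per-qos: enables QoS and Pod level cgroups; on by default. When enabled, the kubelet manages the cgroups of all workload Pods.
- --cgroup-driver: defaults to cgroupfs; the other option is systemd. The kubelet must use the same cgroup driver as the container runtime. For example, if docker is configured to use the systemd cgroup driver, the kubelet also needs --cgroup-driver=systemd.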
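- --kube-reserved: the amount of resources reserved for the kube components (kubelet, kube-proxy, dockerd, etc.), for example --kube-reserved=cpu=1000m,memory=8Gi,ephemeral-storage=16Gi.
- --kube-reserved-cgroup: if you set --kube-reserved and enforce it through --enforce-node-allocatable, you must also set the corresponding cgroup, and that cgroup directory must be created beforehand. The kubelet does not create it automatically and fails to start if it is missing. For example, --kube-reserved-cgroup=/kubelet.service.
- --system-reserved: the amount of resources reserved for System processes, for example --system-reserved=cpu=500m,memory=4Gi,ephemeral-storage=4Gi.
- --system-reserved-cgroup: if you set --system-reserved and enforce it through --enforce-node-allocatable, you must also set the corresponding cgroup, and that cgroup directory must be created beforehand. The kubelet does not create it automatically and fails to start if it is missing. For example, --system-reserved-cgroup=/system.slice.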
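- --eviction-hard: configures the kubelet's hard eviction conditions. Only the two incompressible resources, memory and ephemeral-storage, are supported. Under MemoryPressure, the Scheduler does not place new Best-Effort QoS Pods on the node. Under DiskPressure, the Scheduler does not place any new Pods on the node. For more on Kubelet Eviction, see my related blog posts.
- The Kubelet Node Allocatable code is simple and lives mainly in pkg/kubelet/cm/node_container_manager.go; interested readers can walk through it themselves.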
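The dependency between these flags can be sketched as a small pre-flight check. This is an illustrative Python sketch, not the kubelet's own validation code: the function name check_node_allocatable_flags and the dictionary layout are hypothetical, and it only encodes the rule described above (an enforced reservation needs its cgroup flag).

```python
def check_node_allocatable_flags(flags):
    """Return a list of problems found in a dict of kubelet flag values.

    `flags` maps flag names without the leading dashes to their string values.
    """
    problems = []
    # The default enforcement list is "pods" when the flag is not given.
    raw = flags.get("enforce-node-allocatable", "pods")
    enforce = [item.strip() for item in raw.split(",") if item.strip()]

    for key in ("kube-reserved", "system-reserved"):
        # An enforced reservation needs a cgroup to apply the limits to.
        if key in enforce and not flags.get(key + "-cgroup"):
            problems.append(f"{key} is enforced, but --{key}-cgroup is not set")
    return problems


# Example: system-reserved is enforced, but its cgroup flag was forgotten.
flags = {
    "enforce-node-allocatable": "pods,kube-reserved,system-reserved",
    "kube-reserved": "cpu=2,memory=8Gi",
    "kube-reserved-cgroup": "/system.slice/kubelet.service",
    "system-reserved": "cpu=6,memory=24Gi",
}
problems = check_node_allocatable_flags(flags)
for problem in problems:
    print(problem)
```

Running a check like this before restarting the kubelet is cheaper than discovering the missing flag from a failed start.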
Example:
[root@node177 system]# cat /etc/systemd/system/kubelet.service.d/10-kubelet.conf
[Service]
Environment="KUBELET_POD_INFRA_CONTAINER=--pod-infra-container-image=registry.bst-1.cns.bstjpc.com:5000/k8s.gcr.io/pause-amd64:3.1"
#Environment="KUBELET_KUBECONFIG_ARGS=--kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_KUBECONFIG_ARGS=--kubeconfig=/etc/kubernetes/admin.conf"
Environment="KUBELET_SYSTEM_PODS_ARGS=--pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true"
Environment="KUBELET_DNS_ARGS=--cluster-dns=10.96.0.10 --cluster-domain=cluster.local"
Environment="KUBELET_CADVISOR_ARGS=--cadvisor-port=4194"
Environment="KUBELET_VOLUME_ARGS=--volume-plugin-dir=/var/lib/kubelet/volumeplugins --feature-gates=DevicePlugins=true,BlockVolume=true,PodPriority=true --volume-stats-agg-period=0 "
Environment="KUBELET_EXTRA_ARGS=--fail-swap-on=false --logtostderr=true --v=0"
Environment="KUBELET_NETWORK_ARGS=--network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin"
Environment="KUBELET_AUTHZ_ARGS=--authorization-mode=Webhook --client-ca-file=/etc/kubernetes/pki/ca.pem"
Environment="KUBELET_CERTIFICATE_ARGS=--rotate-certificates=true --cert-dir=/var/lib/kubelet/pki"
#ExecStart=/usr/local/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_DNS_ARGS $KUBELET_CADVISOR_ARGS $KUBELET_VOLUME_ARGS $KUBELET_EXTRA_ARGS
[root@node177 system]# cat /usr/lib/systemd/system/kubelet.service
[Unit]
Description=kubelet: The Kubernetes Node Agent
Documentation=http://kubernetes.io/docs/
#After=docker.service
#Wants=docker.service
[Service]
#ExecStart=/usr/local/bin/kubelet
#ExecStart=/usr/local/bin/kubelet $KUBELET_POD_INFRA_CONTAINER $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_DNS_ARGS $KUBELET_CADVISOR_ARGS $KUBELET_VOLUME_ARGS $KUBELET_EXTRA_ARGS
ExecStartPre=/usr/bin/mkdir -p /sys/fs/cgroup/cpuset/system.slice/kubelet.service
ExecStartPre=/usr/bin/mkdir -p /sys/fs/cgroup/hugetlb/system.slice/kubelet.service
ExecStart=/usr/local/bin/kubelet $KUBELET_POD_INFRA_CONTAINER $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_DNS_ARGS $KUBELET_CADVISOR_ARGS $KUBELET_VOLUME_ARGS $KUBELET_EXTRA_ARGS \
--cgroup-driver=cgroupfs \
--cgroup-root= \
--enforce-node-allocatable=pods,kube-reserved,system-reserved \
--kube-reserved-cgroup=/system.slice/kubelet.service \
--system-reserved-cgroup=/system.slice \
--kube-reserved=cpu=2,memory=8Gi \
--system-reserved=cpu=6,memory=24Gi
Restart=always
StartLimitInterval=0
RestartSec=10
[Install]
WantedBy=multi-user.target
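The two ExecStartPre lines in this unit appear to exist because the reserved cgroup must be present before the kubelet starts. A minimal sketch of a check for that precondition follows, assuming a cgroup v1 layout mounted at /sys/fs/cgroup. The helper name missing_cgroup_dirs is hypothetical, and the default subsystem list only mirrors the two directories created in the unit file above; other subsystems may need checking on a given system.

```python
import os


def missing_cgroup_dirs(cgroup, subsystems=("cpuset", "hugetlb"), root="/sys/fs/cgroup"):
    """Return the cgroup v1 directories for `cgroup` that do not exist yet."""
    missing = []
    for subsystem in subsystems:
        # e.g. /sys/fs/cgroup/cpuset/system.slice/kubelet.service
        path = os.path.join(root, subsystem, cgroup.lstrip("/"))
        if not os.path.isdir(path):
            missing.append(path)
    return missing


# Check the cgroup passed to --kube-reserved-cgroup in the unit file above.
for path in missing_cgroup_dirs("/system.slice/kubelet.service"):
    print("missing:", path)
```

An empty result means both directories are already in place; any printed path is one that would have to be created, for instance with mkdir -p as the unit file does.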