Kubernetes Upgrade Guide

0. Preface

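  • Updated continuously; I won't start a separate post for each version.
  • Operating system: CentOS 7
  • Adapt the commands to your own environment instead of copying them verbatim. I copied them from the official docs to make writing this blog easier.
  • Versions: upgraded step by step from 1.13.12 to 1.17.5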

1. Upgrade preparation

  • Make sure the cluster was built with kubeadm
  • Make sure the cluster is already HA (multiple master nodes)
  • Make sure the cluster has been backed up

2. Upgrade notes

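  • After the upgrade, all cluster component Pods restart (their hash changes)
  • The kubeadm version used for the upgrade must be greater than or equal to the target version
  • During the upgrade, kube-proxy goes through one rolling update across all nodes
  • Upgrades are only supported one minor version at a time; skipping minor versions is not supported (You only can upgrade from one MINOR version to the next MINOR version, or between PATCH versions of the same MINOR. That is, you cannot skip MINOR versions when you upgrade. For example, you can upgrade from 1.y to 1.y+1, but not from 1.y to 1.y+2.)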

    On versions: you can upgrade from 1.16.x to 1.16.y, or to 1.17.x, but you cannot go directly from 1.16 to 1.18.
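The minor-version rule above is easy to check before you start. Here is a minimal sketch in POSIX sh; the function names minor_of, major_of and can_upgrade are my own and not part of kubeadm, and patch-level ordering is deliberately not checked.

```shell
# Extract the MAJOR field from a version such as "1.16.8" or "v1.16.8".
major_of() {
    echo "$1" | sed 's/^v//' | cut -d. -f1
}

# Extract the MINOR field from the same kind of version string.
minor_of() {
    echo "$1" | sed 's/^v//' | cut -d. -f2
}

# Succeeds (exit 0) when going from $1 to $2 stays on the same MINOR or
# moves to the next MINOR; fails otherwise. Patch order is not checked.
can_upgrade() {
    [ "$(major_of "$1")" = "$(major_of "$2")" ] || return 1
    diff=$(( $(minor_of "$2") - $(minor_of "$1") ))
    [ "$diff" -eq 0 ] || [ "$diff" -eq 1 ]
}
```

For example, can_upgrade 1.16.8 1.17.5 succeeds, while can_upgrade 1.16.8 1.18.2 fails because it skips 1.17.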

3. Upgrading the masters

3.1 Upgrade kubeadm and kubectl

First upgrade kubeadm and kubectl to a version greater than or equal to the target version:

yum versionlock delete kubeadm kubectl
yum install -y kubeadm-1.17.5 kubectl-1.17.5 --disableexcludes=kubernetes
yum versionlock kubeadm kubectl

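versionlock is optional. I use it because my servers get updated from time to time, and I want to keep someone from upgrading these packages by accident (basically, I don't trust itchy fingers).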
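After installing, it can be worth confirming that kubeadm really is at or above the target version. The sketch below is one way to compare two versions in POSIX sh; version_ge is a hypothetical helper name, and it assumes plain numeric MAJOR.MINOR.PATCH strings (no package suffix such as -0).

```shell
# Succeeds when version $1 is greater than or equal to version $2.
# Accepts "1.17.5" or "v1.17.5"; fields are compared numerically.
version_ge() {
    a=$(echo "$1" | sed 's/^v//')
    b=$(echo "$2" | sed 's/^v//')
    for field in 1 2 3; do
        x=$(echo "$a" | cut -d. -f"$field")
        y=$(echo "$b" | cut -d. -f"$field")
        [ "${x:-0}" -gt "${y:-0}" ] && return 0
        [ "${x:-0}" -lt "${y:-0}" ] && return 1
    done
    return 0
}

# Intended use on the master (not executed here):
#   version_ge "$(kubeadm version -o short)" v1.17.5 || echo "kubeadm is too old"
```

Comparing field by field avoids the classic string-comparison trap where 1.17.10 sorts before 1.17.9.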

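3.2 Pre-upgrade preparation

3.2.1 Configuration changes

Almost every cluster needs some customization. The default configuration is usable, but it is not a good fit for production.
The kubeadm-config needed for each version upgrade is attached at the end of this post.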

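3.2.2 Draining the node

If the master node also runs regular workloads like a worker node, run the following command to evict those pods and put the node into maintenance mode (scheduling disabled).

# replace <cp-node-name> with the name of your master node
kubectl drain <cp-node-name> --ignore-daemonsets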

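3.2.3 Review the upgrade plan

Run the following command to view the upgrade plan. The plan lists every component that will be upgraded along with any related warnings, so read it carefully. The sample output below comes from a 1.17.5 to 1.18.2 upgrade.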

[root@kubernetes]# kubeadm upgrade plan
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks.
[upgrade] Running cluster health checks
[upgrade] Fetching available versions to upgrade to
[upgrade/versions] Cluster version: v1.17.5
[upgrade/versions] kubeadm version: v1.18.2
[upgrade/versions] Latest stable version: v1.18.2
[upgrade/versions] Latest stable version: v1.18.2
[upgrade/versions] Latest version in the v1.17 series: v1.17.5
[upgrade/versions] Latest version in the v1.17 series: v1.17.5

Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply':
COMPONENT   CURRENT       AVAILABLE
Kubelet     1 x v1.17.5   v1.18.2

Upgrade to the latest stable version:

COMPONENT            CURRENT   AVAILABLE
API Server           v1.17.5   v1.18.2
Controller Manager   v1.17.5   v1.18.2
Scheduler            v1.17.5   v1.18.2
Kube Proxy           v1.17.5   v1.18.2
CoreDNS              1.6.5     1.6.7
Etcd                 3.4.3     3.4.3-0

You can now apply the upgrade by executing the following command:

        kubeadm upgrade apply v1.18.2

_____________________________________________________________________
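If you script the upgrade, you may want to compare the version the plan suggests with the version you actually intend to apply. This is a small sketch that pulls the suggested version out of the plan output; plan_target_version is a name I made up, and it relies only on the "kubeadm upgrade apply vX.Y.Z" line shown above.

```shell
# Read `kubeadm upgrade plan` output on stdin and print the version from
# the first line whose words are exactly "kubeadm upgrade apply <version>".
plan_target_version() {
    awk '$1 == "kubeadm" && $2 == "upgrade" && $3 == "apply" { print $4; exit }'
}

# Intended use on the master (not executed here):
#   kubeadm upgrade plan | plan_target_version
```

Keep in mind that the plan proposes the latest stable version its kubeadm knows about, which is not necessarily the version you are stepping to.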

3.3 Run the upgrade

If your etcd cluster was not created by kubeadm, upgrade the etcd cluster manually first, then continue with the steps below.

kubeadm upgrade apply v1.17.5 --config /etc/kubernetes/kubeadm.yaml

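3.4 Upgrade kubelet

Once the upgrade has finished on a single master, only that node's control-plane components and the kube-proxy component on all nodes have been upgraded. After confirming that everything works, upgrade kubelet and then uncordon the node.

# replace 1.17.5 with your target patch version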
yum versionlock delete kubelet
yum install -y kubelet-1.17.5 --disableexcludes=kubernetes
yum versionlock kubelet

After the update, run the following and wait for kubelet to start successfully:

systemctl daemon-reload
systemctl restart kubelet
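"Wait for it to start" can be scripted with a simple polling loop. The helper below is a generic sketch; wait_until is my own name, and the attempt count and delay in the usage comment are arbitrary values you should tune.

```shell
# Run a command repeatedly until it succeeds.
# $1 = maximum attempts, $2 = seconds between attempts, rest = the command.
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        "$@" && return 0
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Intended use on the node (not executed here):
#   wait_until 30 2 systemctl is-active --quiet kubelet
```

The function returns non-zero if the command never succeeds, so a calling script can stop before it uncordons a node whose kubelet is still down.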

Don't forget to take the current node out of maintenance mode (uncordon):

# replace <cp-node-name> with the name of your control plane node
kubectl uncordon <cp-node-name>

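3.5 Upgrade the other masters

The steps are much the same as for the first master, except that kubeadm upgrade apply is replaced with kubeadm upgrade node, and kubeadm upgrade plan is not needed.
The configuration for the apiserver and the other components was already uploaded to the cluster's ConfigMap when the first master was upgraded, so the other masters simply pull it and restart the relevant components. This step also prints a detailed log, so you can follow its progress closely. Finally, remember to put the node into maintenance mode before upgrading, and after the upgrade reinstall kubelet and take the node out of maintenance mode.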
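Put together, the per-node sequence for an additional master might look like the sketch below. It only reuses commands that appear earlier in this post; the run and upgrade_other_master helpers and the DRY_RUN switch are my own additions, and by default the script just prints the commands instead of executing them. If you use versionlock, release and re-apply the locks as in 3.1 and 3.4.

```shell
# Print the command when DRY_RUN is 1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = node name, $2 = target version such as 1.17.5.
# Run this on the master being upgraded.
upgrade_other_master() {
    node=$1
    version=$2
    run yum install -y "kubeadm-$version" --disableexcludes=kubernetes
    run kubectl drain "$node" --ignore-daemonsets
    run kubeadm upgrade node
    run yum install -y "kubelet-$version" --disableexcludes=kubernetes
    run systemctl daemon-reload
    run systemctl restart kubelet
    run kubectl uncordon "$node"
}
```

Calling upgrade_other_master master-2 1.17.5 (master-2 is a placeholder name) prints the seven commands in order, which is a cheap way to review them before setting DRY_RUN=0.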

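4. Upgrading the nodes

Once the masters are upgraded, worker nodes need no special treatment; the only component to upgrade is kubelet. First run kubeadm upgrade node on the node, which pulls the kubelet configuration from the cluster; then reinstall kubelet and restart it. As before, remember to put the node into maintenance mode while upgrading. Upgrade the CNI components manually as needed, and confirm which CNI version is compatible.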

5. Verify the cluster

Check that every node is Ready and reports the version you upgraded to:

kubectl get nodes
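The same check can be automated. The sketch below reads node listings from stdin and assumes the default column order of kubectl get nodes (NAME, STATUS, ROLES, AGE, VERSION); nodes_ok is a hypothetical helper name.

```shell
# Read `kubectl get nodes --no-headers` output on stdin and succeed only if
# at least one node is listed and every node has STATUS exactly "Ready"
# and VERSION equal to $1.
nodes_ok() {
    awk -v want="$1" '
        { seen = 1 }
        $2 != "Ready" || $5 != want { bad = 1 }
        END { exit ((seen && !bad) ? 0 : 1) }
    '
}

# Intended use (not executed here):
#   kubectl get nodes --no-headers | nodes_ok v1.17.5
```

A node that is still cordoned shows up as Ready,SchedulingDisabled, so this check also catches a forgotten uncordon.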

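Recovering from a failure state

If kubeadm upgrade fails partway and does not roll back, for example because of an unexpected shutdown during execution, you can run kubeadm upgrade again. The command is idempotent and will eventually bring the cluster to the desired upgraded state.

To recover from a bad state, you can also run kubeadm upgrade apply --force, keeping the version the cluster is currently running.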

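How it works

On the first master node, kubeadm upgrade apply does the following:

  • Checks that the cluster is in an upgradeable state:
      • The API server is reachable
      • All nodes are in the Ready state
      • The master nodes are healthy
  • Verifies that upgrading from the current version to the target version is allowed
  • Makes sure the images required by the master node can be pulled to the node
  • Upgrades the master node components (rolling back if it runs into a problem)
  • Applies the new kube-dns and kube-proxy manifests and makes sure the required RBAC rules are created
  • Creates new certificate files for the API server, and backs up the old ones, if the certificates expire within 180 days

On the other master nodes, kubeadm upgrade node does the following:

  • Fetches the kubeadm ClusterConfiguration from the cluster
  • Backs up the kube-apiserver certificate
  • Upgrades the manifests of the static components on the master node
  • Upgrades the kubelet configuration on the master node

On all worker nodes, kubeadm upgrade node does the following:

  • Fetches the kubeadm ClusterConfiguration from the cluster
  • Upgrades the kubelet configuration on the worker node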

Upgrading 1.16.8 to 1.17.5

Configure the kubeadm-config file

apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: "0"
  usages:
  - signing
  - authentication
---
imageRepository: harbor.foxchan.com/google_containers
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.17.5
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: k8s.foxchan.com:8443
etcd:
    external:
        endpoints:
        - http://172.16.242.12:2379
        - http://172.16.242.16:2379
        - http://172.16.242.44:2379
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12
apiServer:
  extraArgs:
    v: "2"
    logtostderr: "false"
    log-dir: "/var/log/kubernetes"
  extraVolumes:
  - name: "k8s-log"
    hostPath: "/var/log/kubernetes"
    mountPath: "/var/log/kubernetes"
    pathType: "DirectoryOrCreate"
  - name: "timezone"
    hostPath: "/etc/localtime"
    mountPath: "/etc/localtime"
    readOnly: true
    pathType: "File"
  timeoutForControlPlane: 4m0s
  certSANs:
  - k8s.foxchan.com
  - "172.16.242.16"
  - "172.16.242.12"
  - "172.16.242.17"
controllerManager:
  extraArgs:
    address: 0.0.0.0
    experimental-cluster-signing-duration: "87600h"
    v: "2"
    logtostderr: "false"
    log-dir: "/var/log/kubernetes"
  extraVolumes:
  - name: "k8s-log"
    hostPath: "/var/log/kubernetes"
    mountPath: "/var/log/kubernetes"
    pathType: "DirectoryOrCreate"
  - name: "timezone"
    hostPath: "/etc/localtime"
    mountPath: "/etc/localtime"
    readOnly: true
    pathType: "File"
scheduler:
  extraArgs:
    address: 0.0.0.0
    v: "2"
    logtostderr: "false"
    log-dir: "/var/log/kubernetes"
  extraVolumes:
  - name: "k8s-log"
    hostPath: "/var/log/kubernetes"
    mountPath: "/var/log/kubernetes"
    pathType: "DirectoryOrCreate"
  - name: "timezone"
    hostPath: "/etc/localtime"
    mountPath: "/etc/localtime"
    readOnly: true
    pathType: "File"
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
failSwapOn: false
cgroupDriver: systemd
rotateCertificates: true
# Eviction thresholds; check the docs and adjust them for your environment
evictionHard:
  "imagefs.available": "8%"
  "memory.available": "256Mi"
  "nodefs.available": "8%"
  "nodefs.inodesFree": "5%"
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
# kube-proxy specific options here
clusterCIDR: "10.244.0.0/16"
# enable ipvs mode
mode: "ipvs"
ipvs:
  minSyncPeriod: 5s
  syncPeriod: 5s
  # ipvs load-balancing scheduler
  scheduler: "wrr"

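Upgrade notes

Starting with 1.17, changing the cluster configuration through upgrade is discouraged, although it still works.

The kube-proxy and kubelet configuration are no longer set through kubeadm-config.

For kube-proxy and kubelet you need to write a config.yaml and use it to replace the corresponding ConfigMap in the cluster.

1.17 fixes the problem where kubeadm alpha certs check-expiration could not show certificate expiry when etcd is external.

Upstream issue: https://github.com/kubernetes/kubeadm/issues/1850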

Upgrading 1.15.6 to 1.16.8

The configuration file is as follows

apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: "0"
  usages:
  - signing
  - authentication
---
imageRepository: harbor.foxchan.com/google_containers
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.16.8
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: k8s.foxchan.com:8443
etcd:
    external:
        endpoints:
        - http://172.16.242.12:2379
        - http://172.16.242.16:2379
        - http://172.16.242.44:2379
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12
apiServer:
  extraArgs:
    v: "2"
    logtostderr: "false"
    log-dir: "/var/log/kubernetes"
    authorization-mode: Node,RBAC
  extraVolumes:
  - name: "k8s-log"
    hostPath: "/var/log/kubernetes"
    mountPath: "/var/log/kubernetes"
    pathType: "DirectoryOrCreate"
  - name: "timezone"
    hostPath: "/etc/localtime"
    mountPath: "/etc/localtime"
    readOnly: true
    pathType: "File"
  timeoutForControlPlane: 4m0s
  certSANs:
  - k8s.foxchan.com
  - "172.16.242.16"
  - "172.16.242.12"
  - "172.16.242.17"
controllerManager:
  extraArgs:
    address: 0.0.0.0
    experimental-cluster-signing-duration: "87600h"
    v: "2"
    logtostderr: "false"
    log-dir: "/var/log/kubernetes"
  extraVolumes:
  - name: "k8s-log"
    hostPath: "/var/log/kubernetes"
    mountPath: "/var/log/kubernetes"
    pathType: "DirectoryOrCreate"
  - name: "timezone"
    hostPath: "/etc/localtime"
    mountPath: "/etc/localtime"
    readOnly: true
    pathType: "File"
scheduler:
  extraArgs:
    address: 0.0.0.0
    v: "2"
    logtostderr: "false"
    log-dir: "/var/log/kubernetes"
  extraVolumes:
  - name: "k8s-log"
    hostPath: "/var/log/kubernetes"
    mountPath: "/var/log/kubernetes"
    pathType: "DirectoryOrCreate"
  - name: "timezone"
    hostPath: "/etc/localtime"
    mountPath: "/etc/localtime"
    readOnly: true
    pathType: "File"
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
failSwapOn: false
cgroupDriver: systemd
rotateCertificates: true
# Eviction thresholds; check the docs and adjust them for your environment
evictionHard:
  "imagefs.available": "8%"
  "memory.available": "256Mi"
  "nodefs.available": "8%"
  "nodefs.inodesFree": "5%"
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
# kube-proxy specific options here
clusterCIDR: "10.244.0.0/16"
# enable ipvs mode
mode: "ipvs"
ipvs:
  minSyncPeriod: 5s
  syncPeriod: 5s
  # ipvs load-balancing scheduler
  scheduler: "wrr"

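Upgrade notes

An error is reported during the upgrade.

Upstream issue: https://github.com/kubernetes/kubernetes/issues/82889

The cause is that the CoreDNS proxy plugin has been replaced by forward.

The error can be ignored; the configuration is replaced automatically when the cluster is upgraded.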

kubeadm upgrade plan --config kubeadm1.16-config.yaml --ignore-preflight-errors=CoreDNSUnsupportedPlugins

Alternatively, edit the coredns ConfigMap yourself and replace proxy with forward. To view the current ConfigMap:

kubectl -n kube-system get cm coredns -oyaml
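To preview what the proxy to forward replacement would look like before touching the ConfigMap, a one-line sed filter is enough. This is only a sketch: corefile_proxy_to_forward is my own name, and it assumes a simple directive such as "proxy . /etc/resolv.conf". If your proxy directive carries an option block, check those options against the forward plugin by hand.

```shell
# Rewrite Corefile text read on stdin: a line whose first word is "proxy"
# becomes the same line starting with "forward"; indentation is preserved.
corefile_proxy_to_forward() {
    sed 's/^\([[:space:]]*\)proxy[[:space:]]/\1forward /'
}

# Intended use as a preview (not executed here):
#   kubectl -n kube-system get cm coredns -oyaml | corefile_proxy_to_forward
```

Because the filter only prints the rewritten text, nothing in the cluster changes until you apply the edit yourself.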

After the cluster was upgraded, the nodes were NotReady, with this error:

plugin flannel does not support config version

Add "cniVersion": "0.3.1" to /etc/cni/net.d/10-flannel.conflist

{
  "name": "cbr0",
  "cniVersion": "0.3.1",
  "plugins": [
    {
      "type": "flannel",
      "delegate": {
        "hairpinMode": true,
        "isDefaultGateway": true
      }
    },
    {
      "type": "portmap",
      "capabilities": {
        "portMappings": true
      }
    }
  ]
}

Kubernetes 1.16 requires the CNI version to be specified in the configuration file
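Since this has to be right on every node, a quick check per node saves some head-scratching. The sketch below is a plain text match rather than a real JSON parse, and has_cni_version is a hypothetical helper name.

```shell
# Succeeds when the CNI config file at $1 contains a "cniVersion" key.
# This is a text match only; it does not validate the JSON or the value.
has_cni_version() {
    grep -Eq '"cniVersion"[[:space:]]*:' "$1"
}

# Intended use on each node (not executed here):
#   has_cni_version /etc/cni/net.d/10-flannel.conflist || echo "cniVersion missing"
```

If the key is missing, add it by hand as shown in the JSON above.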

Running kubectl get cs shows <unknown> for every component.

Upstream issue: https://github.com/kubernetes/kubernetes/issues/83024

Fixed in 1.17

[root@]# kubectl get cs
NAME AGE
scheduler <unknown>
controller-manager <unknown>
etcd-0 <unknown>
etcd-2 <unknown>
etcd-1 <unknown>