Upgrading Calico 2.6.1 to 3.11

Overview

According to the official upgrade documentation, two points need attention:

  1. Calico 2.6.x and 3.x use different etcd APIs (this applies only to the etcd datastore): 2.6.x uses etcd v2, while 3.x uses etcd v3.
  2. Upgrading from 2.6.x to 3.x requires at least version 2.6.5.

For the current environment this means upgrading to 2.6.5 or later first, and only then to 3.x.
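
The version constraint above can be expressed as a small check. This is a minimal sketch rather than part of the original procedure: the function names `version_ge` and `upgrade_path` are hypothetical, and it assumes GNU `sort -V` is available.

```shell
#!/bin/sh
# version_ge A B: succeed when version A >= version B.
# Assumes GNU coreutils `sort -V` (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# upgrade_path VERSION: print the next step for a given Calico version.
upgrade_path() {
    current="${1#v}"    # accept both "2.6.1" and "v2.6.1"
    if version_ge "$current" "3.0.0"; then
        echo "already on 3.x"
    elif version_ge "$current" "2.6.5"; then
        echo "ready to migrate to 3.x"
    else
        echo "upgrade to 2.6.5+ first"
    fi
}

upgrade_path 2.6.1
upgrade_path 2.6.12
```

For the cluster in this article, the first call reports that an intermediate upgrade is needed, and the second reports that the 3.x migration can start.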

Upgrading 2.6.1 to 2.6.12

2019/12/25

The existing environment stores its Calico data in etcd v2.


[root@k8s-1 kubelet]# which etcdv2
alias etcdv2='export ETCDCTL_API=2; /bin/etcdctl --ca-file /etc/etcd/ssl/etcd-root-ca.pem --cert-file /etc/etcd/ssl/etcd.pem --key-file /etc/etcd/ssl/etcd-key.pem --endpoints https://10.111.32.239:2379,https://10.111.32.241:2379,https://10.111.32.242:2379'

[root@k8s-1 kubelet]# etcdv2 ls /calico/ipam/v2/assignment/ipv4
/calico/ipam/v2/assignment/ipv4/block
[root@k8s-1 kubelet]# etcdv2 ls /calico/ipam/v2/assignment/ipv4/block
/calico/ipam/v2/assignment/ipv4/block/10.20.134.64-26
/calico/ipam/v2/assignment/ipv4/block/10.20.253.64-26
/calico/ipam/v2/assignment/ipv4/block/10.20.28.192-26
/calico/ipam/v2/assignment/ipv4/block/10.20.51.128-26
/calico/ipam/v2/assignment/ipv4/block/10.20.78.0-26
/calico/ipam/v2/assignment/ipv4/block/10.20.112.64-26
/calico/ipam/v2/assignment/ipv4/block/10.20.15.128-26
/calico/ipam/v2/assignment/ipv4/block/10.20.235.0-26
/calico/ipam/v2/assignment/ipv4/block/10.20.53.64-26
/calico/ipam/v2/assignment/ipv4/block/10.20.72.128-26
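
Each key under `block` appears to encode one /26 IPAM block, with `-` standing in for the `/` of the CIDR. As a small illustration (the helper name `block_keys_to_cidrs` is hypothetical and the key format is inferred from the listing above), the keys can be converted to CIDR notation:

```shell
#!/bin/sh
# block_keys_to_cidrs: read etcd v2 key paths on stdin, print CIDRs.
# Assumes the last path component has the form <network>-<prefix-length>.
block_keys_to_cidrs() {
    sed -e 's#.*/##' -e 's#-\([0-9][0-9]*\)$#/\1#'
}

# Two of the keys from the listing above, used as sample input.
printf '%s\n' \
    /calico/ipam/v2/assignment/ipv4/block/10.20.134.64-26 \
    /calico/ipam/v2/assignment/ipv4/block/10.20.78.0-26 |
    block_keys_to_cidrs
```

In the environment above the input would come from the `etcdv2 ls` command instead of `printf`. Recording this list before the migration gives something to compare against afterwards.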

As the documentation states, upgrading to 3.0 requires at least 2.6.5 and involves some manual steps, because 3.x uses etcd v3 while 2.6.x uses etcd v2.

The cluster currently runs 2.6.1, so it is first upgraded to 2.6.5 or later.

The target chosen here is 2.6.12, the latest 2.6 release.

Download the calico.yaml file


[root@docker-182 v2.6]# wget https://docs.projectcalico.org/v2.6/getting-started/kubernetes/installation/rbac.yaml

[root@docker-182 v2.6]# wget https://docs.projectcalico.org/v2.6/getting-started/kubernetes/installation/hosted/calico.yaml

# Adjust the configuration in calico.yaml
[root@docker-182 v2.6]# sh -x modify_calico_yaml.sh
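
The contents of `modify_calico_yaml.sh` are not shown in this article. The following is only a hypothetical sketch of the kind of edit such a script might make: it assumes GNU `sed` and a manifest line of the form `etcd_endpoints: "..."`, and the function name `set_etcd_endpoints` is invented for illustration.

```shell
#!/bin/sh
# set_etcd_endpoints FILE ENDPOINTS: rewrite the etcd_endpoints value in place.
# Assumes GNU sed (-i) and that ENDPOINTS contains no '#' or '&' characters.
set_etcd_endpoints() {
    sed -i "s#^\([[:space:]]*etcd_endpoints:\).*#\1 \"$2\"#" "$1"
}

# Demonstration on a throwaway file rather than the real manifest.
tmp=$(mktemp)
printf '  etcd_endpoints: "http://127.0.0.1:2379"\n' > "$tmp"
set_etcd_endpoints "$tmp" "https://10.111.32.239:2379,https://10.111.32.241:2379,https://10.111.32.242:2379"
cat "$tmp"
rm -f "$tmp"
```

Because this cluster's etcd is served over HTTPS, the TLS-related settings in the manifest would presumably need similar treatment.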

Pre-pull the images

[root@docker-182 v2.6]# grep image calico.yaml
          image: quay.io/calico/node:v2.6.12
          image: quay.io/calico/cni:v1.11.8
          image: quay.io/calico/kube-controllers:v1.0.5
          image: quay.io/calico/kube-controllers:v1.0.5
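
Pre-pulling can be scripted from the manifest itself. This is a minimal sketch under two assumptions: Docker is the container runtime on the nodes, and every image appears on a line of the form `image: <name>`. The function names are hypothetical.

```shell
#!/bin/sh
# list_images FILE: print each distinct image referenced in a manifest.
list_images() {
    awk '$1 == "image:" { print $2 }' "$1" | sort -u
}

# pull_images FILE: pull every image so that pods start quickly later.
# Assumes the docker CLI is installed on the node where this runs.
pull_images() {
    list_images "$1" | while read -r image; do
        docker pull "$image"
    done
}
```

Running `pull_images calico.yaml` on each node before applying the manifest shortens the time pods spend in `ContainerCreating`; `sort -u` avoids pulling the duplicated kube-controllers image twice.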

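The documentation describes a staged upgrade, for example upgrading calico-kube-controllers first and then the calico-node DaemonSet. Here the new manifest is simply applied in one go.

Note that calico.yaml does not include Calico's RBAC resources (those are in the separately downloaded rbac.yaml).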

[root@docker-182 v2.6]# k239 apply -f calico.yaml
configmap "calico-config" unchanged
secret "calico-etcd-secrets" unchanged
daemonset "calico-node" configured
deployment "calico-kube-controllers" configured
deployment "calico-policy-controller" configured
serviceaccount "calico-kube-controllers" unchanged
serviceaccount "calico-node" unchanged

Roll out the update

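After the apply, the calico-node DaemonSet pods were not replaced automatically, so the existing pods are deleted to force them to be recreated with the new version.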

[root@k8s-1 v2.6]# kubectl -n kube-system get pod -o wide |grep calico
calico-kube-controllers-6768b96c5f-rdbjp   1/1       Running             0          4m        10.111.32.243   k8s-4.geotmt.com
calico-node-45lnh                          0/1       ContainerCreating   0          4h        10.111.32.241   k8s-2.geotmt.com
calico-node-49mq7                          1/1       Running             1          5h        10.111.32.243   k8s-4.geotmt.com
calico-node-m86hr                          1/1       Running             0          5h        10.111.32.244   k8s-5.geotmt.com
calico-node-mm5fz                          0/1       ContainerCreating   0          4h        10.111.32.239   k8s-1.geotmt.com
calico-node-shrfw                          1/1       Running             0          4h        10.111.32.242   k8s-3.geotmt.com
calico-node-xx8hk                          1/1       Running             0          5h        10.111.32.245   k8s-6.geotmt.com
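
One plausible reason the pods were not replaced automatically is that the DaemonSet uses the `OnDelete` update strategy, though the article does not confirm this. The manual deletion can be sketched as a loop; the function name and the `PAUSE_SECONDS` variable are hypothetical, and the sketch assumes the pods carry the label `k8s-app=calico-node` as in the stock manifest.

```shell
#!/bin/sh
# restart_calico_nodes: delete calico-node pods one by one so that the
# DaemonSet controller recreates them from the updated pod template.
# Assumes the label k8s-app=calico-node; check the manifest in use.
restart_calico_nodes() {
    for pod in $(kubectl -n kube-system get pod -l k8s-app=calico-node -o name); do
        kubectl -n kube-system delete "$pod"
        # Crude pause so that only one node is without calico-node at a time.
        sleep "${PAUSE_SECONDS:-30}"
    done
}
```

Deleting the pods one at a time, rather than all at once, keeps most nodes' networking agents running while each replacement starts.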

Testing after the update

Taking one pod as an example, the new calico-node pod contains two containers.

[root@k8s-1 v2.6]# kubectl -n kube-system get pod -o wide |grep calico |grep k8s-6
calico-node-fj4t8                          2/2       Running             0          25s       10.111.32.245   k8s-6.geotmt.com

Pinging a pod on another node works:

bash-4.4# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
4: eth0@if30: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP 
    link/ether 6e:20:a3:45:42:49 brd ff:ff:ff:ff:ff:ff
    inet 10.20.235.12/32 scope global eth0
       valid_lft forever preferred_lft forever
bash-4.4# ping 10.20.15.135
PING 10.20.15.135 (10.20.15.135): 56 data bytes
64 bytes from 10.20.15.135: seq=0 ttl=62 time=1.133 ms
64 bytes from 10.20.15.135: seq=1 ttl=62 time=0.631 ms

With this version a toleration still has to be added manually so that the pod can be scheduled on master nodes.

The upgrade to 2.6.12 is complete.

Upgrading 2.6.12 to 3.0

  • Prerequisites before upgrading
    • You must first upgrade to Calico v2.6.5 (or a later v2.6.x release) before you can upgrade to Calico v3.0.12. (Important: Calico v2.6.5 was a special transitional release that included changes to enable upgrade to v3.0.1+; do not skip this step!)
    • If you are using the etcd datastore, you should upgrade etcd to the latest stable v3 release.

Both conditions are met.

[root@k8s-1 net.d]# etcdctl version
etcdctl version: 3.3.11
API version: 3.3

  • etcd datastore upgrade steps
    • Install and configure calico-upgrade
    • Test the data migration and check for errors
    • Migrate Calico data
    • Upgrade Calico

Install and configure calico-upgrade

[root@docker-182 ansible]# wget https://github.com/projectcalico/calico-upgrade/releases/download/v1.0.5/calico-upgrade

[root@docker-182 k8s_239]# ansible-playbook install_calico-upgrade.yml 
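
The playbook is not shown, but the later commands expect two client configuration files, `/etc/calico/apiconfigv1.cfg` and `/etc/calico/apiconfigv3.cfg`. The sketch below generates plausible contents for them. The field names are written from memory of the Calico documentation and should be verified against it; the certificate paths and endpoints are taken from the `etcdv2` alias shown earlier, and the function names are hypothetical.

```shell
#!/bin/sh
ETCD_ENDPOINTS="https://10.111.32.239:2379,https://10.111.32.241:2379,https://10.111.32.242:2379"

# print_etcd_tls: shared TLS settings (assumed field names, verify in the docs).
print_etcd_tls() {
    cat <<EOF
  etcdEndpoints: "$ETCD_ENDPOINTS"
  etcdCACertFile: "/etc/etcd/ssl/etcd-root-ca.pem"
  etcdCertFile: "/etc/etcd/ssl/etcd.pem"
  etcdKeyFile: "/etc/etcd/ssl/etcd-key.pem"
EOF
}

# print_apiconfig_v1: configuration for the source (etcd v2) datastore.
print_apiconfig_v1() {
    printf '%s\n' 'apiVersion: v1' 'kind: calicoApiConfig' 'metadata:' 'spec:' \
        '  datastoreType: "etcdv2"'
    print_etcd_tls
}

# print_apiconfig_v3: configuration for the target (etcd v3) datastore.
print_apiconfig_v3() {
    printf '%s\n' 'apiVersion: projectcalico.org/v3' 'kind: CalicoAPIConfig' 'metadata:' 'spec:' \
        '  datastoreType: "etcdv3"'
    print_etcd_tls
}
```

Usage would be along the lines of `print_apiconfig_v1 > /etc/calico/apiconfigv1.cfg` and `print_apiconfig_v3 > /etc/calico/apiconfigv3.cfg`. Both files point at the same etcd cluster because etcd v2 and v3 data live side by side in one cluster.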

Test the migration with dry-run

[root@k8s-1 calico-upgrade]# calico-upgrade dry-run --output-dir=tmp --apiconfigv1 /etc/calico/apiconfigv1.cfg --apiconfigv3 /etc/calico/apiconfigv3.cfg

Run the data migration

[root@k8s-1 calico-upgrade]# calico-upgrade start --ignore-v3-data --apiconfigv1 /etc/calico/apiconfigv1.cfg --apiconfigv3 /etc/calico/apiconfigv3.cfg
Preparing reports directory
 * creating report directory if it does not exist
 * validating permissions and removing old reports
Checking Calico version is suitable for migration
 * determined Calico version of: v2.6.12
 * the v1 API data can be migrated to the v3 API
Validating conversion of v1 data to v3
 * handling FelixConfiguration (global) resource
 * handling ClusterInformation (global) resource
 * handling FelixConfiguration (per-node) resources
 * handling BGPConfiguration (global) resource
 * handling Node resources
 * handling BGPPeer (global) resources
 * handling BGPPeer (node) resources
 * handling HostEndpoint resources
 * handling IPPool resources
 * handling GlobalNetworkPolicy resources
 * handling Profile resources
 * handling WorkloadEndpoint resources
 * data conversion successful
Data conversion validated successfully
Validating the v3 datastore
 * the v3 datastore is not empty

-------------------------------------------------------------------------------

Successfully validated v1 to v3 conversion.

You are about to start the migration of Calico v1 data format to Calico v3 data
format. During this time and until the upgrade is completed Calico networking
will be paused - which means no new Calico networked endpoints can be created.
No Calico configuration should be modified using calicoctl during this time.

Type "yes" to proceed (any other input cancels): yes
Pausing Calico networking
 * successfully paused Calico networking in the v1 configuration
Calico networking is now paused - waiting for 15s
Querying current v1 snapshot and converting to v3
 * handling FelixConfiguration (global) resource
 * handling ClusterInformation (global) resource
 * handling FelixConfiguration (per-node) resources
 * handling BGPConfiguration (global) resource
 * handling Node resources
 * handling BGPPeer (global) resources
 * handling BGPPeer (node) resources
 * handling HostEndpoint resources
 * handling IPPool resources
 * handling GlobalNetworkPolicy resources
 * handling Profile resources
 * handling WorkloadEndpoint resources
 * data converted successfully
Storing v3 data
 * Storing resources in v3 format
 * success: resources stored in v3 datastore
Migrating IPAM data
 * listing and converting IPAM allocation blocks
 * listing and converting IPAM affinity blocks
 * listing IPAM handles
 * storing IPAM data in v3 format
 * IPAM data migrated successfully
Data migration from v1 to v3 successful
 * check the output for details of the migrated resources
 * continue by upgrading your calico/node versions to Calico v3.x

-------------------------------------------------------------------------------

Successfully migrated Calico v1 data to v3 format.
Follow the detailed upgrade instructions available in the release documentation
to complete the upgrade. This includes:
 * upgrading your calico/node instances and orchestrator plugins (e.g. CNI) to
   the required v3.x release
 * running 'calico-upgrade complete' to complete the upgrade and resume Calico
   networking

See report(s) below for details of the migrated data.
Reports:
- name conversion: /root/calico-upgrade/calico-upgrade-report/convertednames

Download the v3.0 manifests

[root@docker-182 v3.0]# wget https://docs.projectcalico.org/v3.0/getting-started/kubernetes/installation/rbac.yaml

[root@docker-182 v3.0]# wget https://docs.projectcalico.org/v3.0/getting-started/kubernetes/installation/hosted/calico.yaml

For the changes in 3.0, see the 3.0 release notes.

Pre-pull the required images

[root@docker-182 v3.0]# grep image calico.yaml
          image: quay.io/calico/node:v3.0.12
          image: quay.io/calico/cni:v3.0.12
          image: quay.io/calico/kube-controllers:v3.0.12

Run the upgrade

[root@docker-182 v3.0]# k239 apply -f calico.yaml
configmap "calico-config" configured
secret "calico-etcd-secrets" unchanged
daemonset "calico-node" configured
deployment "calico-kube-controllers" configured
serviceaccount "calico-kube-controllers" unchanged
serviceaccount "calico-node" unchanged

This time the pods are restarted by a rolling update. Wait until all of them have been upgraded before continuing.
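
Waiting for the rollout can be scripted with `kubectl rollout status`. This is a minimal sketch with a hypothetical function name; it assumes the resource names from the apply output above and that the DaemonSet uses the `RollingUpdate` strategy, which `rollout status` requires for DaemonSets.

```shell
#!/bin/sh
# wait_for_calico_rollout: block until both Calico workloads finish rolling out.
# Returns non-zero if either rollout fails or kubectl reports an error.
wait_for_calico_rollout() {
    kubectl -n kube-system rollout status daemonset/calico-node &&
        kubectl -n kube-system rollout status deployment/calico-kube-controllers
}
```

Only once this returns successfully should the final `calico-upgrade complete` step be run.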

Run calico-upgrade complete to finish the upgrade

[root@k8s-1 calico-upgrade]# calico-upgrade complete  --apiconfigv1 /etc/calico/apiconfigv1.cfg --apiconfigv3 /etc/calico/apiconfigv3.cfg

You are about to complete the upgrade process to Calico v3. At this point, the
v1 format data should have been successfully converted to v3 format, and all
calico/node instances and orchestrator plugins (e.g. CNI) should be running
Calico v3.x.

Type "yes" to proceed (any other input cancels): yes
Completing upgrade
Enabling Calico networking for v3
 * successfully resumed Calico networking in the v3 configuration (updated
   ClusterInformation)
Upgrade completed successfully

-------------------------------------------------------------------------------

Successfully completed the upgrade process.

If the command above is not run, the kubelet reports errors like the following:

E1225 19:56:04.837028    3281 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "demo-deployment-6f4c6779b-b8zqq_default(1dd28cf0-270d-11ea-bd6c-c6a864ab864a)" failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod "demo-deployment-6f4c6779b-b8zqq_default" network: Calico is currently not ready to process requests
E1225 19:56:04.837049    3281 kuberuntime_manager.go:647] createPodSandbox for pod "demo-deployment-6f4c6779b-b8zqq_default(1dd28cf0-270d-11ea-bd6c-c6a864ab864a)" failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod "demo-deployment-6f4c6779b-b8zqq_default" network: Calico is currently not ready to process requests
E1225 19:56:04.837167    3281 pod_workers.go:186] Error syncing pod 1dd28cf0-270d-11ea-bd6c-c6a864ab864a ("demo-deployment-6f4c6779b-b8zqq_default(1dd28cf0-270d-11ea-bd6c-c6a864ab864a)"), skipping: failed to "CreatePodSandbox" for "demo-deployment-6f4c6779b-b8zqq_default(1dd28cf0-270d-11ea-bd6c-c6a864ab864a)" with CreatePodSandboxError: "CreatePodSandbox for pod \"demo-deployment-6f4c6779b-b8zqq_default(1dd28cf0-270d-11ea-bd6c-c6a864ab864a)\" failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod \"demo-deployment-6f4c6779b-b8zqq_default\" network: Calico is currently not ready to process requests"
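
A quick way to tell whether a node is still in this paused state is to count the matching lines in the kubelet log. A minimal sketch follows; the function name is hypothetical and the sample line is abbreviated from the log above.

```shell
#!/bin/sh
# calico_paused_errors: count log lines on stdin reporting that Calico
# networking is still paused (that is, `calico-upgrade complete` was not run).
calico_paused_errors() {
    grep -c 'Calico is currently not ready to process requests'
}

# Abbreviated sample line, followed by an unrelated line.
printf '%s\n' \
    'E1225 19:56:04.837028 kuberuntime_sandbox.go:54] network: Calico is currently not ready to process requests' \
    'I1225 19:56:05.000000 unrelated message' |
    calico_paused_errors
```

On a real node the input could come from the kubelet's log, for example `journalctl -u kubelet` if the kubelet runs as a systemd unit (an assumption about this environment).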

The upgrade to 3.0.12 succeeded.

Upgrading 3.0.12 to 3.11

According to the 3.11 guide "Upgrading Calico on Kubernetes", the upgrade only requires applying the new manifest (this environment does not use Application Layer Policy).

This version of Calico fully supports the Kubernetes API datastore, so when downloading the manifest, make sure it matches the datastore used in your environment.

This environment uses the etcd datastore variant of the manifest.

Download the manifest


[root@docker-182 v3.11]# wget https://docs.projectcalico.org/v3.11/manifests/calico-etcd.yaml

# Adjust the etcd-related settings in the manifest
[root@docker-182 v3.11]# bash -x modify_calico_yaml.sh

Pre-pull the images

[root@docker-182 v3.11]# grep image calico-etcd.yaml
          image: calico/cni:v3.11.1
          image: calico/pod2daemon-flexvol:v3.11.1
          image: calico/node:v3.11.1
          image: calico/kube-controllers:v3.11.1

Apply the new version

[root@docker-182 v3.11]# k239 apply -f calico-etcd.yaml
secret "calico-etcd-secrets" unchanged
configmap "calico-config" configured
clusterrole "calico-kube-controllers" configured
clusterrolebinding "calico-kube-controllers" configured
clusterrole "calico-node" configured
clusterrolebinding "calico-node" configured
daemonset "calico-node" configured
serviceaccount "calico-node" unchanged
deployment "calico-kube-controllers" configured
serviceaccount "calico-kube-controllers" unchanged

Verify the new version

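Looking at the new pods, each one now has only one container. In this version install-cni and flexvol-driver (the latter did not exist in older versions) run as initContainers, so only one container stays resident.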

[root@docker-182 ~]# k239 -n kube-system get pod -o wide |grep calico
calico-kube-controllers-85dc4fd46b-4wnmt   1/1       Running   0          1m        10.111.32.243   k8s-4.geotmt.com
calico-node-4bgkc                          1/1       Running   0          59s       10.111.32.241   k8s-2.geotmt.com
calico-node-5jg2t                          1/1       Running   0          31s       10.111.32.244   k8s-5.geotmt.com
calico-node-9fn6r                          1/1       Running   0          43s       10.111.32.245   k8s-6.geotmt.com
calico-node-9n7dn                          1/1       Running   0          1m        10.111.32.243   k8s-4.geotmt.com
calico-node-fxr46                          1/1       Running   0          1m        10.111.32.239   k8s-1.geotmt.com
calico-node-pgh5c                          1/1       Running   0          1m        10.111.32.242   k8s-3.geotmt.com
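
The split between init containers and resident containers can be read directly from the DaemonSet spec. This is a minimal sketch with a hypothetical function name, using kubectl's JSONPath output:

```shell
#!/bin/sh
# calico_node_containers: print the init container names on the first line
# and the long-running container names on the second line.
calico_node_containers() {
    kubectl -n kube-system get daemonset calico-node \
        -o jsonpath='{.spec.template.spec.initContainers[*].name}{"\n"}{.spec.template.spec.containers[*].name}{"\n"}'
}
```

Based on the description above, the first line should include `install-cni` and `flexvol-driver`, which explains the `1/1` in the READY column.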

Test cross-host pod communication

[root@k8s-1 ~]# kubectl exec -it demo-deployment-6f4c6779b-b8zqq /bin/bash
bash-4.4# ping 10.20.235.12
PING 10.20.235.12 (10.20.235.12): 56 data bytes
64 bytes from 10.20.235.12: seq=0 ttl=62 time=1.232 ms
^C
--- 10.20.235.12 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 1.232/1.232/1.232 ms
bash-4.4# ping 10.20.253.80
PING 10.20.253.80 (10.20.253.80): 56 data bytes
64 bytes from 10.20.253.80: seq=0 ttl=62 time=1.730 ms
64 bytes from 10.20.253.80: seq=1 ttl=62 time=1.385 ms
^C
--- 10.20.253.80 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 1.385/1.557/1.730 ms
bash-4.4# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
4: eth0@if51: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP 
    link/ether fa:d1:55:42:ab:6c brd ff:ff:ff:ff:ff:ff
    inet 10.20.15.163/32 scope global eth0
       valid_lft forever preferred_lft forever

Test address allocation when a pod is recreated: it succeeds

[root@k8s-1 ~]# kubectl delete pod nginx-deployment-7b66d98974-2rh87
pod "nginx-deployment-7b66d98974-2rh87" deleted

[root@k8s-1 ~]# kubectl get pod nginx-deployment-7b66d98974-nd8h7 -o wide 
NAME                                READY     STATUS    RESTARTS   AGE       IP             NODE
nginx-deployment-7b66d98974-nd8h7   1/1       Running   0          1m        10.20.253.86   k8s-4.geotmt.com
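
To confirm that the new address comes from one of the IPAM blocks that existed before the migration, the IP can be checked against a block's CIDR. This is a minimal sketch in plain shell arithmetic; the function names are hypothetical and only IPv4 is handled.

```shell
#!/bin/sh
# ip_to_int A.B.C.D: print the IPv4 address as a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1           # split the address into four positional parameters
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# ip_in_cidr IP NETWORK/PREFIX: succeed when IP lies inside the CIDR block.
ip_in_cidr() {
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    size=$(( 1 << (32 - bits) ))    # number of addresses in the block
    [ $(( ip / size )) -eq $(( net / size )) ]
}

# The recreated nginx pod and one of the blocks listed from etcd v2 earlier.
if ip_in_cidr 10.20.253.86 10.20.253.64/26; then
    echo "10.20.253.86 is inside 10.20.253.64/26"
fi
```

The address of the recreated pod falls inside the block `10.20.253.64/26` listed from etcd v2 at the start, which suggests the IPAM data was carried over by the migration.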

The upgrade from Calico 3.0.12 to 3.11.1 succeeded.
