Preparation
Environment notes
Before starting the k8s HA cluster deployment, prepare the system environment (including operating system compatibility) as described in k8s高可用環境部署系統準備.md.
This highly available cluster largely follows the official kubeadm documentation, which offers two topologies: stacked control plane nodes and external etcd nodes. In this article etcd is co-located with the control plane machines but managed outside kubeadm, so from kubeadm's point of view it is an external etcd cluster. Keepalived + HAProxy provide the highly available load balancer. The full topology is shown below:
Each master node runs six services: keepalived, haproxy, etcd, apiserver, controller-manager, and scheduler. The load balancer cluster and the etcd cluster serve only this kubernetes cluster and are not exposed to anything else. If necessary, the load balancer or etcd can be deployed separately, serving the kubernetes cluster and other systems that need them at the same time, for example with a topology like this:
Note ⚠️: this topology also corresponds to the external etcd node layout.
This article deploys only the master nodes; joining worker nodes with kubeadm is very simple and is not repeated here. Environment inventory:
Server Host IP Hostname Role
k8s-master01 192.168.246.193 master01 master+etcd+keepalived+HaProxy
k8s-master02 192.168.246.194 master02 master+etcd+keepalived+HaProxy
k8s-master03 192.168.246.195 master03 master+etcd+keepalived+HaProxy
Image list:
docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy v1.17.3 ae853e93800d 4 weeks ago 116MB
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager v1.17.3 b0f1517c1f4b 4 weeks ago 161MB
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver v1.17.3 90d27391b780 4 weeks ago 171MB
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler v1.17.3 d109c0821a2b 4 weeks ago 94.4MB
registry.cn-hangzhou.aliyuncs.com/google_containers/coredns 1.6.5 70f311871ae1 4 months ago 41.6MB
calico/node v3.9.2 14a380c92c40 5 months ago 195MB
calico/cni v3.9.2 c0d73dd53e71 5 months ago 160MB
calico/kube-controllers v3.9.2 7f7ed50db9fb 5 months ago 56MB
calico/pod2daemon-flexvol v3.9.2 523f0356e07b 5 months ago 9.78MB
registry.cn-hangzhou.aliyuncs.com/google_containers/pause 3.1 da86e6ba6ca1 2 years ago 742kB
Main software list:
Deployment steps
We use external etcd nodes here; the etcd members run on master01, master02, and master03.
- In recent k8s versions, etcd members can coexist with master nodes on the same servers;
- etcd can be installed three ways: standalone, in Docker, or stacked inside k8s;
- Run an odd number of etcd cluster members.
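The odd-member rule follows from quorum arithmetic: a cluster of N members needs floor(N/2)+1 votes to commit, so an even member count raises the quorum without raising fault tolerance. A quick sketch:

```shell
# Quorum and tolerated failures for various etcd cluster sizes:
# quorum = floor(N/2) + 1, tolerated = N - quorum.
for n in 1 2 3 4 5; do
  quorum=$(( n / 2 + 1 ))
  tolerated=$(( n - quorum ))
  echo "members=$n quorum=$quorum tolerated_failures=$tolerated"
done
```

Both 3 and 4 members tolerate exactly one failure, which is why 3 (as used here) is the usual choice.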
Set up a secure etcd cluster
(1) Download the certificate tooling
#Install the cert-generation tools on all three etcd machines
curl -o /usr/local/bin/cfssl https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
curl -o /usr/local/bin/cfssljson https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
curl -o /usr/local/bin/cfssl-certinfo https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
#Make the cfssl binaries executable
chmod +x /usr/local/bin/cfssl*
(2) Create the CA
#Run the following on the etcd1 machine (master01)
mkdir -p /etc/kubernetes/pki/etcd
cd /etc/kubernetes/pki/etcd
#Create the CA configuration file (ca-config.json)
#You can generate an initial ca-config.json with "cfssl print-defaults config > ca-config.json" and then edit it.
cat >ca-config.json <<EOF
{
"signing": {
"default": {
"expiry": "876000h"
},
"profiles": {
"etcd": {
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
],
"expiry": "876000h"
}
}
}
}
EOF
#Notes on the fields above
"ca-config.json": may define multiple profiles with different expiry times and usage scenarios; a specific profile is selected later when signing a certificate;
"signing": the certificate can be used to sign other certificates; the generated ca.pem carries CA=TRUE;
"server auth": a client may use this CA to verify certificates presented by servers;
"client auth": a server may use this CA to verify certificates presented by clients;
#Create the CA certificate signing request (ca-csr.json)
cat >ca-csr.json <<EOF
{
"CN": "etcd",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "shanghai",
"L": "shanghai",
"O": "etcd",
"OU": "System"
}
]
}
EOF
#Notes on the fields above
"CN": Common Name; etcd extracts this field from the certificate as the requesting User Name, and browsers use it to check whether a site is legitimate;
"O": Organization; etcd extracts this field as the Group the requesting user belongs to;
These two fields matter later once kubernetes RBAC is enabled, since roles such as kubelet and admin are bound by user and group, so the certificates must be configured correctly; this is covered in the kubernetes deployment section.
For etcd itself the two fields carry no special significance; the values above are fine as-is.
#Generate the CA certificate and private key
cfssl gencert -initca ca-csr.json | cfssljson -bare ca
#Generated files
* Three files are produced: ca.csr, ca-key.pem, ca.pem
* ca.pem: the root certificate (public part)
* ca-key.pem: the root certificate's private key
* ca.csr: a certificate signing request, used for cross-signing or re-signing
* ca-config.json: the configuration cfssl references when issuing other certificate types
* ca.pem signs all subsequent certificates, so it must be distributed to every server in the cluster
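As a sanity check you can inspect the generated CA with openssl (cfssl-certinfo -cert ca.pem works too). The snippet below first creates a throwaway self-signed certificate purely for illustration; on the nodes, point the last command at /etc/kubernetes/pki/etcd/ca.pem instead:

```shell
# Demo only: create a stand-in CA cert, then print its subject/issuer/validity.
openssl req -x509 -newkey rsa:2048 -nodes -days 365 -subj "/CN=etcd" \
  -keyout /tmp/demo-ca-key.pem -out /tmp/demo-ca.pem 2>/dev/null
# On a real node: openssl x509 -in /etc/kubernetes/pki/etcd/ca.pem -noout -subject -issuer -dates
openssl x509 -in /tmp/demo-ca.pem -noout -subject -issuer -dates
```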
(3) Create the etcd certificate
#Create the etcd TLS certificate signing request (etcd-csr.json)
cd /etc/kubernetes/pki/etcd
cat > etcd-csr.json <<EOF
{
"CN": "etcd",
"hosts": [
"192.168.246.193",
"192.168.246.194",
"192.168.246.195",
"master01",
"master02",
"master03"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "shanghai",
"L": "shanghai",
"O": "etcd",
"OU": "System"
}
]
}
EOF
#Generate the etcd certificate and private key
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=etcd etcd-csr.json | cfssljson -bare etcd
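Every IP and hostname that peers or clients will use to reach etcd must appear in the "hosts" list above, i.e. in the certificate's SANs. You can verify this with openssl. The stand-in certificate below (one IP SAN, one DNS SAN; requires OpenSSL 1.1.1+ for -addext) exists only for illustration; on the nodes run the last command against /etc/kubernetes/pki/etcd/etcd.pem:

```shell
# Demo only: issue a stand-in cert carrying SANs, then list them.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 -subj "/CN=etcd" \
  -addext "subjectAltName=IP:192.168.246.193,DNS:master01" \
  -keyout /tmp/demo-etcd-key.pem -out /tmp/demo-etcd.pem 2>/dev/null
openssl x509 -in /tmp/demo-etcd.pem -noout -text | grep -A1 "Subject Alternative Name"
```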
(4) Passwordless SSH and certificate distribution
#Run on all three etcd machines
#Set up passwordless SSH between the three machines (run ssh-keygen first if the node has no key yet)
ssh-copy-id root@<etcd1-ip-address>
ssh-copy-id root@<etcd2-ip-address>
ssh-copy-id root@<etcd3-ip-address>
#Run on etcd2 & etcd3
mkdir -p /etc/kubernetes/pki/etcd
cd /etc/kubernetes/pki/etcd
scp [email protected]:/etc/kubernetes/pki/etcd/ca.pem .
scp [email protected]:/etc/kubernetes/pki/etcd/ca-key.pem .
scp [email protected]:/etc/kubernetes/pki/etcd/etcd.pem .
scp [email protected]:/etc/kubernetes/pki/etcd/etcd-key.pem .
scp [email protected]:/etc/kubernetes/pki/etcd/ca-config.json .
(5) Deploy the etcd cluster
#Install the etcd binary on all three machines
mkdir -p /data/sys/var/etcd
chmod -R 777 /data/sys/var/etcd
ln -s /data/sys/var/etcd /var/lib/etcd
export ETCD_VERSION=v3.4.4
curl -sSL https://github.com/coreos/etcd/releases/download/${ETCD_VERSION}/etcd-${ETCD_VERSION}-linux-amd64.tar.gz | tar -xzv --strip-components=1 -C /usr/local/bin/
#Run the configuration steps on all three machines
#Create the etcd environment file
touch /etc/etcd.env
echo "PEER_NAME=master01" >> /etc/etcd.env #master02/master03 use their own hostnames
echo "PRIVATE_IP=192.168.246.193" >> /etc/etcd.env #192.168.246.194/195 on the other two nodes
#Create the systemd unit; the quoted 'EOF' keeps ${...} literal so systemd expands the variables from /etc/etcd.env
cat > /etc/systemd/system/etcd.service << 'EOF'
[Unit]
Description=etcd
Documentation=https://github.com/coreos/etcd
Conflicts=etcd.service
Conflicts=etcd2.service
[Service]
EnvironmentFile=/etc/etcd.env
Type=notify
Restart=always
RestartSec=5s
LimitNOFILE=40000
TimeoutStartSec=0
ExecStart=/usr/local/bin/etcd --name ${PEER_NAME} \
--data-dir /var/lib/etcd \
--listen-client-urls https://${PRIVATE_IP}:2379 \
--advertise-client-urls https://${PRIVATE_IP}:2379 \
--listen-peer-urls https://${PRIVATE_IP}:2380 \
--initial-advertise-peer-urls https://${PRIVATE_IP}:2380 \
--cert-file=/etc/kubernetes/pki/etcd/etcd.pem \
--key-file=/etc/kubernetes/pki/etcd/etcd-key.pem \
--client-cert-auth \
--trusted-ca-file=/etc/kubernetes/pki/etcd/ca.pem \
--peer-cert-file=/etc/kubernetes/pki/etcd/etcd.pem \
--peer-key-file=/etc/kubernetes/pki/etcd/etcd-key.pem \
--peer-client-cert-auth \
--peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.pem \
--initial-cluster <etcd1>=https://<etcd1-ip-address>:2380,<etcd2>=https://<etcd2-ip-address>:2380,<etcd3>=https://<etcd3-ip-address>:2380 \
--initial-cluster-token my-etcd-token \
--initial-cluster-state new
[Install]
WantedBy=multi-user.target
EOF
Notes:
* Replace <etcd1> <etcd2> <etcd3> with each node's hostname
* Replace <etcd1-ip-address> <etcd2-ip-address> <etcd3-ip-address> with each node's peer IP
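Instead of editing the placeholders by hand on each node, one sed pass can fill them in. The sketch below runs against the single --initial-cluster line for demonstration; in practice run the same sed with -i against /etc/systemd/system/etcd.service. Hostnames and IPs follow the inventory above:

```shell
# Substitute the <etcdN> placeholders; sed -i on the unit file does this in place.
line='--initial-cluster <etcd1>=https://<etcd1-ip-address>:2380,<etcd2>=https://<etcd2-ip-address>:2380,<etcd3>=https://<etcd3-ip-address>:2380'
echo "$line" | sed \
  -e 's/<etcd1>/master01/; s/<etcd1-ip-address>/192.168.246.193/' \
  -e 's/<etcd2>/master02/; s/<etcd2-ip-address>/192.168.246.194/' \
  -e 's/<etcd3>/master03/; s/<etcd3-ip-address>/192.168.246.195/'
# → --initial-cluster master01=https://192.168.246.193:2380,master02=https://192.168.246.194:2380,master03=https://192.168.246.195:2380
```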
#Start the etcd cluster
systemctl daemon-reload
systemctl start etcd
systemctl enable etcd
systemctl status etcd -l
#Check the etcd cluster
mkdir /etc/kubernetes/scripts
cd /etc/kubernetes/scripts
cat > etcd.sh << 'EOF'
HOST_1=192.168.246.193
HOST_2=192.168.246.194
HOST_3=192.168.246.195
ENDPOINTS=$HOST_1:2379,$HOST_2:2379,$HOST_3:2379
#Cluster health
etcdctl --endpoints=$ENDPOINTS --cacert=/etc/kubernetes/pki/etcd/ca.pem --cert=/etc/kubernetes/pki/etcd/etcd.pem --key=/etc/kubernetes/pki/etcd/etcd-key.pem endpoint health
#Endpoint status
etcdctl --endpoints=$ENDPOINTS --cacert=/etc/kubernetes/pki/etcd/ca.pem --cert=/etc/kubernetes/pki/etcd/etcd.pem --key=/etc/kubernetes/pki/etcd/etcd-key.pem --write-out=table endpoint status
#Member list
etcdctl --endpoints=$ENDPOINTS --cacert=/etc/kubernetes/pki/etcd/ca.pem --cert=/etc/kubernetes/pki/etcd/etcd.pem --key=/etc/kubernetes/pki/etcd/etcd-key.pem member list -w table
EOF
#Running the script (sh etcd.sh) prints output like the following:
192.168.246.193:2379 is healthy: successfully committed proposal: took = 18.14859ms
192.168.246.194:2379 is healthy: successfully committed proposal: took = 23.323287ms
192.168.246.195:2379 is healthy: successfully committed proposal: took = 26.20336ms
+----------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+----------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| 192.168.246.193:2379 | eff04b7e9f6dffe1 | 3.4.4 | 29 kB | false | false | 15 | 15 | 15 | |
| 192.168.246.194:2379 | 5f2f927b4eb48281 | 3.4.4 | 25 kB | true | false | 15 | 15 | 15 | |
| 192.168.246.195:2379 | 93be7c874982c2c6 | 3.4.4 | 25 kB | false | false | 15 | 15 | 15 | |
+----------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
+------------------+---------+----------+------------------------------+------------------------------+------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS | IS LEARNER |
+------------------+---------+----------+------------------------------+------------------------------+------------+
| 5f2f927b4eb48281 | started | master02 | https://192.168.246.194:2380 | https://192.168.246.194:2379 | false |
| 93be7c874982c2c6 | started | master03 | https://192.168.246.195:2380 | https://192.168.246.195:2379 | false |
| eff04b7e9f6dffe1 | started | master01 | https://192.168.246.193:2380 | https://192.168.246.193:2379 | false |
+------------------+---------+----------+------------------------------+------------------------------+------------+
Deploy the highly available load balancer cluster
Deploy keepalived
Here keepalived provides the VIP (192.168.246.200) for haproxy and arbitrates master/backup among the three haproxy instances, limiting the impact on the service when one haproxy fails. Main steps:
Run all of the following on the three master machines!
(1) Install keepalived
yum install -y keepalived
(2) Configure keepalived
cd /etc/keepalived
mv keepalived.conf keepalived.conf_bak
cat > /etc/keepalived/keepalived.conf << EOF
! Configuration File for keepalived
global_defs {
router_id LVS_DEVEL
}
vrrp_script check_haproxy {
script "killall -0 haproxy"
interval 3
weight -2
fall 10
rise 2
}
vrrp_instance VI_1 {
state MASTER
interface ens33
virtual_router_id 51
priority 250 #keep priorities distinct: master01 is 250, master02 is 200, master03 is 150
advert_int 1
authentication {
auth_type PASS
auth_pass 35f18af7190d51c9f7f78f37300a0cbd
}
virtual_ipaddress {
192.168.246.200
}
track_script {
check_haproxy
}
}
EOF
#Notes on the configuration above
* Remember to adjust priority on each node
* killall -0 checks by process name whether the process is alive
* The master01 node is MASTER; the other nodes are BACKUP
* Node priorities here step down by 50; the gap is not mandatory, valid VRRP priorities are 1-254, and the highest value wins
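Since only priority (and state) differ between the three nodes, those values can be derived from the hostname so a single provisioning script serves all masters. A sketch with the hostname hard-coded for determinism; on a real node use host=$(hostname -s):

```shell
# Map hostname -> VRRP state/priority, per the scheme above (steps of 50).
host=master02   # on a real node: host=$(hostname -s)
case "$host" in
  master01) state=MASTER; priority=250 ;;
  master02) state=BACKUP; priority=200 ;;
  master03) state=BACKUP; priority=150 ;;
  *) echo "unknown host: $host" >&2; exit 1 ;;
esac
echo "state=$state priority=$priority"
# → state=BACKUP priority=200
```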
(3) Start and verify the service
systemctl enable keepalived.service
systemctl start keepalived.service
systemctl status keepalived.service
#On the master01 node, check the IP addresses
ip address show ens33
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:66:1a:10 brd ff:ff:ff:ff:ff:ff
inet 192.168.246.193/24 brd 192.168.246.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet 192.168.246.200/32 scope global ens33
valid_lft forever preferred_lft forever
Deploy haproxy
Here haproxy reverse-proxies the apiserver, forwarding requests round-robin to every master node. Compared with a keepalived-only active/standby setup, where a single master carries all the traffic, this is better balanced and more robust.
Run the following steps on all three machines:
(1) Install HAProxy
yum install -y haproxy
(2) Configure haproxy
cd /etc/haproxy
mv haproxy.cfg haproxy.cfg_bak
cat > /etc/haproxy/haproxy.cfg << EOF
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
# to have these messages end up in /var/log/haproxy.log you will
# need to:
#
# 1) configure syslog to accept network log events. This is done
# by adding the '-r' option to the SYSLOGD_OPTIONS in
# /etc/sysconfig/syslog
#
# 2) configure local2 events to go to the /var/log/haproxy.log
# file. A line like the following can be added to
# /etc/sysconfig/syslog
#
# local2.* /var/log/haproxy.log
#
log 127.0.0.1 local2
chroot /var/lib/haproxy
pidfile /var/run/haproxy.pid
maxconn 4000
user haproxy
group haproxy
daemon
# turn on stats unix socket
stats socket /var/lib/haproxy/stats
#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
mode http
log global
option httplog
option dontlognull
option http-server-close
option forwardfor except 127.0.0.0/8
option redispatch
retries 3
timeout http-request 10s
timeout queue 1m
timeout connect 10s
timeout client 1m
timeout server 1m
timeout http-keep-alive 10s
timeout check 10s
maxconn 3000
#---------------------------------------------------------------------
# kubernetes apiserver frontend which proxys to the backends
#---------------------------------------------------------------------
frontend kubernetes-apiserver
mode tcp
bind *:16443
option tcplog
default_backend kubernetes-apiserver
#---------------------------------------------------------------------
# round robin balancing between the various backends
#---------------------------------------------------------------------
backend kubernetes-apiserver
mode tcp
balance roundrobin
server master01 192.168.246.193:6443 check #adjust hostnames and IPs to your environment
server master02 192.168.246.194:6443 check
server master03 192.168.246.195:6443 check
#---------------------------------------------------------------------
# collection haproxy statistics message
#---------------------------------------------------------------------
listen stats
bind *:1080
stats auth admin:awesomePassword
stats refresh 5s
stats realm HAProxy\ Statistics
stats uri /admin?stats
EOF
#Notes:
* The configuration is identical on all master nodes
* For haproxy logging setup details, see [HAProxy installation and common commands](https://blog.51cto.com/wutengfei/2467351)
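Because the three haproxy.cfg copies must stay identical, the backend server lines can be generated from the inventory rather than typed per node. A small sketch:

```shell
# Emit one "server" line per master, matching the backend block above.
i=1
for ip in 192.168.246.193 192.168.246.194 192.168.246.195; do
  printf 'server master%02d %s:6443 check\n' "$i" "$ip"
  i=$(( i + 1 ))
done
```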
(3) Start and verify the service
systemctl enable haproxy.service
systemctl start haproxy.service
systemctl status haproxy.service
ss -lnt | grep -E "16443|1080"
LISTEN 0 128 *:1080 *:*
LISTEN 0 128 *:16443 *:*
Install kubeadm, kubectl, and kubelet
Run the following on all three machines:
(1) Configure the kubernetes yum repository
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
yum clean all
yum makecache fast
(2) Install kubelet, kubeadm, and kubectl
yum install -y kubelet-1.17.3 kubeadm-1.17.3 kubectl-1.17.3 --disableexcludes=kubernetes #pin to the v1.17.3 used throughout this article
(3) Start kubelet.service and enable it at boot
systemctl enable kubelet.service
systemctl start kubelet.service
systemctl status kubelet.service
Note: if kubelet is not actually running at this point, it can be ignored for now. To see why, run journalctl -xefu kubelet; it reports "failed to load kubelet config file /var/lib/kubelet/config.yaml, error: open /var/lib/kubelet/config.yaml: no such file or directory". kubeadm init creates /var/lib/kubelet/config.yaml automatically during initialization, after which kubelet starts normally.
(4) Add the following entries to the hosts file
cat >> /etc/hosts << EOF
192.168.246.200 cluster.kube.com
192.168.246.193 master01
192.168.246.194 master02
192.168.246.195 master03
EOF
Initialize the first master node
Run the following on the master01 node:
(1) Write the kubeadm configuration file
mkdir -p /etc/kubernetes/my-conf
cd /etc/kubernetes/my-conf
cat >config.yaml <<EOF
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: 1.17.3
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
etcd:
external:
endpoints:
- https://192.168.246.193:2379
- https://192.168.246.194:2379
- https://192.168.246.195:2379
caFile: /etc/kubernetes/pki/etcd/ca.pem
certFile: /etc/kubernetes/pki/etcd/etcd.pem
keyFile: /etc/kubernetes/pki/etcd/etcd-key.pem
networking:
podSubnet: 10.244.0.0/16
apiServer:
certSANs:
- "cluster.kube.com"
controlPlaneEndpoint: "cluster.kube.com:16443"
EOF
(2) Initialize the cluster and keep the join commands it returns
kubeadm init --config=config.yaml
Note the information printed after a successful initialization, for example:
You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:
kubeadm join cluster.kube.com:16443 --token 6uxfh3.urwz7noyhnvee4iz \
--discovery-token-ca-cert-hash sha256:6a06960763e5b2a7689b1f936a438e4fb369c0eab1d1f49964a166ec02966c57 \
--control-plane
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join cluster.kube.com:16443 --token 6uxfh3.urwz7noyhnvee4iz \
--discovery-token-ca-cert-hash sha256:6a06960763e5b2a7689b1f936a438e4fb369c0eab1d1f49964a166ec02966c57
(3) Grant the current Linux user kubectl access
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
(4) Check the node
kubectl get node
NAME STATUS ROLES AGE VERSION
master01 NotReady master 9m51s v1.17.3
(5) Check cluster component status
kubectl get cs
NAME STATUS MESSAGE ERROR
etcd-2 Healthy {"health":"true"}
etcd-1 Healthy {"health":"true"}
scheduler Healthy ok
controller-manager Healthy ok
etcd-0 Healthy {"health":"true"}
(6) Watch the pods in the kube-system namespace
kubectl get pod -n kube-system -o wide -w #or: watch kubectl get pod -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
coredns-7f9c544f75-wsjxq 0/1 Pending 0 11m <none> <none> <none> <none>
coredns-7f9c544f75-xvhjc 0/1 Pending 0 11m <none> <none> <none> <none>
kube-apiserver-master01 1/1 Running 0 11m 192.168.246.193 master01 <none> <none>
kube-controller-manager-master01 1/1 Running 0 11m 192.168.246.193 master01 <none> <none>
kube-proxy-fr84w 1/1 Running 0 11m 192.168.246.193 master01 <none> <none>
kube-scheduler-master01 1/1 Running 0 11m 192.168.246.193 master01 <none> <none>
(7) View the kubeadm configuration stored in the cluster
kubeadm config view
#Output:
apiServer:
certSANs:
- cluster.kube.com
extraArgs:
authorization-mode: Node,RBAC
timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: cluster.kube.com:16443
controllerManager: {}
dns:
type: CoreDNS
etcd:
external:
caFile: /etc/kubernetes/pki/etcd/ca.pem
certFile: /etc/kubernetes/pki/etcd/etcd.pem
endpoints:
- https://192.168.246.193:2379
- https://192.168.246.194:2379
- https://192.168.246.195:2379
keyFile: /etc/kubernetes/pki/etcd/etcd-key.pem
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.17.3
networking:
dnsDomain: cluster.local
podSubnet: 10.244.0.0/16
serviceSubnet: 10.96.0.0/12
scheduler: {}
Run on master02 & master03
scp [email protected]:/etc/kubernetes/pki/ca.crt /etc/kubernetes/pki/
scp [email protected]:/etc/kubernetes/pki/ca.key /etc/kubernetes/pki/
scp [email protected]:/etc/kubernetes/pki/sa.key /etc/kubernetes/pki/
scp [email protected]:/etc/kubernetes/pki/sa.pub /etc/kubernetes/pki/
scp [email protected]:/etc/kubernetes/pki/front-proxy-ca.crt /etc/kubernetes/pki/ #kubeadm join --control-plane also expects the front-proxy CA pair
scp [email protected]:/etc/kubernetes/pki/front-proxy-ca.key /etc/kubernetes/pki/
kubeadm join cluster.kube.com:16443 --token 6uxfh3.urwz7noyhnvee4iz \
--discovery-token-ca-cert-hash sha256:6a06960763e5b2a7689b1f936a438e4fb369c0eab1d1f49964a166ec02966c57 \
--control-plane
Deploy the network plugin
Deploy the plugin from the master01 node. There are two network plugins, calico and flannel; choose either one. This article uses calico:
(1) Using the calico network plugin
mkdir -p /etc/kubernetes/manifests/my.conf/network-utils
cd /etc/kubernetes/manifests/my.conf/network-utils
curl -o rbac-kdd.yaml https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/rbac-kdd.yaml
curl -o calico-3.9.2.yaml https://kuboard.cn/install-script/calico/calico-3.9.2.yaml
kubectl apply -f rbac-kdd.yaml
kubeadm config view #confirm the podSubnet
export POD_SUBNET=10.244.0.0/16
sed -i "s#192\.168\.0\.0/16#${POD_SUBNET}#" calico-3.9.2.yaml
kubectl apply -f calico-3.9.2.yaml
(2) Using the flannel network plugin
curl -o /etc/kubernetes/manifests/my.conf/network-utils/kube-flannel.yml https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
kubectl apply -f /etc/kubernetes/manifests/my.conf/network-utils/kube-flannel.yml
#Check the cluster node status
kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master01 Ready master 16h v1.17.3 192.168.246.193 <none> CentOS Linux 7 (Core) 3.10.0-1062.el7.x86_64 docker://19.3.8
master02 Ready master 16h v1.17.3 192.168.246.194 <none> CentOS Linux 7 (Core) 3.10.0-1062.el7.x86_64 docker://19.3.8
master03 Ready master 16h v1.17.3 192.168.246.195 <none> CentOS Linux 7 (Core) 3.10.0-1062.el7.x86_64 docker://19.3.8
#Check the pod status
kubectl get pod -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
calico-kube-controllers-dc6cb64cb-492mv 1/1 Running 2 166m 10.244.241.70 master01 <none> <none>
calico-node-74zf2 1/1 Running 2 166m 192.168.246.193 master01 <none> <none>
calico-node-krmmj 1/1 Running 3 166m 192.168.246.195 master03 <none> <none>
calico-node-l5k2p 1/1 Running 2 166m 192.168.246.194 master02 <none> <none>
coredns-7f9c544f75-9h92k 1/1 Running 2 16h 10.244.241.71 master01 <none> <none>
coredns-7f9c544f75-rn4fj 1/1 Running 2 16h 10.244.241.72 master01 <none> <none>
kube-apiserver-master01 1/1 Running 34 16h 192.168.246.193 master01 <none> <none>
kube-apiserver-master02 1/1 Running 14 15h 192.168.246.194 master02 <none> <none>
kube-apiserver-master03 1/1 Running 15 15h 192.168.246.195 master03 <none> <none>
kube-controller-manager-master01 1/1 Running 36 16h 192.168.246.193 master01 <none> <none>
kube-controller-manager-master02 1/1 Running 22 15h 192.168.246.194 master02 <none> <none>
kube-controller-manager-master03 1/1 Running 16 15h 192.168.246.195 master03 <none> <none>
kube-proxy-5xmv8 1/1 Running 7 15h 192.168.246.195 master03 <none> <none>
kube-proxy-pfslb 1/1 Running 5 16h 192.168.246.194 master02 <none> <none>
kube-proxy-pxdsn 1/1 Running 4 16h 192.168.246.193 master01 <none> <none>
kube-scheduler-master01 1/1 Running 41 16h 192.168.246.193 master01 <none> <none>
kube-scheduler-master02 1/1 Running 17 15h 192.168.246.194 master02 <none> <none>
kube-scheduler-master03 1/1 Running 15 15h 192.168.246.195 master03 <none> <none>
At this point the highly available k8s master tier is fully deployed.
Troubleshooting
Using calico in the k8s cluster you may hit an error such as: "Readiness probe failed: calico node is not ready: BIRD is not ready: BGP not established with ..."
If you see that error, here is one way to handle it:
#Download and install the calicoctl tool; match it to your calico version
cd /usr/local/bin
curl -O -L https://github.com/projectcalico/calicoctl/releases/download/v3.9.2/calicoctl
chmod +x calicoctl
#Create the configuration file /etc/calico/calicoctl.cfg
mkdir /etc/calico
Because this calico deployment stores its data through the Kubernetes API datastore, calicoctl has to be configured with a kubeconfig so it can read calico's configuration.
cat > /etc/calico/calicoctl.cfg << EOF
apiVersion: projectcalico.org/v3
kind: CalicoAPIConfig
metadata:
spec:
datastoreType: "kubernetes"
kubeconfig: "/root/.kube/config"
EOF
If your calico uses a direct etcd datastore instead, here is a reference template:
apiVersion: projectcalico.org/v3
kind: CalicoAPIConfig
metadata:
spec:
datastoreType: "etcd"
etcdEndpoints: "https://192.168.246.193:2379,https://192.168.246.194:2379,https://192.168.246.195:2379"
etcdKeyFile: "/etc/kubernetes/pki/etcd/etcd-key.pem"
etcdCertFile: "/etc/kubernetes/pki/etcd/etcd.pem"
etcdCACertFile: "/etc/kubernetes/pki/etcd/ca.pem"
#Common calicoctl commands
(1) calicoctl get node #list the calico network nodes
NAME
master01
master02
master03
(2) calicoctl node status #the node's BGP status
Calico process is running.
IPv4 BGP status
+-----------------+-------------------+-------+----------+-------------+
| PEER ADDRESS | PEER TYPE | STATE | SINCE | INFO |
+-----------------+-------------------+-------+----------+-------------+
| 192.168.246.193 | node-to-node mesh | up | 05:08:38 | Established |
| 192.168.246.194 | node-to-node mesh | up | 05:08:38 | Established |
+-----------------+-------------------+-------+----------+-------------+
IPv6 BGP status
No IPv6 peers found.
#Use calicoctl to inspect the problem node
calicoctl node status #status on a faulty node
Calico process is running.
IPv4 BGP status
+-----------------+-------------------+-------+----------+---------+
| PEER ADDRESS | PEER TYPE | STATE | SINCE | INFO |
+-----------------+-------------------+-------+----------+---------+
| 192.168.246.194 | node-to-node mesh | start | 03:29:06 | Passive |
+-----------------+-------------------+-------+----------+---------+
IPv6 BGP status
No IPv6 peers found.
#Use calicoctl to correct the calico configuration
(1) Inspect the problem node's yaml
calicoctl get node master03 -o yaml
(2) Export it, fix the IP, and re-apply
calicoctl get node master03 -o yaml > calicomaster03.yaml
vim calicomaster03.yaml #correct the wrong IP
calicoctl apply -f calicomaster03.yaml
kubectl get po -n kube-system
All the calico-node pods should now start normally.
Forgotten join token
The join command returned when the cluster was created on master01 is what you paste on the other nodes; if it is lost, retrieve a fresh one from master01:
kubeadm token create --print-join-command
Note: by default, a token created with kubeadm token create expires after 24 hours. Run kubeadm token create --ttl 0 to generate a token that never expires; see the kubeadm token reference documentation.
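The --discovery-token-ca-cert-hash value can also be recomputed from the cluster CA at any time; this pipeline is the one given in the kubeadm reference. The first command below creates a throwaway stand-in for /etc/kubernetes/pki/ca.crt so the pipeline can be shown end to end; on master01, skip it and point at the real file:

```shell
# Demo only: stand-in CA; on master01 use /etc/kubernetes/pki/ca.crt instead.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 -subj "/CN=kubernetes" \
  -keyout /tmp/demo-k8s-ca.key -out /tmp/demo-k8s-ca.crt 2>/dev/null
# sha256 of the DER-encoded CA public key = the value after "sha256:" in the join command
openssl x509 -pubkey -in /tmp/demo-k8s-ca.crt -noout \
  | openssl rsa -pubin -outform der 2>/dev/null \
  | openssl dgst -sha256 -hex | sed 's/^.* //'
```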