1 升级思路
备份集群 yaml 文件,etcd 数据。
先升级kubeadm版本
升级第一个主控制平面节点Master组件
升级第一个主控制平面节点上的Kubelet和kubectl
升级其他控制平面节点
升级Node节点
验证集群
2 注意事项
不可以进行跨大版本升级的,只能一个版本一个版本的升级。
升级节点前建议将节点标记维护并驱逐 pod。请参考:安全地清空一个节点 | Kubernetes
必须禁用交换分区。
3 升级步骤
准备工作
在master节点上查看此时的kubernetes的版本
[root@master1 ~]# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.6", GitCommit:"741c8db18a52787d734cbe4795f0b4ad860906d6", GitTreeState:"clean", BuildDate:"2023-09-13T09:19:54Z", GoVersion:"go1.20.8", Compiler:"gc", Platform:"linux/amd64"}查询最新版本号
[root@master1 ~]# dnf list kubeadm --showduplicates | sort -r
升级操作
master和node节点执行升级命令
# rhel机器执行
[root@master1 ~]# dnf update -y kubelet-1.21.9 kubeadm-1.21.9 kubectl-1.21.9
# ubuntu机器执行
[root@master1 ~]# apt-get upgrade -y kubelet=1.21.9-1.1 kubeadm=1.21.9-1.1 kubectl=1.21.9-1.1升级到指定版本(所有 master节点执行)
kubeadm upgrade apply v1.21.9

master和node重启kubelet
systemctl daemon-reload
systemctl restart kubelet验证
查看node信息

4 升级常见问题
1.19->1.20
现象:从1.19.X升级到1.20.X时,kube-proxy无法启动,日志报错如下
F1029 01:23:10.186401 1 server.go:488] failed complete: unrecognized feature gate: SupportIPVSProxyMode原因:
kube-proxy无法识别SupportIPVSProxyMode这个字段,于是访问官方查看最新版本ipvs开启的正确配置,通过https://github.com/kubernetes/kubernetes/blob/master/pkg/proxy/ipvs/README.md可以看到官方说明:
kubeProxy:
config:
mode: ipvs处理办法:
# 查看kube-proxy配置文件名称
[root@k8s-master ~]# kubectl get cm -n kube-system
NAME DATA AGE
coredns 1 36d
extension-apiserver-authentication 6 36d
kube-proxy 2 36d
kube-root-ca.crt 1 11h
kubeadm-config 2 36d
kubelet-config-1.19 1 36d
kubelet-config-1.20 1 11h
# 编辑configmap
[root@k8s-master ~]# kubectl edit cm -n kube-system kube-proxy
enableProfiling: false # 注释下面两行内容
#featureGates:
# SupportIPVSProxyMode: true
healthzBindAddress: ""1.23->1.24
升级后节点NotReady,查看kubelet日志有以下报错:
kubelet[6970]: Error: failed to parse kubelet flag: unknown flag: --network-plugin
修改如下配置文件,去掉–network-plugin
vim /var/lib/kubelet/kubeadm-flags.env
KUBELET_KUBEADM_ARGS="--pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.6"
然后重启kubelet
systemctl restart kubelet
1.26->1.27
升级后节点NotReady,查看kubelet日志有以下报错:
kubelet[13394]: E0806 18:29:05.809154 13394 run.go:74] "command failed" err="failed to parse kubelet flag: unknown flag: --container-runtime"
修改如下配置文件,去掉–container-runtime
vim /etc/sysconfig/kubelet
KUBELET_EXTRA_ARGS="--container-runtime-endpoint=unix:///run/containerd/containerd.sock"
然后重启kubelet
systemctl restart kubelet
1.27->1.28
执行kubeadm upgrade apply 1.28.2命令时有以下报错
[preflight] Some fatal errors occurred:
[ERROR ImagePull]: failed to pull image registry.aliyuncs.com/google_containers/kube-apiserver:v1.28.2: output: time="2024-08-06T21:12:52+08:00" level=fatal msg="validate service connection: validate CRI v1 image API for endpoint \"unix:///var/run/dockershim.sock\": rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial unix /var/run/dockershim.sock: connect: no such file or directory\""
修改节点runtime配置
kubectl edit nodes k8s-master
apiVersion: v1
kind: Node
metadata:
annotations:
flannel.alpha.coreos.com/backend-data: '{"VNI":1,"VtepMAC":"1a:62:68:ab:7b:7e"}'
flannel.alpha.coreos.com/backend-type: vxlan
flannel.alpha.coreos.com/kube-subnet-manager: "true"
flannel.alpha.coreos.com/public-ip: 192.168.10.10
kubeadm.alpha.kubernetes.io/cri-socket: unix:///var/run/containerd/containerd.sock # 修改为containerd.sock
5 修改runtime为containerd
Kubernetes 从 v1.24 开始全面弃用 Docker,并推荐用户切换到基于容器运行时接口(CRI)的容器引擎,如 containerd、cri-o 等。如果想要升级到1.24以上的版本,需要先将runtime从docker替换为container。
维护节点
首先将需要修改的节点设置成不可调度
[root@k8s-master ~]# kubectl cordon k8s-work1
node/k8s-work2 cordoned
[root@k8s-master ~]# kubectl get node
NAME STATUS ROLES AGE VERSION
k8s-master Ready control-plane,master 40d v1.23.17
k8s-work1 Ready,SchedulingDisabled <none> 40d v1.23.17
k8s-work2 Ready <none> 40d v1.23.17
驱逐该节点上除了daemonset的pod
[root@k8s-master ~]# kubectl drain k8s-work1 --ignore-daemonsets --delete-emptydir-data
切换Runtime
停止docker、containerd 和 kubelet:
[root@k8s-work1 ~]# systemctl stop kubelet
[root@k8s-work1 ~]# systemctl stop docker
Warning: Stopping docker.service, but it can still be activated by:
docker.socket
[root@k8s-work1 ~]# systemctl stop containerd
卸载docker
[root@k8s-work1 ~]# dnf -y remove docker-ce docker-ce-cli containerd.io
安装并配置containerd
[root@k8s-work1 ~]# containerd -v
containerd containerd.io 1.6.4 212e8b6fa2f44b9c21b2798135fc6fb7c53efc16
配置kubelet
修改 kubelet 配置,将容器运行时配置为 containerd,编辑/etc/sysconfig/kubelet 文件,在该文件中可以添加kubelet 启动参数:
[root@k8s-work1 ~]# vim /etc/sysconfig/kubelet
KUBELET_EXTRA_ARGS="--container-runtime=remote --container-runtime-endpoint=unix:///run/containerd/containerd.sock"–container-runtime :指定使用的容器运行时的,可选值为 docker 或者 remote,默认是 docker,除 docker 之外的容器运行时都应该指定为 remote。
–container-runtime-endpoint:是用来指定远程的运行时服务的 endpiont 地址的,在 Linux 系统中一般都是使用 unix 套接字的形式,unix:///run/containerd/containerd.sock。
配置完成后重启 containerd 和 kubelet 即可:
[root@k8s-work1 ~]# systemctl daemon-reload
[root@k8s-work1 ~]# systemctl restart containerd
[root@k8s-work1 ~]# systemctl restart kubelet查看节点信息
[root@k8s-master ~]# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k8s-master Ready control-plane,master 40d v1.23.17 192.168.1.80 <none> Rocky Linux 8.10 (Green Obsidian) 4.18.0-553.5.1.el8_10.x86_64 docker://20.10.7
k8s-work1 Ready,SchedulingDisabled <none> 40d v1.23.17 192.168.1.81 <none> Rocky Linux 8.10 (Green Obsidian) 4.18.0-553.5.1.el8_10.x86_64 containerd://1.6.4
k8s-work2 Ready <none> 40d v1.23.17 192.168.1.82 <none> Rocky Linux 8.10 (Green Obsidian) 4.18.0-553.5.1.el8_10.x86_64 docker://20.10.7
允许调度
[root@c2-master ~]# kubectl uncordon k8s-work1
node/k8s-work1 uncordoned
评论区