After upgrading to kubelet v1.24 the cluster does not start

After running apt update && apt upgrade, kubelet no longer starts. In journalctl it prints the kubelet help text and complains about an unsupported --network-plugin flag.

It looks like the upgrade to kubelet 1.24.0 broke the cluster.

root@netikras-hub:/etc/systemd/system/kubelet.service.d# kubelet --version
Kubernetes v1.24.0
root@netikras-hub:/etc/systemd/system/kubelet.service.d# kubelet --help | grep network-plugin
root@netikras-hub:/etc/systemd/system/kubelet.service.d# 
root@netikras-hub:/etc/systemd/system/kubelet.service.d# kubelet --network-plugin=cni 2>&1 | head -3
Error: failed to parse kubelet flag: unknown flag: --network-plugin
Usage:
  kubelet [flags]

whereas the flag still exists on 1.20.4:

[root@CentOS-83-64-minimal ~]# kubelet --version
Kubernetes v1.20.4
[root@CentOS-83-64-minimal ~]# kubelet --help | grep network-plugin
      --network-plugin string                                    The name of the network plugin to be invoked for various events in kubelet/pod lifecycle. This docker-specific flag only works when container-runtime is set to docker.
      --network-plugin-mtu int32                                 The MTU to be passed to the network plugin, to override the default. Set to 0 to use the default 1460 MTU. This docker-specific flag only works when container-runtime is set to docker.
[root@CentOS-83-64-minimal ~]# 

I found that the v1.24 documentation still refers to the --network-plugin flag and raised a GH issue to get the docs updated. However, the folks there are keen on updating the docs only, not on guiding me through my cluster recovery options.

What is the easiest way to recover? I'm using flannel as my CNI.

My understanding is that after the dockershim removal, all container runtimes are CNI-aware so I would expect them to use the standard /etc/cni/net.d/ mechanism for identifying the CNI plugin without needing the previous hints.

If you have a correct /etc/cni/net.d/nn-provider.conflist and the plugin binaries in /opt/cni/bin, you can just remove the offending kubelet flags and it 'should just work'.
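Before removing the flags, it may help to confirm that both halves of that layout are actually in place. The following is only a minimal sketch: the function name check_cni_layout is made up for illustration, and treating "at least one .conf or .conflist file" plus "at least one executable file" as a usable setup is an assumption, not a rule taken from the kubelet.

```shell
#!/bin/sh
# Hypothetical helper (not part of kubelet or flannel): report whether the
# CNI config directory and the CNI plugin directory look populated.
# Defaults are the standard locations mentioned above.
check_cni_layout() {
    cni_conf_dir=${1:-/etc/cni/net.d}
    cni_bin_dir=${2:-/opt/cni/bin}
    cni_status=0

    # Assumption: one *.conf or *.conflist file is enough to count as configured.
    if find "$cni_conf_dir" -maxdepth 1 -type f \( -name '*.conf' -o -name '*.conflist' \) 2>/dev/null | grep -q .; then
        echo "ok: CNI config found in $cni_conf_dir"
    else
        echo "missing: no *.conf or *.conflist in $cni_conf_dir"
        cni_status=1
    fi

    # Assumption: any executable regular file counts as a plugin binary.
    if find "$cni_bin_dir" -maxdepth 1 -type f -perm -u+x 2>/dev/null | grep -q .; then
        echo "ok: plugin binaries found in $cni_bin_dir"
    else
        echo "missing: no executable files in $cni_bin_dir"
        cni_status=1
    fi

    return "$cni_status"
}

# Usage:
#   check_cni_layout                                # standard paths
#   check_cni_layout /etc/cni/net.d /opt/cni/bin    # explicit paths
```

A non-zero return only says that something is missing from those directories; it does not validate the contents of the config file.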

If this doesn't work, I would suggest having a look at your flannel daemonset manifest to see where it expects the CNI bindir to be.

As a workaround until this is fixed in kubeadm, you can remove the network-related flags by running the following command after kubeadm:

echo "KUBELET_NETWORK_ARGS=''" | sudo tee --append /var/lib/kubelet/kubeadm-flags.env
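If the flag is written directly into the flags file rather than passed through KUBELET_NETWORK_ARGS, an alternative is to edit it out in place. This is a rough sketch, not a tested recovery procedure: strip_network_plugin_flags is a hypothetical helper name, and it assumes the flags appear in --flag=value form. It removes only --network-plugin and --network-plugin-mtu, the two flags shown in the 1.20.4 help output above, and keeps a .bak copy of the file.

```shell
#!/bin/sh
# Hypothetical helper: delete --network-plugin=... and --network-plugin-mtu=...
# from a kubelet flags file, leaving every other flag untouched.
# Assumption: flags are written as --flag=value (not "--flag value").
# The original file is preserved next to it with a .bak suffix.
strip_network_plugin_flags() {
    flags_file=$1
    sed -E -i.bak 's/--network-plugin(-mtu)?=[^[:space:]"]*[[:space:]]*//g' "$flags_file"
}

# Usage (as root), followed by a kubelet restart:
#   strip_network_plugin_flags /var/lib/kubelet/kubeadm-flags.env
#   systemctl daemon-reload && systemctl restart kubelet
```

The same function can be pointed at a systemd drop-in under /etc/systemd/system/kubelet.service.d if that is where the flag turns out to be set on your machine.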

Since Docker support (dockershim) was removed, kubelet needs a different container runtime, such as containerd. To fix it:

  1. Overwrite /var/lib/kubelet/kubeadm-flags.env so that kubelet talks to containerd:

cat <<EOF | sudo tee /var/lib/kubelet/kubeadm-flags.env
KUBELET_KUBEADM_ARGS="--container-runtime=remote --container-runtime-endpoint=unix:///var/run/containerd/containerd.sock --pod-infra-container-image=k8s.gcr.io/pause:3.7"
EOF

  2. Reload systemd and restart kubelet:

sudo systemctl daemon-reload && sudo systemctl restart kubelet

Note: you can disable Docker. If you applied any custom configuration to Docker, you should also apply the equivalent configuration to containerd in /etc/containerd/config.toml.
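One thing worth checking in that file is whether the CRI plugin is enabled. Some packaged containerd configs list "cri" under disabled_plugins, and in that case kubelet cannot use the socket. The check below is a sketch under assumptions: cri_plugin_disabled is a hypothetical helper name, and it only recognises a disabled_plugins array written on a single line.

```shell
#!/bin/sh
# Hypothetical helper: exit 0 if the containerd config lists "cri" under
# disabled_plugins, exit non-zero otherwise (including when the file is missing).
# Assumption: the disabled_plugins array is written on a single line.
cri_plugin_disabled() {
    containerd_conf=${1:-/etc/containerd/config.toml}
    grep -Eq '^[[:space:]]*disabled_plugins[[:space:]]*=.*"cri"' "$containerd_conf" 2>/dev/null
}

# Usage:
#   if cri_plugin_disabled; then
#       echo 'CRI is disabled: remove "cri" from disabled_plugins and restart containerd'
#   fi
```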
