Failed to load kubelet config file

Hi folks,

After updating my server, I can't restart Kubernetes.

Feb  6 10:34:26 chgvas99 kubelet: F0206 10:34:26.662744   27634 server.go:189] failed to load Kubelet config file /var/lib/kubelet/config.yaml, error failed to read kubelet config file "/var/lib/kubelet/config.yaml", error: open /var/lib/kubelet/config.yaml: no such file or directory
Feb  6 10:34:26 chgvas99 systemd: kubelet.service: main process exited, code=exited, status=255/n/a
Feb  6 10:34:26 chgvas99 systemd: Unit kubelet.service entered failed state.
Feb  6 10:34:26 chgvas99 systemd: kubelet.service failed.

I checked the directory and indeed there is no config.yaml. I get the same error on my nodes, and I can't restart them either.

Server: 3.10.0-957.5.1.el7.x86_64

Kubernetes: Major:"1", Minor:"13", GitVersion:"v1.13.3" GoVersion:"go1.11.5"

I would recommend running 'kubeadm init' to reinitialise the cluster. Also, please make sure your '/var' directory is not full. See the 'kubeadm init' documentation for more information about the command.
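Before reinitialising anything, the two conditions mentioned here (the config file and the space left under /var) can be checked quickly. The following is a minimal sketch, not part of the original answer: the function names kubelet_config_state and fs_usage_percent are made up for illustration, and the default path is the one from the log in the question.

```shell
# Hypothetical helpers; names are illustrative only.

kubelet_config_state() {
  # $1 = file to inspect; defaults to the path reported in the kubelet log.
  path="${1:-/var/lib/kubelet/config.yaml}"
  if [ ! -e "$path" ]; then
    echo "missing"
  elif [ ! -s "$path" ]; then
    echo "empty"
  else
    echo "present"
  fi
}

fs_usage_percent() {
  # Prints the usage of the filesystem holding $1 as a bare number,
  # taken from the Capacity column of POSIX-format df output.
  df -P "$1" | awk 'NR == 2 { sub("%", "", $5); print $5 }'
}
```

On an affected node, kubelet_config_state with no argument would be expected to print missing, and fs_usage_percent /var shows how close the filesystem is to being full.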

The fact that /var/lib/kubelet/config.yaml is missing is probably related to the worker node not being joined to the cluster.

This might be related to networking issues, but let's go step by step:

1) Create a valid token for the worker node to join the cluster.
Run sudo kubeadm token create --print-join-command --v=5 on the master node and make sure you receive an output command like:

kubeadm join <master-node-ip>:6443 --token aa334.. --discovery-token-ca-cert-hash sha256:..

2) Run the provided command on the worker node.

3) If everything is OK, /var/lib/kubelet/config.yaml should be populated and the output of sudo systemctl status kubelet should look good.

4) If you get an error, try running the same join command with --v=5. You'll probably see some networking issues.

4.A) If you get an error like dial tcp <master-ip>:6443: connect: no route to host, make sure the nodes can communicate with each other: run curl <master-node-ip>:6443 from the worker node, and you'll probably get the same no route error.
Go to the master node and open port 6443 (I'll assume you're working on a secured private network), then try the connection again.

4.B) If opening the port on the master succeeds and you're able to curl from the worker to the master, you should receive a response from the API server like: Client sent an HTTP request to an HTTPS server.

5) If curl succeeds but you're still facing connectivity problems, try:

5.A) Compare the .kube/config files of the master and worker nodes, and make sure the IP of the API server is correct.

5.B) Make sure bridged traffic is passed to iptables on all nodes: sudo sysctl net.bridge.bridge-nf-call-iptables=1.

5.C) Make sure you have an SDN solution like Calico, Flannel or Weave, and that the relevant kube-system pods are running:

$ kubectl -n kube-system get pods
NAME                                      READY   STATUS    RESTARTS   AGE
coredns-f9fd979d6-lpdlc                   1/1     Running   2          7d12h
coredns-f9fd979d6-vcs7g                   1/1     Running   2          7d12h
etcd-master-node-k8s                      1/1     Running   2          7d12h
kube-apiserver-master-node-k8s            1/1     Running   2          7d12h
kube-controller-manager-master-node-k8s   1/1     Running   2          7d12h
kube-proxy-kh2lc                          1/1     Running   2          7d12h
kube-proxy-lfmc4                          1/1     Running   0          4m36s
kube-scheduler-master-node-k8s            1/1     Running   2          7d12h
weave-net-59r5b                           2/2     Running   6          7d11h <-- Here 
weave-net-c44d6                           2/2     Running   1          4m36s <-- Here

6) If nothing works, try running kubeadm reset on the worker node.
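The checks in steps 4 and 5.C can be sketched as a few small shell helpers. This is only an illustration under stated assumptions: the function names classify_probe, probe_api_server and not_running_pods are made up here, and the mapping relies on curl's documented exit codes 7 (failed to connect, which covers "no route to host") and 28 (timeout).

```shell
# Hypothetical helpers; names are illustrative only.

classify_probe() {
  # $1 = curl exit status, $2 = response body (may be empty)
  case "$1" in
    0)
      case "$2" in
        # Plain HTTP against the HTTPS API port: the port is reachable.
        *"HTTP request to an HTTPS server"*) echo "reachable" ;;
        *) echo "unexpected-response" ;;
      esac
      ;;
    7)  echo "connect-failed" ;;   # e.g. no route to host, connection refused
    28) echo "timeout" ;;
    *)  echo "curl-error-$1" ;;
  esac
}

probe_api_server() {
  # $1 = master node IP or hostname; port 6443 as in the steps above.
  body=$(curl --silent --max-time 5 "http://$1:6443")
  classify_probe "$?" "$body"
}

not_running_pods() {
  # Reads a `kubectl get pods` table on stdin and prints the NAME of every
  # pod whose STATUS is not Running or whose READY count is incomplete.
  awk 'NR > 1 {
         split($2, ready, "/")
         if ($3 != "Running" || ready[1] != ready[2]) print $1
       }'
}
```

From the worker node, probe_api_server <master-node-ip> should print reachable once port 6443 is open. On the master, kubectl -n kube-system get pods | not_running_pods prints nothing when all system pods, including the network plugin, are healthy. Note that pods in a Completed state would also be listed by this filter.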

My environment: 3 masters behind a front-end load balancer, while initializing the cluster.

I had to adjust my load balancer to take the other master nodes out, then run 'kubeadm init', and the issue was gone. Possibly a network error.

Then I re-added all the other master nodes.

I had a similar issue where I got that error on the control-plane join command. It turned out that the load balancer IP was still assigned to the joining control-plane node as a secondary address from a previous installation (with nothing running behind it). Run ip a and make sure that the joining node does not have the load balancer IP as a secondary address.

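# Reset any previous kubeadm state on this node (answers the confirmation prompt).
echo y | kubeadm reset || true
# Remove leftover CNI configuration, etcd data and the local kubeconfig.
rm -rf /etc/cni/net.d || true
rm -rf /var/lib/etcd || true
rm -rf ~/.kube || true
# Drop the stale load balancer address; the {{ ... }} values are template placeholders.
ip address delete {{ k8s_apiserver_vip_cidr }} dev {{ k8s_interface }}
kubeadm join ...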
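The ip a check described in this answer can also be scripted before deciding whether the cleanup is needed. A minimal sketch, assuming the one-line output format of ip -o -4 addr show (address with prefix length in the fourth column); the function name has_address and the VIP variable are hypothetical.

```shell
# Hypothetical helper; the name is illustrative only.

has_address() {
  # $1    = IPv4 address to look for, without the prefix length
  # stdin = output of `ip -o -4 addr show`
  # Exits 0 if the address is assigned to any interface, 1 otherwise.
  awk -v want="$1" '
    { split($4, a, "/"); if (a[1] == want) found = 1 }
    END { exit (found ? 0 : 1) }'
}
```

For example, with VIP set to the load balancer address, ip -o -4 addr show | has_address "$VIP" && echo "load balancer IP is still assigned" reports a leftover address. The comparison is exact, so 10.0.0.1 does not match 10.0.0.10.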
