

kubelet logs flooding even after pods deleted

Kubernetes version: v1.6.7
Network plugin: weave

I recently noticed that my entire cluster of 3 nodes went down. My initial troubleshooting revealed that /var on all nodes was 100% full.

Digging further into the logs revealed that they were being flooded by kubelet stating:

Jan 15 19:09:43 test-master kubelet[1220]: E0115 19:09:43.636001    1220 kuberuntime_gc.go:138] Failed to stop sandbox "fea8c54ca834a339e8fd476e1cfba44ae47188bbbbb7140e550d055a63487211" before removing: rpc error: code = 2 desc = NetworkPlugin cni failed to teardown pod "<TROUBLING_POD>-1545236220-ds0v1_kube-system" network: CNI failed to retrieve network namespace path: Error: No such container: fea8c54ca834a339e8fd476e1cfba44ae47188bbbbb7140e550d055a63487211
Jan 15 19:09:43 test-master kubelet[1220]: E0115 19:09:43.637690    1220 docker_sandbox.go:205] Failed to stop sandbox "fea94c9f46923806c177e4a158ffe3494fe17638198f30498a024c3e8237f648": Error response from daemon: {"message":"No such container: fea94c9f46923806c177e4a158ffe3494fe17638198f30498a024c3e8237f648"}

The <TROUBLING_POD>-1545236220-ds0v1 pod was being created by a cronjob, and due to some misconfiguration, errors occurred while those pods ran and more and more pods kept being spun up.

So I deleted all the jobs and their related pods. I then had a cluster with no jobs/pods related to my cronjob running, yet I still saw the same ERROR messages flooding the logs.
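
For reference, the cleanup was roughly along these lines (names and labels here are placeholders; the namespace is kube-system, as seen in the log lines above):

kubectl delete cronjob <troubling-cronjob> -n kube-system
# Jobs spawned by the cronjob, and their leftover pods:
kubectl delete jobs -n kube-system -l <job-label>
kubectl delete pods -n kube-system -l <pod-label>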

I did:

1) Restart docker and kubelet on all nodes (commands sketched below).

2) Restart the entire control plane.

3) Reboot all nodes.
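
For step 1, on systemd-based nodes the restarts were along these lines (assuming both services are managed by systemd):

sudo systemctl restart docker
sudo systemctl restart kubelet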

But the logs are still being flooded with the same error messages, even though no such pods are being spun up anymore.

So I don't know how I can stop kubelet from throwing these errors.

Is there a way for me to reset the network plugin I am using? Or do something else?

Check if the pod directory exists under /var/lib/kubelet.
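
A minimal sketch of that check, assuming default kubelet paths; the IDs to look for are the sandbox IDs from the flooding log lines:

# Pod directories under here are named by pod UID; a leftover
# directory for an already-deleted pod can keep kubelet retrying.
ls /var/lib/kubelet/pods/
# Assumption: on v1.6 with dockershim, stale sandbox checkpoint files
# can also cause this exact flood; check for files named after the
# sandbox IDs from the log (e.g. fea8c54ca834...):
ls /var/lib/dockershim/sandbox/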

You're on a very old version of Kubernetes; upgrading will fix this issue.
