Kubernetes pod stuck in state=Terminating after node goes to status = NotReady?
I have a 3-node k8s cluster in which all nodes are masters, i.e. I have removed the node-role.kubernetes.io/master taint.
I physically unplugged the network cable on foo2, so I now have:
kubectl get nodes
NAME   STATUS     ROLES    AGE     VERSION
foo1   Ready      master   3d22h   v1.13.5
foo2   NotReady   master   3d22h   v1.13.5
foo3   Ready      master   3d22h   v1.13.5
After several hours, some of the pods are still in STATUS = Terminating, though I think they should be in Terminated?
I read the following at https://www.bluematador.com/docs/troubleshooting/kubernetes-pod:
In rare cases, it is possible for a pod to get stuck in the terminating state. This is detected by finding any pods where every container has been terminated, but the pod is still running. Usually, this is caused when a node in the cluster gets taken out of service abruptly, and the cluster scheduler and controller-manager do not clean up all of the pods on that node.
Solving this issue is as simple as manually deleting the pod using kubectl delete pod.
The output of kubectl describe pod says that an unreachable node will be tolerated for 5 minutes (300s):
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   True
  PodScheduled      True
Volumes:
  etcd-data:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>
I tried kubectl delete pod etcd-lns4g5xkcw, which just hung, though forcing the deletion does work, as per this answer:
kubectl delete pod etcd-lns4g5xkcw --force=true --grace-period=0
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
pod "etcd-lns4g5xkcw" force deleted
(1) Why is this happening? Shouldn't the status change to Terminated?
(2) Where is STATUS = Terminating even coming from? At https://v1-13.docs.kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/ I see only Waiting/Running/Terminated as the options.
Pod volume and network cleanup can take additional time while a pod is in the terminating status. The proper way to take a node out of service is to drain it, so that its pods are terminated successfully within the grace period. Because you unplugged the network cable, the node changed its status to not ready while pods were still running on it. As a result, those pods could not finish terminating.
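As a rough sketch of the drain step mentioned above, assuming the node is still reachable and kubectl is configured for the cluster. The drain_node and run_or_print helpers are hypothetical names, not part of kubectl. With DRY_RUN=1 the command is only printed. Pods that use emptyDir volumes (such as etcd-data in the question) or that are not managed by a controller would likely need extra flags before drain proceeds; those are deliberately left out here.

```shell
# Hypothetical helper: print the command when DRY_RUN=1, otherwise run it.
run_or_print() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: drain a node before taking it out of service.
# kubectl drain cordons the node and evicts its pods, honoring each
# pod's termination grace period.
drain_node() {
  run_or_print kubectl drain "$1" --ignore-daemonsets
}

# Usage (not executed here):
#   DRY_RUN=1 drain_node foo2   # only prints the command
#   drain_node foo2             # actually drains foo2
#   kubectl uncordon foo2       # once the node is back in service
```

Draining only helps when it is done before the node loses connectivity; once the kubelet is unreachable, the pods cannot confirm their own termination.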
You may find this information about the terminating status from the k8s documentation useful:
Kubernetes (versions 1.5 or newer) will not delete Pods just because a Node is unreachable. The Pods running on an unreachable Node enter the 'Terminating' or 'Unknown' state after a timeout. Pods may also enter these states when the user attempts graceful deletion of a Pod on an unreachable Node.
There are 3 suggested ways to remove it from the apiserver:
The Node object is deleted (either by you, or by the Node Controller).
The kubelet on the unresponsive Node starts responding, kills the Pod and removes the entry from the apiserver.
Force deletion of the Pod by the user.
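For the third option, a minimal sketch of how the stuck pods could be listed and the force-delete commands prepared. It assumes the column layout of kubectl get pods --all-namespaces --no-headers (NAMESPACE, NAME, READY, STATUS, ...). The function names terminating_pods and force_delete_commands are hypothetical. The commands are only printed, not executed, so they can be reviewed first: as the warning in the question notes, a force deletion does not wait for confirmation that the containers have stopped.

```shell
# Hypothetical helper: read `kubectl get pods --all-namespaces --no-headers`
# output on stdin and print "namespace name" for every pod whose STATUS
# column (assumed to be the 4th field) is Terminating.
terminating_pods() {
  awk '$4 == "Terminating" { print $1, $2 }'
}

# Hypothetical helper: turn "namespace name" pairs on stdin into
# force-delete commands, using the same flags as in the question.
force_delete_commands() {
  while read -r ns name; do
    [ -n "$name" ] || continue
    printf 'kubectl delete pod %s --namespace %s --force=true --grace-period=0\n' \
      "$name" "$ns"
  done
}

# Usage (not executed here):
#   kubectl get pods --all-namespaces --no-headers \
#     | terminating_pods | force_delete_commands
```

Printing instead of running keeps the destructive step manual. Pods on a node that may come back should be reviewed individually, since their containers could still be running there.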
Here you can find more information about background deletion in the official k8s documentation.