Why does a Kubernetes pod not stop gracefully?

I have run into a problem where a pod does not stop immediately even after I delete it.

What should be fixed so that it terminates normally?

Manifest file:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: cmd-example
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cmd-example
  template:
    metadata:
      labels:
        app: cmd-example
    spec:
      terminationGracePeriodSeconds: 30
      containers:
      - name: cmd-container
        image: alpine:3.8
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
        command: ["/bin/sh"]
        args: ["-c", "while true; do exec sleep 100;done"]

Reproduction procedure

  1. Create the deployment.
    $ kubectl apply -f deployments.yaml
  2. Delete the deployment.
    $ kubectl delete -f deployments.yaml

Output of kubectl get po -w:

cmd-example-5cccf79598-zpvmz   1/1       Running   0         2s
cmd-example-5cccf79598-zpvmz   1/1       Terminating   0         6s
cmd-example-5cccf79598-zpvmz   0/1       Terminating   0         37s
cmd-example-5cccf79598-zpvmz   0/1       Terminating   0         38s
cmd-example-5cccf79598-zpvmz   0/1       Terminating   0         38s

This should finish faster.
It took about 30 seconds to complete, probably because the pod was only stopped by SIGKILL once terminationGracePeriodSeconds (30s) had expired.
Why isn't the pod cleaned up immediately on SIGTERM?

What should be fixed?

Environment

I confirmed it in the following environments.

  • Docker for Mac: 18.06.1-ce, Kubernetes: v1.10.3
  • Docker for Windows: 18.06.1-ce, Kubernetes: v1.10.3
  • Google Kubernetes Engine: 1.11.2-gke.15

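Cause of the problem

The process started by this shell command does not stop when it receives SIGTERM. It runs as PID 1 in the container and installs no SIGTERM handler, and Linux does not deliver a signal to PID 1 of a PID namespace unless a handler for it is installed (SIGKILL and SIGSTOP from outside the namespace excepted).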

Solution

Use the trap command.

Changed part

    command: ["/bin/sh"]
    args: ["-c", "trap 'exit 0' 15;while true; do exec sleep 100 & wait $!; done"]
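
The same pattern can be tried in a local shell without a cluster. This is a minimal sketch, not the exact container setup: the function name `trapped_loop` and the variable names are illustrative, a local shell is not PID 1, and unlike the one-liner above the trap also kills the background sleep so that no stray process is left behind. What it shows is why `&` plus `wait` matters: a shell runs a trap only after the current foreground command finishes, whereas `wait` is interrupted as soon as the trapped signal arrives.

```shell
#!/bin/sh
# Illustrative only: the trap/wait pattern from the fix, run as a background subshell.
trapped_loop() {
  # On SIGTERM, stop the background sleep (if any) and exit cleanly.
  trap 'kill "$child" 2>/dev/null; exit 0' TERM
  while true; do
    sleep 100 &        # run sleep in the background ...
    child=$!
    wait "$child"      # ... so the shell can be interrupted while waiting
  done
}

trapped_loop &
LOOP_PID=$!
sleep 1                      # give the subshell time to install its trap
START=$(date +%s)
kill -TERM "$LOOP_PID"       # the same signal the pod receives on deletion
wait "$LOOP_PID"
LOOP_STATUS=$?
ELAPSED=$(( $(date +%s) - START ))
echo "exit status: $LOOP_STATUS, stopped after ${ELAPSED}s"
```

If the loop ran `sleep 100` in the foreground instead, the handler would only fire once the sleep had finished.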

Result

After the delete, the pod was cleaned up right away!

img-example-d68954677-mwsqp   1/1       Running   0         2s
img-example-d68954677-mwsqp   1/1       Terminating   0         8s
img-example-d68954677-mwsqp   0/1       Terminating   0         10s
img-example-d68954677-mwsqp   0/1       Terminating   0         11s
img-example-d68954677-mwsqp   0/1       Terminating   0         11s

Hiroki Matsumoto, the pod termination is behaving just as it was designed to. As you can find in the documentation section on Pods:

Because pods represent running processes on nodes in the cluster, it is important to allow those processes to gracefully terminate when they are no longer needed (vs being violently killed with a KILL signal and having no chance to clean up).

Long story short (based on the official documentation):

1) When you run kubectl delete -f deployments.yaml, the delete request is sent with a grace period (30 seconds by default).

2) When you run kubectl get pods, you can see that the pod is in the Terminating state.

3) The kubelet sees this state and starts shutting the pod down; the processes in the pod are sent SIGTERM.

4) After the grace period is over, any process that is still running is killed with SIGKILL.
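
The "SIGTERM, wait, then SIGKILL" sequence in steps 3 and 4 can be modelled locally. The sketch below is only an illustration of that sequence, not how the kubelet is implemented: `stop_with_grace`, `STOP_RESULT` and `STUBBORN_PID` are hypothetical names, and the grace period is shortened to 2 seconds. The demo target is a sleep that ignores SIGTERM, similar in effect to the container in the question.

```shell
#!/bin/sh
# Hypothetical helper, not part of Kubernetes: mimics "SIGTERM, wait, SIGKILL".
# $1 = PID to stop, $2 = grace period in seconds. Sets STOP_RESULT.
stop_with_grace() {
  pid=$1
  grace=$2
  waited=0
  kill -TERM "$pid" 2>/dev/null          # polite request first
  while kill -0 "$pid" 2>/dev/null; do   # is the process still there?
    if [ "$waited" -ge "$grace" ]; then
      kill -KILL "$pid" 2>/dev/null      # grace period over: force it
      STOP_RESULT=killed
      return 0
    fi
    sleep 1
    waited=$((waited + 1))
  done
  STOP_RESULT=terminated
}

# Demo: ignore SIGTERM, then exec sleep; the ignored signal stays ignored.
sh -c 'trap "" TERM; exec sleep 30' &
STUBBORN_PID=$!
sleep 1                                  # let the child set up before signalling
stop_with_grace "$STUBBORN_PID" 2
wait "$STUBBORN_PID" 2>/dev/null         # reap the killed process
echo "stubborn process: $STOP_RESULT"
```

A process that reacts to SIGTERM leaves the loop early, which is the difference between the 30-second and the 2-second shutdown in the outputs above.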

So to delete a pod immediately you have to lower the grace period to 0 and run a forced deletion:

kubectl delete -f deployments.yaml --grace-period=0 --force

This results in an instant deletion.

Your pod literally does nothing. If you just want somewhere to do occasional interactive debugging "inside the cluster", consider kubectl run to get a one-off interactive container:

kubectl run debug --rm -it --image=alpine:3.8

As for the command your pod spec is trying to run, here it is rewritten in shell script form:

#!/bin/sh
# Forever:
while true
do
  # Replace this shell with a process that sleeps for
  # 100 seconds, then exits
  exec sleep 100
  # The shell no longer exists and you'll never get here
done

I'm not clear what the pod is trying to do, but it at least won't exit if you remove the exec. (It will still sit in an idle loop forever.)
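
The effect of exec described above is easy to check in any POSIX shell. A minimal sketch, with illustrative variable names, assuming an external echo binary such as /bin/echo is available for exec to run:

```shell
#!/bin/sh
# Without exec: the shell survives the first command and runs the second.
without_exec=$(sh -c 'echo first; echo second')

# With exec: the shell is replaced by echo, so the second command never runs.
with_exec=$(sh -c 'exec echo first; echo second')

printf 'without exec:\n%s\n' "$without_exec"
printf 'with exec:\n%s\n' "$with_exec"
```

In the pod spec the same thing happens to the while loop: after the first exec there is no shell left to start another iteration.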
