How to delete Kubernetes 'Shutdown' pods

Question

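I recently noticed a large number of pods with the status "Shutdown". We have been using Kubernetes since October 2020.

Production and staging run on the same nodes, except that staging uses preemptible nodes to cut costs. The containers are also stable in staging. (Failures are rare because they are caught earlier in testing.)

Service provider: Google Cloud Kubernetes.

I familiarized myself with the documentation and tried searching, but neither the docs nor Google helped with this particular status. There are no errors in the logs.

I have no problem with pods being stopped. Ideally, I would like K8s to delete these Shutdown pods automatically. If I run kubectl delete po redis-7b86cdccf9-zl6k9, it is gone in an instant.

kubectl get pods | grep Shutdown | awk '{print $1}' | xargs kubectl delete pod is a manual, temporary workaround.

PS. k is an alias for kubectl in my environment.

One last example: it happens across all namespaces // different containers.

I stumbled upon a few related issues that explain the status: https://github.com/kubernetes/website/pull/28235 https://github.com/kubernetes/kubernetes/issues/102820

"When pods are evicted during a graceful node shutdown, they are marked as failed. Running kubectl get pods shows the status of the evicted pods as Shutdown."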

Answer 1

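Evicted pods are left in place on purpose. As the k8s team says here [1], evicted pods are not removed so that they can be inspected after the eviction.

I believe the best approach here is to create a cronjob, as already mentioned [2].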

apiVersion: batch/v1
kind: CronJob
metadata:
  name: del-shutdown-pods
spec:
  schedule: "* 12 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            imagePullPolicy: IfNotPresent
            command:
            - /bin/sh
            - -c
            - kubectl get pods | grep Shutdown | awk '{print $1}' | xargs kubectl delete pod
          restartPolicy: OnFailure

Answer 2

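You don't need any grep; just use the selectors that kubectl provides. And, by the way, you cannot call kubectl from the busybox image, because it does not include kubectl at all. I also created a service account with permission to delete pods.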

apiVersion: batch/v1
kind: CronJob
metadata:
  name: del-shutdown-pods
spec:
  schedule: "0 */2 * * *"  
  concurrencyPolicy: Replace
  jobTemplate:
    metadata:
      name: shutdown-deleter
    spec:
      template:
        spec:
          serviceAccountName: deleter
          containers:
          - name: shutdown-deleter
            image: bitnami/kubectl
            imagePullPolicy: IfNotPresent
            command:
              - "/bin/sh"
            args:
              - "-c"
              - "kubectl delete pods --field-selector status.phase=Failed -A --ignore-not-found=true"
          restartPolicy: Never

Answer 3

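First, try to force-delete the Kubernetes pod with the following command:

$ kubectl delete pod <pod_name> -n <namespace> --grace-period 0 --force

You can also delete the pod directly with:

$ kubectl delete pod <pod_name>

Then check the status of the pods with:

$ kubectl get pods

Here you will see that the pod has been deleted.

You can also verify this using the documentation in the yaml file.

When a pod is deleted, Kubernetes sends a SIGTERM signal to the containers in the pod and then waits for a specified time called the termination grace period. Most programs shut down gracefully when they receive SIGTERM, but if you are using third-party code or managing a system you do not control, a preStop hook is a good way to trigger a graceful shutdown without modifying the application.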

Answer 4

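Currently, Kubernetes does not delete pods in the Evicted and Shutdown states by default. We faced a similar issue in our environment.

As an automated fix, you can create a Kubernetes cronjob that deletes pods in the Evicted and Shutdown states. The cronjob pod can authenticate with a serviceaccount and RBAC, which lets you restrict the verbs and namespaces available to the utility.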
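
A namespace-restricted setup along those lines could be sketched as follows. This is only an illustration: the name pod-cleaner and the helper create_cleaner_rbac are hypothetical, and the verbs are limited to what a cleanup job would plausibly need (list and delete on pods).

```shell
#!/bin/sh
# Sketch: create a service account that may only list and delete pods
# in a single namespace. All names below are illustrative assumptions.
create_cleaner_rbac() {
  ns="$1"             # target namespace, for example "staging"
  name="pod-cleaner"  # hypothetical name shared by account, role and binding

  # Each step runs only if the previous one succeeded.
  kubectl create serviceaccount "$name" -n "$ns" &&
  kubectl create role "$name" -n "$ns" --verb=list,delete --resource=pods &&
  kubectl create rolebinding "$name" -n "$ns" \
    --role="$name" --serviceaccount="$ns:$name"
}
```

Calling create_cleaner_rbac staging would set this up for a namespace called staging. The result can be checked with kubectl auth can-i delete pods -n staging --as=system:serviceaccount:staging:pod-cleaner. A cronjob that has to clean every namespace would need a ClusterRole and ClusterRoleBinding instead.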

Answer 5

You can use https://github.com/hjacobs/kube-janitor. It provides various configurable options for cleanup.

Answer 6

My take on this problem looks like this (inspired by the other solutions):

# Delete all Shutdown pods. This is a common problem on Kubernetes when using preemptible nodes on GKE.
# Why awk rather than selecting Failed pods: https://github.com/kubernetes/kubernetes/issues/54525#issuecomment-340035375
# Selecting Failed would also delete Evicted pods, which complicates pod troubleshooting.

---
apiVersion: batch/v1  # batch/v1beta1 CronJob is deprecated and removed in newer Kubernetes releases
kind: CronJob
metadata:
  name: del-shutdown-pods
  namespace: kube-system
  labels:
    app: shutdown-pod-cleaner
spec:
  schedule: "*/1 * * * *"
  successfulJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        metadata:
          labels:
            app: shutdown-pod-cleaner
        spec:
          volumes:
          - name: scripts
            configMap:
              name: shutdown-pods-scripts
              defaultMode: 0777
          serviceAccountName: shutdown-pod-sa
          containers:
          - name: zombie-killer
            image: bitnami/kubectl
            imagePullPolicy: IfNotPresent
            command:
              - "/bin/sh"
            args:
              - "-c"
              - "/scripts/podCleaner.sh"
            volumeMounts:
              - name: scripts
                mountPath: "/scripts"
                readOnly: true
          restartPolicy: OnFailure
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: shutdown-pod-cleaner
  namespace: kube-system
  labels:
    app: shutdown-pod-cleaner
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["delete", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: shutdown-pod-cleaner-cluster
  namespace: kube-system
subjects:
- kind: ServiceAccount
  name: shutdown-pod-sa
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: shutdown-pod-cleaner
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: shutdown-pod-sa
  namespace: kube-system
  labels:
    app: shutdown-pod-cleaner
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: shutdown-pods-scripts
  namespace: kube-system
  labels:
    app: shutdown-pod-cleaner
data:
  podCleaner.sh: |
    #!/bin/sh
    if [ $(kubectl get pods --all-namespaces --ignore-not-found=true | grep Shutdown | wc -l) -ge 1 ]
    then
    kubectl get pods -A | grep Shutdown | awk '{print $1,$2}' | xargs -n2 sh -c 'kubectl delete pod -n $0 $1 --ignore-not-found=true'
    else
    echo "no shutdown pods to clean"
    fi
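
A variant of the script above can avoid grepping the human-readable table by combining the Failed field selector with a filter on the pod's status.reason. The sketch below assumes that reason reads "Shutdown", which matches the STATUS column described in the question; other Kubernetes versions may report a different reason, so check the output of the listing step first. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch: delete only Failed pods whose status.reason is "Shutdown",
# leaving Evicted pods in place for troubleshooting.

# Print "namespace name reason" for every Failed pod in the cluster.
list_failed_pods() {
  kubectl get pods -A --no-headers \
    --field-selector=status.phase=Failed \
    -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name,REASON:.status.reason
}

# Keep only rows whose third column is exactly "Shutdown"
# and print them as "namespace name".
only_shutdown() {
  awk '$3 == "Shutdown" { print $1, $2 }'
}

# Delete each "namespace name" pair read from stdin.
delete_pods() {
  while read -r ns name; do
    kubectl delete pod -n "$ns" "$name" --ignore-not-found=true
  done
}
```

The three steps are meant to be chained as list_failed_pods | only_shutdown | delete_pods. Running only the first two prints what would be deleted without removing anything.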

Answer 7

I just set up a cronjob to clean up dead GKE pods. The full setup consists of an RBAC role, a role binding, and a service account.

Service account and cluster role setup:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: pod-accessor-role
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "delete", "watch", "list"]
---

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: pod-access
subjects:
- kind: ServiceAccount
  name: cronjob-svc
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: pod-accessor-role
  apiGroup: rbac.authorization.k8s.io

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cronjob-svc
  namespace: kube-system

Cronjob that cleans up the dead pods:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: pod-cleaner-cron
  namespace: kube-system
spec:
  schedule: "0 */12 * * *"
  successfulJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        metadata:
          name: pod-cleaner-cron
          namespace: kube-system
        spec:
          serviceAccountName: cronjob-svc
          restartPolicy: Never
          containers:
          - name: pod-cleaner-cron
            imagePullPolicy: IfNotPresent
            image: bitnami/kubectl
            command:
              - "/bin/sh"
            args:
              - "-c"
              - "kubectl delete pods --field-selector status.phase=Failed -A --ignore-not-found=true"
status: {}

7 solutions

Solution 1
7 Accepted 2021-09-09 11:51:46

Solution 2
7 2021-10-18 13:18:43

Solution 3
0 2021-07-15 10:59:39

Solution 4
0 2021-09-08 05:47:36

Solution 5
0 2021-09-08 07:31:12

Solution 6
0 2022-02-06 11:12:48

Solution 7
0 2022-07-05 16:10:29
