
k8s spring boot pod failing readiness and liveness probe

I have configured a spring-boot pod with liveness and readiness probes. When I start the pod, the describe command shows the output below.

Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  92s                default-scheduler  Successfully assigned pradeep-ns/order-microservice-rs-8tqrv to pool-h4jq5h014-ukl3l
  Normal   Pulled     43s (x2 over 91s)  kubelet            Container image "classpathio/order-microservice:latest" already present on machine
  Normal   Created    43s (x2 over 91s)  kubelet            Created container order-microservice
  Normal   Started    43s (x2 over 91s)  kubelet            Started container order-microservice
  Warning  Unhealthy  12s (x6 over 72s)  kubelet            Liveness probe failed: Get "http://10.244.0.206:8222/actuator/health/liveness": dial tcp 10.244.0.206:8222: connect: connection refused
  Normal   Killing    12s (x2 over 52s)  kubelet            Container order-microservice failed liveness probe, will be restarted
  Warning  Unhealthy  2s (x8 over 72s)   kubelet            Readiness probe failed: Get "http://10.244.0.206:8222/actuator/health/readiness": dial tcp 10.244.0.206:8222: connect: connection refused

The pod definition is as follows:

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: order-microservice-rs
  labels:
    app: order-microservice
spec:
  replicas: 1
  selector:
    matchLabels:
      app: order-microservice
  template:
    metadata:
      name: order-microservice
      labels:
        app: order-microservice
    spec:
      containers:
        - name: order-microservice
          image: classpathio/order-microservice:latest
          imagePullPolicy: IfNotPresent
          env:
            - name: SPRING_PROFILES_ACTIVE
              value: dev
            - name: SPRING_DATASOURCE_USERNAME
              valueFrom:
                secretKeyRef:
                  key: username
                  name: db-credentials
            - name: SPRING_DATASOURCE_PASSWORD
              valueFrom:
                secretKeyRef:
                  key: password
                  name: db-credentials
          volumeMounts:
            - name: app-config
              mountPath: /app/config
            - name: app-logs
              mountPath: /var/log
          livenessProbe:
            httpGet:
              port: 8222
              path: /actuator/health/liveness
            initialDelaySeconds: 10
            periodSeconds: 10
          readinessProbe:
            httpGet:
              port: 8222
              path: /actuator/health/readiness
            initialDelaySeconds: 10
            periodSeconds: 10
          resources:
            requests:
              memory: "550Mi"
              cpu: "500m"
            limits:
              memory: "550Mi"
              cpu: "750m"
      volumes:
        - name: app-config
          configMap:
            name: order-microservice-config
        - name: app-logs
          emptyDir: {}
      restartPolicy: Always

If I disable the liveness and readiness probes in the ReplicaSet manifest and exec into the pod, I get a valid response when invoking the http://localhost:8222/actuator/health/liveness and http://localhost:8222/actuator/health/readiness endpoints. Why does my pod fail the probes and restart when Kubernetes invokes the readiness and liveness endpoints? Where am I going wrong?

Update: If I remove the resources section, the pods run fine, but once the resource parameters are added, the probes fail.

When you limit the container / Spring application to 0.5 cores (500 millicores), startup probably takes longer than the configured liveness probe thresholds allow.

You can either increase those thresholds, or use a startupProbe with more relaxed settings (e.g. failureThreshold: 10). In that case you can keep a short period on the liveness probe and get faster feedback once a successful container start has been detected.
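A minimal sketch of that approach, reusing port 8222 and the actuator paths from the manifest in the question (the threshold values are illustrative, not tuned):

startupProbe:
  httpGet:
    port: 8222
    path: /actuator/health/liveness
  # allows failureThreshold * periodSeconds = 100s for the JVM to come up
  periodSeconds: 10
  failureThreshold: 10
livenessProbe:
  httpGet:
    port: 8222
    path: /actuator/health/liveness
  # not executed until the startup probe has succeeded
  periodSeconds: 10
  failureThreshold: 3

Kubernetes holds off the liveness and readiness probes until the startup probe succeeds, so a slow JVM start no longer triggers restarts.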

Your pod config only gives the container 0.5 CPU cores, and your probe timings are too short. Depending on the server's CPU performance, Spring Boot startup can easily take more than 10 seconds. This config from one of my Spring Boot pods may give you a starting point:

"livenessProbe": {
              "httpGet": {
                "path": "/actuator/liveness",
                "port": 11032,
                "scheme": "HTTP"
              },
              "initialDelaySeconds": 90,
              "timeoutSeconds": 30,
              "periodSeconds": 30,
              "successThreshold": 1,
              "failureThreshold": 3
            },
            "readinessProbe": {
              "httpGet": {
                "path": "/actuator/health",
                "port": 11032,
                "scheme": "HTTP"
              },
              "initialDelaySeconds": 60,
              "timeoutSeconds": 30,
              "periodSeconds": 30,
              "successThreshold": 1,
              "failureThreshold": 3
            },

I also did not limit the CPU and memory resources; if you limit the CPU, startup will take more time. Hope this helps.
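If you do want to keep resource requests for scheduling, one option (a sketch based on the resources block in the question) is to keep the memory limit but drop the CPU limit, since the JVM's startup phase is the CPU-heavy part:

resources:
  requests:
    memory: "550Mi"
    cpu: "500m"
  limits:
    memory: "550Mi"
    # no cpu limit: the container can burst above its request during the
    # CPU-heavy JVM startup instead of being throttled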

The fact that a request against localhost works is no guarantee that it will work on other network interfaces. The kubelet is a node agent, so its probe requests go to your pod's eth0 (or equivalent) interface, not to your localhost.

You can check this by making the request from another pod to your pod's IP address, or to the Service backing it.
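For example, a throwaway pod like this (the IP 10.244.0.206 is taken from the events above and will change whenever the pod is recreated) tests reachability from inside the cluster; inspect the result with kubectl logs probe-check:

apiVersion: v1
kind: Pod
metadata:
  name: probe-check
spec:
  restartPolicy: Never
  containers:
    - name: curl
      image: curlimages/curl
      # hit the same URL the kubelet uses for the liveness probe
      command: ["curl", "-v", "http://10.244.0.206:8222/actuator/health/liveness"]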

Probably you are making your application serve on localhost, while you have to make it serve on 0.0.0.0 (all interfaces) or the eth0 address.
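In Spring Boot terms, check server.address and, because the probes here target a dedicated management port, management.server.address. Both bind to all interfaces by default, so this failure mode usually means one of them has been overridden to 127.0.0.1. A sketch of the relevant application.yaml settings, assuming the actuator runs on port 8222 as the probe URLs suggest:

server:
  address: 0.0.0.0     # bind the main HTTP server to all interfaces (the default)
management:
  server:
    port: 8222         # actuator endpoints on a dedicated port
    address: 0.0.0.0   # must not be 127.0.0.1, or the kubelet cannot reach it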
