
Kubernetes CronJob not scheduling

I have a cronjob scheduled to run every minute. It ran as scheduled for several days, but 29 hours ago a bad deployment caused the jobs to fail with ImagePullError, and since then the cronjob has not scheduled any additional jobs. The only jobs I can run are the ones I create manually from the cronjob.

spec:
  concurrencyPolicy: Forbid
  failedJobsHistoryLimit: 2
  jobTemplate:
    metadata:
      creationTimestamp: null
    spec:
      template:
        metadata:
          creationTimestamp: null
          labels:
            app.kubernetes.io/instance: acd-processor
            app.kubernetes.io/name: acd-processor
        spec:
          containers:
          - image: acdacr.azurecr.io/processor:964f395
            imagePullPolicy: IfNotPresent
            name: acd-processor
            resources: {}
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
          dnsPolicy: ClusterFirst
          imagePullSecrets:
          - name: acr-secret
          restartPolicy: OnFailure
          schedulerName: default-scheduler
          securityContext: {}
          terminationGracePeriodSeconds: 30
  schedule: '* * * * *'
  successfulJobsHistoryLimit: 1
  suspend: false
status:
  lastScheduleTime: "2022-01-27T21:07:00Z"
NAME                 COMPLETIONS   DURATION   AGE
job.batch/test-job   1/1           3s         20m

NAME                                  SCHEDULE    SUSPEND   ACTIVE   LAST SCHEDULE   AGE
cronjob.batch/acd-processor-cronjob   * * * * *   False     0        29h             8d

pod/test-job-ftx5w                              0/1     Completed   0          20m

test-job was created manually from the cronjob.

The cronjob is not suspended, and there are 0 active jobs, so it should work even though concurrent runs are forbidden (concurrencyPolicy: Forbid).

Why does the cronjob not continue to schedule jobs as configured?

An almost identical question was asked at "Kubernetes Not Scheduling CronJob".

I'll copy-paste my answer from there, as it has some things for you to check.

A few other things you can check:

  1. Do you have any cron pods with a "failed" status? If you do, check those pods to find out why they failed.
  2. Did it used to work and then suddenly stop?
  3. Does the cronjob resource have anything in its events? Check with: kubectl describe cronjob health-status-cron -n tango
  4. Does the code your cron runs take more than 1 minute to complete? If so, your schedule is too aggressive, and you might want to loosen it.
  5. The cronjob controller also has some limitations you may want to check: https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/#cron-job-limitations . Specifically, look at the concept of "missed jobs". If the cronjob controller "misses" more than 100 scheduled runs, it will "freeze" the cronjob and stop scheduling it. Do you scale down the cluster or similar when it is not in use?
  6. Do you have any custom or third-party webhooks or plugins installed in the cluster? These can interfere with pod creation.
  7. Do you have any jobs created in the namespace? Check with: kubectl get jobs -n tango . If you find a ton of job objects, check them to see why they did not generate pods.
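
To make the "missed jobs" limit in point 5 concrete, here is a rough Python sketch of the arithmetic. It is not the controller's actual code: it assumes a fixed-interval schedule (every minute, as in the question) instead of parsing a cron expression, and the function name, parameters, and the "now" value (taken as roughly 29 hours after lastScheduleTime, based on the LAST SCHEDULE column) are my own illustrative choices. The startingDeadlineSeconds behaviour follows the limitations section of the docs linked above, where the controller only counts misses inside that window when the field is set.

```python
from datetime import datetime, timedelta, timezone

# Threshold described in the Kubernetes CronJob limitations docs.
MISSED_LIMIT = 100


def missed_schedules(last_schedule, now, interval=timedelta(minutes=1),
                     starting_deadline_seconds=None):
    """Approximate how many runs were missed between last_schedule and now.

    Hypothetical helper: assumes one run per fixed interval, which only
    matches simple schedules such as '* * * * *'.
    """
    window_start = last_schedule
    if starting_deadline_seconds is not None:
        # With a deadline set, only misses inside the deadline window count.
        deadline_start = now - timedelta(seconds=starting_deadline_seconds)
        window_start = max(last_schedule, deadline_start)
    if now <= window_start:
        return 0
    # timedelta // timedelta yields a whole number of intervals.
    return (now - window_start) // interval


# lastScheduleTime from the question's status block.
last = datetime(2022, 1, 27, 21, 7, 0, tzinfo=timezone.utc)
# Assumption: "now" is about 29 hours later, per the LAST SCHEDULE column.
now = last + timedelta(hours=29)

without_deadline = missed_schedules(last, now)
with_deadline = missed_schedules(last, now, starting_deadline_seconds=200)

print(without_deadline, without_deadline > MISSED_LIMIT)
print(with_deadline, with_deadline > MISSED_LIMIT)
```

Under these assumptions, the count without a deadline is far above the limit, which would be consistent with the cronjob no longer scheduling. With startingDeadlineSeconds set to a small value, the counted window shrinks and the count stays under the limit. Treat this as a way to reason about the numbers, and confirm the real cause from the cronjob's events.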

I encountered a somewhat similar issue in 2020 (the writeup links to the issue I raised in the Kubernetes project itself): https://blenderfox.com/2020/08/07/the-snowball-effect-in-kubernetes/
