
Kubernetes CronJob runs only the same jobs even though other jobs are waiting, under a limited-pods-per-node condition

I'd like to know how a Kubernetes CronJob chooses which job to run when there are multiple waiting jobs. Is it not FIFO, but LIFO?

Here are the settings of my experiment.

  • Kubernetes Server Version 1.21.5
  • 1 node in kubernetes cluster
  • a limit of 3 pods, set via a ResourceQuota on the namespace (effectively 3 pods per node, since the cluster has a single node)

I scheduled 9 CronJobs (cronjob1 .. cronjob9), each with a different name. Each job is configured as follows (a sample manifest sketch follows the list):

  • each run takes 130 sec (it just sleeps)
  • schedule: */2 * * * *
  • concurrencyPolicy: Forbid
  • startingDeadlineSeconds: 3000
  • successfulJobsHistoryLimit: 0
  • failedJobsHistoryLimit: 1
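
For reference, here is a minimal sketch of what one of the nine CronJob manifests could look like, reconstructed from the settings above. The name, namespace, container image, and sleep command are assumptions for illustration, not the original files.

cronjob.yaml (sketch)

apiVersion: batch/v1  # batch/v1 CronJob is stable as of Kubernetes 1.21
kind: CronJob
metadata:
  name: cronjob1
  namespace: cron-job-ns
spec:
  schedule: "*/2 * * * *"
  concurrencyPolicy: Forbid
  startingDeadlineSeconds: 3000
  successfulJobsHistoryLimit: 0
  failedJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: sleep
            image: busybox            # assumed image; any image providing sleep works
            command: ["sleep", "130"] # the job just sleeps for 130 seconds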

Here is the result.

  • First, 3 CronJobs, say job1, job2, job3, start running. Which 3 get picked appears random.
  • Since each job takes 130 sec to finish, the next scheduled time arrives while they are still running.
  • After job1, job2, job3 finish, the same jobs job1, job2, job3 are started again.
  • job4 - job9 are never executed.

Update

  • My cluster has only a single node.
    • Kubernetes on Docker Desktop for Mac
  • Here are the files used for limiting resources.

namespace.yaml

apiVersion: v1
kind: Namespace
metadata:
  name: cron-job-ns

resource_quota.yaml

apiVersion: v1
kind: ResourceQuota
metadata:
  name: limit-number-of-pods
  namespace: cron-job-ns
spec:
  hard:
    count/pods: "3"
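
These two files can be applied with kubectl apply -f namespace.yaml -f resource_quota.yaml. Afterwards, kubectl describe resourcequota limit-number-of-pods -n cron-job-ns shows the hard limit and current usage for count/pods, which is a quick way to confirm that at most 3 Pods can exist in the namespace at once.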

I have posted a community wiki answer to summarise the topic.

I'd like to know how a Kubernetes CronJob chooses which job to run when there are multiple waiting jobs. Is it not FIFO, but LIFO?

User rkosegi mentioned in a comment:

it's neither LIFO nor FIFO. It's much more complicated.

User weibeld added this explanation:

As mentioned by @rkosegi, this probably has more to do with the Kubernetes scheduler than with CronJobs and Jobs. All Jobs create a Pod when they are started by the CronJob according to its schedule. The scheduler then decides which of the pending Pods gets scheduled to the node and thus can run.

If you want to understand exactly how it works, see the following documentation:

  • Kubernetes Scheduler
  • Fine Parallel Processing Using a Work Queue
  • Jobs

It was in the scope of the Kubernetes scheduler, as answered above.

In particular, the priority among waiting Pods of the same type, as in this experiment, is determined by the queue inside the Kubernetes scheduler.

The queue implementation is neither FIFO nor LIFO.

A waiting Pod seems to move back and forth between three queues, activeQ, podBackoffQ, and unschedulableQ, depending on timeouts.

The relationship between these timeout timings and the Pods newly created by the CronJobs determines which Pod is executed next.
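
As a rough illustration of that cycle (the figures below are kube-scheduler defaults of that era and are assumptions, not measurements from this experiment): a Pod that fails a scheduling attempt is parked in unschedulableQ; after sitting there past a timeout, or when a relevant cluster event occurs, it is flushed into podBackoffQ, where it waits out an exponentially growing backoff (on the order of 1 to 10 seconds) before re-entering activeQ for another attempt. Meanwhile, a CronJob whose previous run has just finished creates a fresh Pod that enters activeQ directly, so a newly created Pod can be tried before an older Pod that is still waiting out its timeout or backoff. That would explain why the same three jobs keep winning the free slots.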
