How to work a job queue with Kubernetes, with scaling

I need scalable queue handling based on a Docker/Python worker. My thoughts went towards Kubernetes. However, I am unsure about the best controller/service for this.

Via Azure Functions, incoming HTTP traffic adds simple messages to a storage queue. Those messages need to be worked on and the results fed back into a result queue.

To process those queue messages I developed Python code that loops over the queue and works the jobs. After each successful iteration, the message is removed from the source queue and the result is written into the result queue. Once the queue is empty, the code exits.

So I created a Docker image that runs the Python code. If more than one container is started, the queue obviously gets worked through faster. I also set this up on the new Azure Kubernetes Service (AKS) to scale it. Being new to Kubernetes, I read about the Job paradigm for working a queue until the Job is done. My simple YAML template looks like this:

apiVersion: batch/v1
kind: Job
metadata:
  name: myjob
spec:
  parallelism: 4
  template:
    metadata:
      name: myjob
    spec:
      restartPolicy: Never   # required: a Job's pod template must set Never or OnFailure
      containers:
      - name: c
        image: repo/image:tag

My problem now is that the Job cannot be restarted.

Usually, the queue gets filled with some entries and then for a while nothing happens. Then bigger batches can arrive that need to be worked through as fast as possible. Of course I want to run the Job again then, but that does not seem possible. Also, I want to reduce the footprint to a minimum while nothing is in the queue.

So my question is: what architecture/constructs should I use for this scenario, and are there simple YAML examples for it?

This may be a "goofy/hacky" answer, but it's simple, robust, and I've been using it in a production system for months now.

I have a similar system where the queue sometimes is emptied out and sometimes gets slammed. I wrote my queue processor similarly: it handles one message in the queue at a time and terminates when the queue is empty. It is set up to run as a Kubernetes Job.

The trick is this: I created a CronJob that regularly starts one single new instance of the job, and the job allows unlimited parallelism. If the queue is empty, the new instance immediately terminates ("scales down"). If the queue is slammed and the last job hasn't finished yet, another instance starts ("scales up").

No need to futz with querying the queue and scaling a StatefulSet or anything, and no resources are consumed while the queue sits empty. You may have to adjust the CronJob interval to fine-tune how fast it reacts to the queue filling up, but it should react pretty well.
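
For reference, a minimal sketch of that pattern (the name queue-worker and the five-minute schedule are placeholders to tune; concurrencyPolicy: Allow is what lets a new instance start while a previous one is still running):

apiVersion: batch/v1
kind: CronJob
metadata:
  name: queue-worker
spec:
  schedule: "*/5 * * * *"     # how often a fresh worker is launched; tune for reaction time
  concurrencyPolicy: Allow    # overlapping runs are permitted -> "scales up" under load
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: worker
            image: repo/image:tag   # the worker exits on an empty queue -> "scales down"

(batch/v1 CronJob requires Kubernetes 1.21+; older clusters use batch/v1beta1.)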

This is a common pattern, and there are several ways to architect a solution.

A common solution is to have an app with a set of workers that are always polling your queue (this could be your Python script, but you need to turn it into a service). Generally you'll want a Kubernetes Deployment, possibly with a Horizontal Pod Autoscaler based on CPU or on some metric from your queue.

In your case, you'll want to make your script a daemon that polls the queue for new items (I assume you are already handling race conditions for parallelism). Then deploy this daemon using a Kubernetes Deployment, and you can scale it up and down based on metrics or on a schedule.
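
A minimal sketch of that setup, reusing the placeholder image from the question; the CPU target is an assumption, and scaling on actual queue depth would additionally require a custom/external metrics adapter:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: queue-worker
spec:
  replicas: 1
  selector:
    matchLabels:
      app: queue-worker
  template:
    metadata:
      labels:
        app: queue-worker
    spec:
      containers:
      - name: worker
        image: repo/image:tag
        resources:
          requests:
            cpu: 250m              # HPA utilization is measured against this request
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: queue-worker
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: queue-worker
  minReplicas: 1                   # note: an HPA cannot scale a Deployment to zero on its own
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70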

There are already job schedulers out there for many different languages, too. One that is very popular is Airflow, which already has the notion of 'workers', but this may be overkill for a single Python script.

You can use KEDA in a couple of ways for this:

Scaled deployments

This allows you to define the Kubernetes Deployment or StatefulSet that you want KEDA to scale based on a scale trigger. KEDA will monitor that service, and based on the events that occur it will automatically scale your resource out/in accordingly.

https://keda.sh/docs/2.9/concepts/scaling-deployments/
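
Since the question uses an Azure storage queue, a ScaledObject could look roughly like this; the queue name, target Deployment, and the environment variable holding the connection string are all assumptions for illustration:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: queue-worker-scaler
spec:
  scaleTargetRef:
    name: queue-worker            # the Deployment to scale
  minReplicaCount: 0              # KEDA can scale all the way to zero on an empty queue
  maxReplicaCount: 10
  triggers:
  - type: azure-queue
    metadata:
      queueName: myqueue
      queueLength: "5"            # target messages per replica
      connectionFromEnv: AzureWebJobsStorage  # env var on the target pods holding the connection string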

ScaledJob

You can also run and scale your code as Kubernetes Jobs. The primary reason to consider this option is to handle long-running executions. Rather than processing multiple events within a deployment, for each detected event a single Kubernetes Job is scheduled. That Job will initialize, pull a single event from the message source, process it to completion, and terminate.

https://keda.sh/docs/2.9/concepts/scaling-jobs/
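
A corresponding ScaledJob sketch, reusing the same placeholder image and queue; each triggered Job pulls and processes work, then exits, which matches the run-to-completion worker from the question:

apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: queue-worker-jobs
spec:
  jobTargetRef:
    template:
      spec:
        restartPolicy: Never
        containers:
        - name: worker
          image: repo/image:tag
  pollingInterval: 30             # seconds between queue checks
  maxReplicaCount: 10             # upper bound on concurrently running Jobs
  triggers:
  - type: azure-queue
    metadata:
      queueName: myqueue
      connectionFromEnv: AzureWebJobsStorage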
