
How to limit the maximum number of running Celery tasks by name

How do you limit the number of instances of a specific Celery task that can be run simultaneously?

I have a task that processes large files. I'm running into a problem where a user may launch several tasks, causing the server to run out of CPU and memory as it tries to process too many files at once. I want to ensure that only N instances of this one type of task are run at any given time, and that the remaining tasks sit queued in the scheduler until the running ones complete.

I see there's a rate_limit option in the task decorator, but I don't think this does what I want. If I'm understanding the docs correctly, this will just limit how quickly the tasks are launched, but it won't restrict the overall number of tasks running, so it will make my server crash more slowly...but it will still crash nonetheless.

What you can do is push these tasks to a dedicated queue and have X worker processes consuming it. With two worker processes consuming a queue of 100 items, at most two tasks are processed at the same time.
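To make the idea concrete, here is a small stand-alone illustration. It is not Celery code: it uses plain threads and a standard-library queue as stand-ins for workers and the broker queue, and run_pool is a hypothetical helper name. It only shows that a fixed pool of consumers caps how many jobs run at once, however many are queued.

```python
import queue
import threading
import time


def run_pool(num_workers, num_items):
    """Drain num_items jobs with num_workers threads; return (processed, peak)."""
    jobs = queue.Queue()
    for n in range(num_items):
        jobs.put(n)

    lock = threading.Lock()
    state = {"running": 0, "peak": 0, "processed": 0}

    def worker():
        while True:
            try:
                jobs.get_nowait()
            except queue.Empty:
                return  # queue drained, this worker exits
            with lock:
                state["running"] += 1
                state["peak"] = max(state["peak"], state["running"])
            time.sleep(0.001)  # stand-in for the expensive file processing
            with lock:
                state["running"] -= 1
                state["processed"] += 1

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["processed"], state["peak"]


# Two consumers, 100 queued jobs: every job is processed,
# but never more than two at the same moment.
processed, peak = run_pool(num_workers=2, num_items=100)
```

The same reasoning applies to a Celery queue served by a worker with a fixed concurrency: the queue length does not matter, only the number of consumers does.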

You have to set up an extra queue and set the desired concurrency level for it. From Routing Tasks:

# Old config style
CELERY_ROUTES = {
    'app.tasks.limited_task': {'queue': 'limited_queue'},
}

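and declare the queues:

from kombu import Exchange, Queue

# default_exchange was not defined in the original snippet;
# a direct exchange named 'default' is assumed here.
default_exchange = Exchange('default', type='direct')
# `celery` is the Celery app instance.
celery.conf.task_queues = (
    Queue('default', default_exchange, routing_key='default'),
    Queue('limited_queue', default_exchange, routing_key='limited_queue'),
)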
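For reference, the CELERY_ROUTES entry uses the old uppercase setting name. Celery 4 and later also accept lowercase names, where the equivalent setting is task_routes. A minimal sketch, assuming the settings live in a module (here hypothetically called celeryconfig.py) that the app loads with app.config_from_object('celeryconfig'):

```python
# celeryconfig.py (hypothetical module name)
# Lowercase setting name used by Celery 4+; the dict has the same
# shape as the old-style CELERY_ROUTES mapping.
task_routes = {
    "app.tasks.limited_task": {"queue": "limited_queue"},
}
```

The task name and queue name are the ones from the answer above; only the setting name changes.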

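Then start an extra worker that serves only limited_queue. The -c option sets its concurrency, i.e. how many of these tasks can run at once. Start your other workers with -Q default so they don't consume limited_queue as well: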

$ celery -A celery_app worker -Q limited_queue --loglevel=info -c 1 -n limited_queue

Then you can check that everything is running smoothly using Flower or the inspect command:

$ celery -A celery_app inspect --help

I am not sure you can do that in Celery. What you can do is check how many tasks of that name are currently running when a request arrives. If the count exceeds the maximum, either return an error or add a mechanism that periodically checks for open slots and runs the task. (If you add such a mechanism, you don't need to check twice: on each request, just add the task to its queue.)

In order to check running tasks, you can use the inspect command.

In short:

from celery import Celery

app = Celery(...)
i = app.control.inspect()
i.active()  # tasks currently being executed, keyed by worker name
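A rough sketch of the counting step, assuming the reply from i.active() is a dict that maps each worker name to a list of task dicts with a 'name' key (and None when no worker answers). The helper names count_active_by_name and has_free_slot are made up for illustration, and the sample reply is trimmed to the keys used here; real replies carry more fields.

```python
from collections import Counter


def count_active_by_name(active):
    """Count running tasks per task name from an inspect().active() reply."""
    counts = Counter()
    # Treat a None reply (no worker answered) as "nothing running".
    for tasks in (active or {}).values():
        for task in tasks:
            counts[task["name"]] += 1
    return counts


def has_free_slot(active, task_name, limit):
    """Return True if fewer than `limit` tasks named `task_name` are running."""
    return count_active_by_name(active)[task_name] < limit


# Trimmed example of what a reply might look like with two workers.
sample_reply = {
    "celery@worker1": [
        {"id": "a1", "name": "app.tasks.limited_task"},
        {"id": "a2", "name": "app.tasks.other_task"},
    ],
    "celery@worker2": [
        {"id": "b1", "name": "app.tasks.limited_task"},
    ],
}

counts = count_active_by_name(sample_reply)
```

In real use you would pass app.control.inspect().active() instead of sample_reply. Keep in mind that this check is not atomic: two requests arriving together can both see a free slot, so it is a soft limit rather than a hard guarantee.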
