Allow a task execution if it's not already scheduled using celery

I'm using Celery to handle task scheduling in a Django app I'm developing. I'm working with the Django database just for testing.

I have tried several approaches to make a task execute only if it is not already scheduled or in progress, such as the one proposed in this article, but nothing has worked so far.

Something like this:

task.py

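# Legacy Celery 3.x import; newer versions use shared_task or app.task instead
from celery.task import task

@task()
def add(x, y):
    return x + y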

Then, when you call it twice like this:

# add is a function, not a module, so it has to be imported with from ... import
from myapp.tasks import add

add.apply_async((2, 2), task_id=1, countdown=15)
add.apply_async((2, 2), task_id=2, countdown=15)

Only one instance should be allowed within the countdown=15 window. How can I make sure the second call never executes if another instance is already running or waiting?

One problem with the accepted answer is that it is slow. Checking whether a task is already running involves making a call to the broker and then iterating through both the scheduled and active tasks. If you want to queue up the task fast, this won't work. The current solution also has a small race condition: two processes could check at the same time whether the task has been queued, both find out that it hasn't, and then both queue it, leaving you with two tasks.

A better solution would be what I call debounced tasks. Basically, you increment a counter each time you queue a task. When the task starts, you decrement it. Use Redis and it is all atomic, since its increment and decrement commands are atomic.

For example:

Queue up the task:

conn = get_redis()
conn.incr(key)
task.apply_async(args=args, kwargs=kwargs, countdown=countdown)

Then in the task you have two options: execute the task 15 seconds after the first one was queued (throttle), or execute it 15 seconds after the last one was queued (debounce). That is, if we keep trying to run the same task, do we extend the timer, or do we just wait 15 seconds for the first one and ignore the other tasks that were queued?

It is easy to support both. Here is the debounce version, where we wait until the task stops getting queued:

conn = get_redis()
counter = conn.decr(key)
if counter > 0:
    # task is queued
    return
# continue on to rest of task
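
To see the debounce counter in action without a broker, here is a minimal sketch. It assumes an in-memory class standing in for the Redis connection, and a plain list standing in for the Celery queue. The names InMemoryCounter, enqueue, debounced_task and the key "debounce:add" are made up for illustration and are not part of Celery or redis-py.

```python
class InMemoryCounter:
    """Hypothetical stand-in for a Redis connection; only incr/decr are mimicked."""

    def __init__(self):
        self._data = {}

    def incr(self, key):
        self._data[key] = self._data.get(key, 0) + 1
        return self._data[key]

    def decr(self, key):
        self._data[key] = self._data.get(key, 0) - 1
        return self._data[key]


conn = InMemoryCounter()
executed = []


def enqueue(key, pending, payload):
    # Producer side: count the request, then "queue" it
    # (the list stands in for task.apply_async).
    conn.incr(key)
    pending.append(payload)


def debounced_task(key, payload):
    # Worker side: only the last queued request brings the counter down to 0.
    if conn.decr(key) > 0:
        return False  # a later request is still pending, let that one do the work
    executed.append(payload)
    return True


pending = []
for payload in ("first", "second", "third"):
    enqueue("debounce:add", pending, payload)

results = [debounced_task("debounce:add", p) for p in pending]
```

With three requests queued in a burst, only the last one gets past the counter check, which is the debounce behaviour described above.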

Throttle version:

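conn = get_redis()
counter = conn.getset(key, '0')
# redis-py returns bytes (or None for a missing key) unless decode_responses=True,
# so compare as an integer rather than against the string '0'
if counter is None or int(counter) == 0:
    # we already ran so ignore all the tasks that were queued since
    return
# continue on to task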
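
The throttle check can be tried out the same way. This is a rough sketch under the assumption that an in-memory class mimics the two Redis commands involved (INCR when queuing, GETSET in the task). InMemoryStore, throttled_task and the key "throttle:add" are hypothetical names used only for this illustration.

```python
class InMemoryStore:
    """Hypothetical stand-in for Redis; mimics incr and getset only."""

    def __init__(self):
        self._data = {}

    def incr(self, key):
        self._data[key] = str(int(self._data.get(key, "0")) + 1)
        return int(self._data[key])

    def getset(self, key, value):
        old = self._data.get(key)
        self._data[key] = value
        return old  # None when the key did not exist, like Redis GETSET


store = InMemoryStore()
ran = []


def throttled_task(key, payload):
    # Read the counter and reset it to 0 in a single step.
    previous = store.getset(key, "0")
    if previous is None or int(previous) == 0:
        return False  # an earlier task in this burst already did the work
    ran.append(payload)
    return True


for _ in range(3):
    store.incr("throttle:add")  # three requests queued in a burst

outcomes = [throttled_task("throttle:add", p) for p in ("first", "second", "third")]
```

Here the first task of the burst runs and resets the counter, so the two tasks queued behind it return immediately.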

Another benefit of this solution over the accepted answer is that the key is entirely under your control. For example, if you want the same task to run only once per id/object, you incorporate that id into your key.
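
As an illustration of building such a key, here is a small sketch. The helper name make_key and the "debounce:..." key layout are assumptions of mine, not something Celery or Redis prescribes; any scheme works as long as the producer and the task agree on it.

```python
def make_key(task_name, *parts):
    """Build a counter key like 'debounce:<task>:<part>:<part>' (layout is an assumption)."""
    return ":".join(["debounce", task_name] + [str(p) for p in parts])


# Two different users get two independent counters for the same task.
key_a = make_key("myapp.tasks.add", "user", 42)
key_b = make_key("myapp.tasks.add", "user", 43)
```

Because each object gets its own key, a burst of requests for one object is collapsed without blocking the same task for another object.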

Update

Thinking about this some more, the throttle version can be done even more easily, without having to queue up the extra tasks.

Throttle v2 (when queuing up the task)

conn = get_redis()
counter = conn.incr(key)
if counter == 1:
    # queue up the task only the first time
    task.apply_async(args=args, kwargs=kwargs, countdown=countdown)

Then in the task you set the counter back to 0.
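
Putting both halves of throttle v2 together, a minimal sketch could look like the following. It assumes an in-memory stand-in for Redis and a list in place of the Celery queue. Resetting the counter at the start of the task body is my own choice here (the text above only says to reset it in the task), and InMemoryStore, request_run and task_body are hypothetical names.

```python
class InMemoryStore:
    """Hypothetical stand-in for Redis; mimics incr and set only."""

    def __init__(self):
        self._data = {}

    def incr(self, key):
        self._data[key] = self._data.get(key, 0) + 1
        return self._data[key]

    def set(self, key, value):
        self._data[key] = int(value)


store = InMemoryStore()
queue = []  # stands in for task.apply_async


def request_run(key, payload):
    # Only the request that moves the counter from 0 to 1 queues a task.
    if store.incr(key) == 1:
        queue.append(payload)
        return True
    return False


def task_body(key, payload):
    # Reset the counter so that the next request can queue a task again.
    store.set(key, 0)
    return payload


first_burst = [request_run("throttle2:add", n) for n in (1, 2, 3)]
queued_after_burst = list(queue)
handled = task_body("throttle2:add", queue.pop(0))
second_burst = [request_run("throttle2:add", 4)]
```

Only the first request of a burst reaches the queue, and once the task has reset the counter, the next request is let through again.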

You don't even have to use a counter. If you have a Redis set, you can add the key to the set. If you get back 1, the key wasn't in the set and you should queue the task. If you get back 0, the key is already in the set, so don't queue the task.
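
The set variant can be sketched like this, assuming an in-memory class that mimics the return values of Redis SADD and SREM (1 when the member was added or removed, 0 otherwise). Removing the key inside the task is my assumption of how the set gets cleared again, and InMemorySets, request_run, task_body and the set name "queued-tasks" are hypothetical.

```python
class InMemorySets:
    """Hypothetical stand-in for Redis sets; mimics sadd and srem for one member."""

    def __init__(self):
        self._sets = {}

    def sadd(self, name, member):
        members = self._sets.setdefault(name, set())
        if member in members:
            return 0
        members.add(member)
        return 1

    def srem(self, name, member):
        members = self._sets.get(name, set())
        if member in members:
            members.remove(member)
            return 1
        return 0


conn = InMemorySets()
queue = []  # stands in for task.apply_async


def request_run(key):
    # sadd returns 1 only for the caller that actually inserted the key.
    if conn.sadd("queued-tasks", key) == 1:
        queue.append(key)
        return True
    return False


def task_body(key):
    # Drop the key so that later requests can queue the task again.
    conn.srem("queued-tasks", key)


attempts = [request_run("add:2:2") for _ in range(3)]
task_body(queue.pop(0))
after_run = request_run("add:2:2")
```

As with the counter, only one of the three attempts queues the task, and a new request is accepted after the task has removed its key.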

Look before you leap! You can check if there are any tasks running/waiting before you queue tasks.

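# Legacy import (celery.task.control was removed in Celery 5);
# on newer versions use app.control.inspect() instead
from celery.task.control import inspect

def is_running_waiting(task_name):
    """
    Check if a task is running or waiting.
    """
    i = inspect()
    # scheduled() and active() map each worker name to a list of tasks,
    # or return None when no worker replies
    for tasks in (i.scheduled() or {}).values():
        for task in tasks:
            # scheduled entries wrap the task info in a 'request' dict
            if task['request']['name'] == task_name:
                return True
    for tasks in (i.active() or {}).values():
        for task in tasks:
            # active entries carry the task name at the top level
            if task['name'] == task_name:
                return True
    return False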

Now if you queue three add tasks, the first one will be queued for execution and the remaining ones won't be queued.

for i in range(3):
    # workers report the registered task name, which by default is the full module path
    if not is_running_waiting('myapp.tasks.add'):
        add.apply_async((2, 2), countdown=15)
