Celery chord executed before all tasks are completed

I have a bunch of tasks that can be executed simultaneously, but once they have all finished I want to execute a final one. I am using the following code:

chunk_tasks = []
for index, chunk in enumerate(chunks):
    chunk_tasks.append(import_chunk.s(meta.pk))

g = group(chunk_tasks)
chord(g)(import_completed.s(meta.pk, max_lines=max_lines))

However, it looks like import_completed is executed before all of the tasks have completed. The import_chunk task looks like this:

@task(bind=True, ignore_result=IGNORE_RESULTS)
def import_chunk(self, meta_pk):
    try:
        # do some stuff
    except Exception as e:
        if self.max_retries == self.request.retries:
            logger.exception('Unexpected error in import_chunk')
        raise self.retry(countdown=60, max_retries=3)

So the question is: what am I doing wrong?

A chord is a task that executes only after all of the tasks in a group have finished executing. To synchronize, it therefore needs the state of the tasks in its header.

But when you set ignore_result on a task, the worker does not store the state or the return value of that task.

Depending on your workflow, this can lead to tasks being retried, exceptions being raised, or other malfunctions.
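To make the dependency on stored results concrete, here is a toy, Celery-free sketch of the chord pattern built on Python's standard concurrent.futures module. It is not how Celery implements chords, and the names run_chord and store_results are hypothetical. The header tasks write their return values to a shared store, and the callback is fed from whatever the store contains, so when nothing is stored the callback has nothing to work with.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def add(x, y):
    return x + y


def tsum(numbers):
    return sum(numbers)


def run_chord(header_args, callback, store_results=True):
    """Toy chord: run add() over header_args in parallel, then call callback.

    store_results (hypothetical name) plays the role of ignore_result=False.
    """
    store = {}  # stands in for a result backend

    def run_header_task(task_id, args):
        value = add(*args)
        if store_results:
            store[task_id] = value  # only stored results are visible later

    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(run_header_task, i, args)
                   for i, args in enumerate(header_args)]
        wait(futures)  # synchronization point: every header task has finished

    missing = [i for i in range(len(header_args)) if i not in store]
    if missing:
        raise RuntimeError("no stored result for header tasks %s" % missing)
    return callback([store[i] for i in range(len(header_args))])


header = [(i, i) for i in range(10)]
total = run_chord(header, tsum)  # 0 + 2 + 4 + ... + 18 = 90
```

Calling run_chord(header, tsum, store_results=False) raises RuntimeError in this toy model. A real Celery deployment may fail differently, depending on the result backend.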

So chord(add.s(i, i) for i in range(10))(tsum.s()).get() is perfectly valid and returns a result in CASE 1, but it causes trouble in CASE 2.

CASE 1:

@app.task
def add(x, y):
    return x + y

@app.task
def tsum(numbers):
    return sum(numbers)

CASE 2:

@app.task(ignore_result=True)
def add(x, y):
    return x + y

@app.task(ignore_result=True)
def tsum(numbers):
    return sum(numbers)

So you have to either change ignore_result or change the workflow of your tasks.

From the docs:

You should avoid using chords as much as possible. Still, the chord is a powerful primitive to have in your toolbox as synchronization is a required step for many parallel algorithms.
