
Poor performance in simple tasks using Celery

I am currently seeing subpar performance in the following use case:

I have two files - tasks.py

# tasks.py
from celery import Celery

app = Celery('tasks', broker='pyamqp://guest@localhost//', backend='rpc://', worker_prefetch_multiplier=1)

@app.task
def task(array_of_elements):
    return [x ** 2 for x in array_of_elements]

and run.py

# run.py
from celery import group
from itertools import chain, repeat
from tasks import task
import time

def grouper(n, iterable, padvalue=None):
    return zip(*[chain(iterable, repeat(padvalue, n-1))]*n)

def fun1(x):
    return x ** 2

if __name__ == '__main__':
    start = time.time()
    items = [list(x) for x in grouper(10000, range(10000))]
    x = group([task.s(item) for item in items])
    r = x.apply_async()
    d = r.get()
    end = time.time()
    print(f'>celery: {end-start} seconds')

    start = time.time()
    res = [fun1(x) for x in range(10000)]
    end = time.time()
    print(f'>normal: {end-start} seconds')
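As an aside, the `grouper` helper chunks an iterable by zipping n references to one shared iterator, padding the last chunk with `padvalue`. Note that with `n` equal to the input length, as in the call above, it yields a single chunk, so the group contains only one task. A small example makes the behavior easy to see:

```python
from itertools import chain, repeat

def grouper(n, iterable, padvalue=None):
    # n references to ONE shared chained iterator: zip pulls consecutive
    # items into consecutive slots of each output tuple
    return zip(*[chain(iterable, repeat(padvalue, n - 1))] * n)

print([list(x) for x in grouper(3, range(7))])
# [[0, 1, 2], [3, 4, 5], [6, None, None]]
```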

When I run the Celery worker: celery -A tasks worker --loglevel=info

and then run:

python run.py

This is the output I get:

>celery: 0.19174742698669434 seconds
>normal: 0.004475116729736328 seconds

I have no idea why the performance is worse with Celery.

I am trying to understand how to achieve a map-reduce paradigm using Celery: split a huge array into smaller chunks, do some processing on each chunk, and bring the results back together.

Am I missing some critical configuration?

The map-reduce paradigm is not supposed to be faster; it is supposed to scale better.

There is always overhead for an MR job compared to a local job doing the same computation: process scheduling, communication, shuffling, etc.
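One of those fixed costs is easy to measure locally: Celery serializes task arguments and results (JSON by default) for every message. A rough sketch of just the serialization component, ignoring broker round-trips and scheduling, already shows it can dwarf the computation itself for a trivial task like squaring:

```python
import json
import time

payload = list(range(10000))

# local run: just the computation
start = time.time()
local = [x ** 2 for x in payload]
t_local = time.time() - start

# simulate the serialize -> compute -> serialize cycle of one task message
start = time.time()
body = json.dumps(payload)                       # argument serialization
via_message = [x ** 2 for x in json.loads(body)] # worker-side compute
reply = json.dumps(via_message)                  # result serialization
t_message = time.time() - start
```

Here `t_message` exceeds `t_local` even before any network or scheduling cost is added.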

Your benchmark is not relevant, because MR and local runs are each the right approach for different data-set sizes. At some point you switch from a local run to an MR approach because your dataset has become too large for one node.
