How to parallelize a loop with Dask?

I find the Dask documentation quite confusing. Let's say I have a function:

import random
import dask

def my_function(arg1, arg2, arg3):
    val1 = random.uniform(arg1, arg2)
    val2 = random.uniform(arg2, arg3)
    return val1 + val2

some_list = []
for i in range(100):
    some_num = dask.delayed(my_function)(arg1, arg2, arg3)
    some_list += [some_num]

computed_list = dask.compute(*some_list)

This fails because my_function() does not get all 3 arguments.

How can I parallelize this snippet of code in Dask?
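
For reference, here is a minimal sketch of the pattern the snippet above is reaching for. It assumes the three arguments are plain numbers defined before the loop (the values 1, 2, 3 are illustrative). Each `dask.delayed(my_function)(...)` call only records a task, and `dask.compute` then runs them all. The `scheduler="threads"` argument is a choice made for this sketch so that it runs locally without a cluster.

```python
import random

import dask


def my_function(arg1, arg2, arg3):
    # Sum of two random draws from adjacent intervals.
    val1 = random.uniform(arg1, arg2)
    val2 = random.uniform(arg2, arg3)
    return val1 + val2


# Illustrative values; the question does not say what the real ones are.
arg1, arg2, arg3 = 1, 2, 3

# Build 100 lazy tasks; nothing is executed at this point.
tasks = [dask.delayed(my_function)(arg1, arg2, arg3) for _ in range(100)]

# Run all tasks at once; dask.compute returns a tuple of results.
computed_list = dask.compute(*tasks, scheduler="threads")
```

Note that `dask.compute` returns a tuple here, one entry per task, so `computed_list` has 100 elements.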


EDIT:

It seems to work if you put a @dask.delayed decorator on top of the function definition and call it normally, but now the line that calls .compute() throws:

KilledWorker: ('my_function-ac3c88f1-53f8-4d36-a520-ff8c40c6ee61', <Worker 'tcp://127.0.0.1:35925', name: 1, memory: 0, processing: 10>)

I build a graph first and then call compute on it:

import random
import dask

@dask.delayed
def my_function(arg1, arg2, arg3):
    val1 = random.uniform(arg1, arg2) 
    val2 = random.uniform(arg2, arg3)
    return val1 + val2

arg1 = 1
arg2 = 2
arg3 = 3

some_list = []
for i in range(10):
    some_num = my_function(arg1, arg2, arg3)
    some_list.append(some_num)

graph = dask.delayed(some_list)
# graph.visualize()
computed_list = graph.compute()
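
One way to narrow down a KilledWorker error is to check whether the graph computes at all outside the distributed cluster. The following is a minimal sketch, assuming the same function and argument values as above. The `scheduler="synchronous"` keyword runs every task in the calling thread, so any exception raised by the task code surfaces as an ordinary traceback.

```python
import random

import dask


@dask.delayed
def my_function(arg1, arg2, arg3):
    val1 = random.uniform(arg1, arg2)
    val2 = random.uniform(arg2, arg3)
    return val1 + val2


# Same graph as in the question: ten delayed calls wrapped in one Delayed list.
some_list = [my_function(1, 2, 3) for _ in range(10)]
graph = dask.delayed(some_list)

# Bypass any distributed workers and compute in the current thread.
computed_list = graph.compute(scheduler="synchronous")
```

If this runs cleanly, the task code itself is probably fine and the failure more likely comes from the worker processes (for example memory limits or mismatched environments). That reading is an assumption to check, not a diagnosis.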


 