
IPython Parallel LoadBalancedView GIL

I use a LoadBalancedView from IPython.parallel to call a function on an iterable, i.e.:

from IPython.parallel import Client
from functools import partial

rc = Client()
lview = rc.load_balanced_view()
lview.block = True

def func(arg0, arg1, arg2):
    return func2(arg0) + arg1 + arg2

def func2(arg0):
    return 2*arg0

answer = lview.map(partial(func, arg1=A, arg2=B), iterable_data)

Does the fact that func calls func2 prevent func from being executed in parallel (i.e., does the GIL come into play)? I assume that when you call map, each cluster node gets a copy of func, but do they also get copies of func2? Further, does the fact that I use functools.partial cause any problems?

Does the fact that func calls func2 prevent func from being executed in parallel (i.e., does the GIL come into play)?

Not at all. The GIL is not relevant here, nor is it ever relevant to the parallelism in IPython.parallel, because each engine is a separate process with its own interpreter. The GIL only comes up when coordinating threads within each engine, or within the Client process itself.
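To make the process point concrete, here is a minimal sketch that uses plain child interpreters as stand-ins for engines (an assumption for illustration; it does not start a real IPython cluster). The helper name spawn_pids is hypothetical. Each child reports its own PID, showing that it is an independent process with its own interpreter, and therefore its own GIL.

```python
import os
import subprocess
import sys


def spawn_pids(n):
    """Start n child interpreters and return the PID each one reports."""
    children = [
        subprocess.Popen(
            [sys.executable, "-c", "import os; print(os.getpid())"],
            stdout=subprocess.PIPE,
            text=True,
        )
        for _ in range(n)
    ]
    # All children are started before any output is collected.
    return [int(child.communicate()[0]) for child in children]


# Two stand-in "engines": separate processes, separate interpreters.
pids = spawn_pids(2)
print("parent:", os.getpid(), "children:", pids)
```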

I assume that when you call map, each cluster node gets a copy of func, but do they also get copies of func2?

It should, but this is actually where your code will have a problem. IPython does not automatically track closures and code dependencies in the interactive namespace, so you will see:

AttributeError: 'DummyMod' object has no attribute 'func'

This is because partial(func, arg1=A, arg2=B) contains a reference to __main__.func, not the code of the local func itself. When the partial arrives on the engine, it is deserialized and __main__.func is looked up, but it is undefined there (__main__ is the interactive namespace on the engine). You can address this simply by ensuring that func and func2 are defined on the engines:

rc[:].push(dict(func=func, func2=func2))

At that point, your map should behave as expected.
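The failure and the fix can be reproduced without a cluster using the standard pickle module. This is only a sketch under stated assumptions: the module name fake_main is a hypothetical stand-in for the interactive __main__ namespace, and plain pickle is assumed to be representative of how the partial travels to an engine. It shows that the pickled partial carries the function's name rather than its code, that loading it fails with an AttributeError when the name is missing, and that defining the name (the role push plays on real engines) makes it load.

```python
import pickle
import sys
import types
from functools import partial

# Hypothetical stand-in for the interactive namespace (__main__).
client_ns = types.ModuleType("fake_main")
sys.modules["fake_main"] = client_ns


def func(arg0, arg1, arg2):
    return 2 * arg0 + arg1 + arg2


# Make func look as if it had been defined in that namespace.
func.__module__ = "fake_main"
func.__qualname__ = "func"
client_ns.func = func

# The payload stores the module and function names, not the function body.
payload = pickle.dumps(partial(func, arg1=1, arg2=2))
stores_reference = b"fake_main" in payload and b"func" in payload

# Simulate an engine where func was never defined.
del client_ns.func
error = None
try:
    pickle.loads(payload)
except AttributeError as exc:
    error = exc

# Simulate the fix: define func in the namespace, then load again.
client_ns.func = func
restored = pickle.loads(payload)
result = restored(3)
```

The same reasoning applies to func2: func looks it up by name when it runs, so it also has to exist in the namespace where func executes.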

If you instruct IPython to use the enhanced pickling library dill, it will get closer to not having to manually send references, but it doesn't cover every case.
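Putting the push-based fix together with the code from the question might look like the sketch below. The wrapper name run_on_cluster and its parameters a and b are hypothetical, it assumes a cluster is already running, and it is meant to be used from an interactive session or script as in the question. The block=True argument on push is an added precaution so that the definitions are in place before map is called.

```python
from functools import partial


def func2(arg0):
    return 2 * arg0


def func(arg0, arg1, arg2):
    return func2(arg0) + arg1 + arg2


def run_on_cluster(iterable_data, a, b):
    """Push func and func2 to all engines, then map over the data."""
    # Imported here so the helpers above can be checked without a cluster.
    # Newer releases ship this package separately as ipyparallel.
    from IPython.parallel import Client

    rc = Client()
    # Define both functions in every engine's namespace before mapping.
    rc[:].push(dict(func=func, func2=func2), block=True)

    lview = rc.load_balanced_view()
    lview.block = True
    return lview.map(partial(func, arg1=a, arg2=b), iterable_data)
```

A call would then look like run_on_cluster(range(10), 1, 2), with the values chosen here purely for illustration.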
