
Multiprocessing pool 'apply_async' only seems to call function once

I've been following the docs to try to understand multiprocessing pools. I came up with this:

import time
from multiprocessing import Pool

def f(a):
    print 'f(' + str(a) + ')'
    return True

t = time.time()
pool = Pool(processes=10)
result = pool.apply_async(f, (1,))
print result.get()
pool.close()
print ' [i] Time elapsed ' + str(time.time() - t)

I'm trying to use 10 processes to evaluate the function f(a). I've put a print statement in f.

This is the output I'm getting:

$ python pooltest.py 
f(1)
True
 [i] Time elapsed 0.0270888805389

It appears to me that the function f is only getting evaluated once.

I'm likely not using the right method, but the end result I'm looking for is to run f with 10 processes simultaneously and get the result returned by each one of those processes. So I would end up with a list of 10 results (which may or may not be identical).

The docs on multiprocessing are quite confusing, and it's not trivial to figure out which approach I should be taking; it seems to me that f should be run 10 times in the example above.

apply_async isn't meant to launch multiple processes; it's just meant to call the function with the given arguments in one of the pool's processes. You'll need to make 10 calls if you want the function to be called 10 times.
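
For example, a minimal sketch of the pattern the question seems to be after (ten separate calls yielding ten results), assuming the f defined in the question:

pool = Pool(processes=10)

# Issue ten independent apply_async calls; each one is dispatched
# to some worker process in the pool.
async_results = [pool.apply_async(f, (1,)) for _ in range(10)]

# get() blocks until the corresponding call has finished.
results = [r.get() for r in async_results]
print(results)

pool.close()
pool.join()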

First, note the docs on apply() (emphasis added):

 apply(func[, args[, kwds]]) 

Call func with arguments args and keyword arguments kwds. It blocks until the result is ready. Given this blocks, apply_async() is better suited for performing work in parallel. Additionally, func is only executed in one of the workers of the pool.

Now, in the docs for apply_async():

 apply_async(func[, args[, kwds[, callback[, error_callback]]]]) 

A variant of the apply() method which returns a result object.

The difference between the two is just that apply_async returns immediately. You can use map() to call a function multiple times, though if you're calling with the same inputs, it's a little redundant to create a list of the same argument just to have a sequence of the right length.
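
For instance, a minimal sketch of that redundant-but-workable approach, reusing the f and pool from the question:

# Call f ten times with the same argument; map blocks until all
# ten calls have completed and returns their results in order.
results = pool.map(f, [1] * 10)
print(results)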

However, if you're calling different functions with the same input, then you're really just calling a higher-order function, and you could do it with map or map_async() like this:

pool.map(lambda f: f(1), functions)

except that lambda functions aren't pickleable, so you'd need to use a function defined at module level (see How to let Pool.map take a lambda function). Note also that map passes each item of the iterable to the function as a single argument, so the builtin apply() (which is deprecated anyway) can't simply be dropped in here; it's easy enough to write your own small wrapper that unpacks a (function, argument) tuple instead:

def apply_(args):
    # map passes each (function, argument) pair as one tuple,
    # so unpack it here and make the call.
    f, x = args
    return f(x)

pool.map(apply_, [(f, 1) for f in functions])

Each time you write pool.apply_async(...) it will delegate that function call to one of the processes that was started in the pool. If you want to call the function in multiple processes, you need to issue multiple pool.apply_async calls.

Note, there also exists a pool.map (and pool.map_async) function which will take a function and an iterable of inputs:

inputs = range(30)
results = pool.map(f, inputs)

These functions apply the given function to each input in the inputs iterable. They attempt to put "batches" of work into the pool so that the load gets balanced fairly evenly among all the processes in the pool.
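
If you want to influence that batching yourself, map() also accepts an optional chunksize argument; a small sketch, assuming the f and inputs above:

# Hand each worker chunks of 5 inputs at a time instead of letting
# the pool pick a chunk size automatically.
results = pool.map(f, inputs, chunksize=5)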

If you want to run a single piece of code in ten processes, each of which then exits, a Pool of ten processes is probably not the right thing to use.

Instead, create ten Process instances to run the code:

import multiprocessing

processes = []

# Start ten independent worker processes, all running f(1).
for _ in range(10):
    p = multiprocessing.Process(target=f, args=(1,))
    p.start()
    processes.append(p)

# Wait for all of them to finish.
for p in processes:
    p.join()

The multiprocessing.Pool class is designed to handle situations where the number of processes and the number of jobs are unrelated. Often the number of processes is chosen to match the number of CPU cores you have, while the number of jobs is much larger.
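
As a rough sketch of that typical shape (the worker function and job count here are made up for illustration):

import multiprocessing

def square(x):
    return x * x

if __name__ == '__main__':
    # One worker per CPU core, but many more jobs than workers.
    pool = multiprocessing.Pool(processes=multiprocessing.cpu_count())
    results = pool.map(square, range(1000))
    pool.close()
    pool.join()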

If you aren't committed to Pool for any particular reason, I've written a function around multiprocessing.Process that will probably do the trick for you. It's posted here, but I'd be happy to upload the most recent version to GitHub if you want it.
