Python: using map and multiprocessing
I'm trying to write a function that takes two arguments, and then pass it to multiprocessing.Pool to parallelize it. I ran into some complications when I tried to write this simple function.
from multiprocessing import Pool
import pandas as pd

df = pd.DataFrame()
df['ind'] = [111, 222, 333, 444, 555, 666, 777, 888]
df['ind1'] = [111, 444, 222, 555, 777, 333, 666, 777]

def mult(elem1, elem2):
    return elem1 * elem2

if __name__ == '__main__':
    pool = Pool(processes=4)
    # This is the call that raises the error shown below
    print(pool.map(mult, df.ind.astype(int).values.tolist(), df.ind1.astype(int).values.tolist()))
    pool.terminate()
It's returning an error:
TypeError: unsupported operand type(s) for //: 'int' and 'list'
I can't understand what's wrong. Can anybody explain what this error means and how I can fix it?
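Pool.map takes a single iterable and passes one item from it to your function per call, so the function can only accept one argument. The third positional parameter of Pool.map is chunksize, so your second list was treated as the chunk size. That is where the TypeError comes from: the pool tried to divide the input length by a list. You can fix this by doing the following: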
from multiprocessing import Pool
import pandas as pd

df = pd.DataFrame()
df['ind'] = [111, 222, 333, 444, 555, 666, 777, 888]
df['ind1'] = [111, 444, 222, 555, 777, 333, 666, 777]

def mult(elements):
    # Each item handed over by pool.map is one (elem1, elem2) pair
    elem1, elem2 = elements
    return elem1 * elem2

if __name__ == '__main__':
    pool = Pool(processes=4)
    # Pair up the two columns so every task carries both arguments
    inputs = zip(df.ind.astype(int).values.tolist(), df.ind1.astype(int).values.tolist())
    print(pool.map(mult, inputs))
    pool.terminate()
What I've done here is zip your two iterables into a single sequence of pairs, with each pair holding the two arguments you wanted to pass in. I then changed your function to take one pair and unpack it, so that both values can be processed.
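If you are on Python 3.3 or later, Pool.starmap may let you keep the original two-argument signature, because it unpacks each tuple into positional arguments for you. The following is only a minimal sketch under that assumption: parallel_mult is a hypothetical helper name, and plain lists stand in for the DataFrame columns.

```python
from multiprocessing import Pool


def mult(elem1, elem2):
    # Defined at module level so it can be pickled and sent to worker processes
    return elem1 * elem2


def parallel_mult(xs, ys, processes=4):
    # Hypothetical helper: starmap calls mult(x, y) for every (x, y) pair
    with Pool(processes=processes) as pool:
        return pool.starmap(mult, zip(xs, ys))


if __name__ == '__main__':
    ind = [111, 222, 333, 444, 555, 666, 777, 888]
    ind1 = [111, 444, 222, 555, 777, 333, 666, 777]
    print(parallel_mult(ind, ind1))
```

With this variant mult does not need to unpack a tuple itself, so the same function can still be called directly as mult(a, b) outside the pool.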