Multiprocess a function in Python that takes multiple parameters
I'm trying to use the multiprocessing library in Python, but I ran into some difficulties:
def request_solr(limit=10, offset=10):
    # build my facets here using limit and offset
    # request solr
    return response.json()

def get_list_event_per_user_per_mpm(limit=100):
    nb_unique_user = get_unique_user()
    print "Unique user: ", nb_unique_user
    processor_pool = multiprocessing.Pool(4)
    offset = range(0, nb_unique_user, limit)
    list_event_per_user = processor_pool.map(request_solr(limit), offset)
    return list_event_per_user
I'm not sure how to pass the second parameter into the function. How can I make it work? I get the following error:
TypeError: 'dict' object is not callable
You see that error because you are calling the function before passing it to multiprocessing: request_solr(limit) runs immediately and returns a dict, and map then tries to call that dict once per offset.
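The failure can be reproduced without any worker processes. The following is a minimal sketch using the built-in map and a stub request_solr that only returns a dict; the stub body is an assumption standing in for the real Solr request, and not_a_function, error_message and results are illustrative names.

```python
def request_solr(limit=10, offset=10):
    # Stub standing in for the real Solr request (assumption for illustration).
    return {"limit": limit, "offset": offset}


# request_solr(100) is evaluated first, so map receives a dict, not a function.
not_a_function = request_solr(100)

try:
    list(map(not_a_function, [0, 100, 200]))
except TypeError as exc:
    error_message = str(exc)

# Passing the function object itself (no parentheses) is what map expects.
# Each value is then bound to the first parameter, limit, which is why a
# second argument still needs to be supplied some other way.
results = list(map(request_solr, [0, 100, 200]))
```

The same rule applies to Pool.map: its first argument has to be something callable, not the value a call has already returned.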
I suggest you use starmap in combination with itertools.repeat:
import itertools as it
# rest of your code
processor_pool = multiprocessing.Pool(4)
offset = range(0, nb_unique_user, limit)
list_event_per_user = processor_pool.starmap(request_solr, zip(it.repeat(limit), offset))
starmap will call your function, expanding each pair of values into two arguments. repeat(limit) simply produces an iterable whose elements are all equal to limit.
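Put together, the starmap approach might look like the sketch below. It assumes Python 3.3 or later, uses a stub request_solr in place of the real Solr request, and takes nb_unique_user as a parameter instead of calling get_unique_user(); build_arguments is a hypothetical helper introduced only to make the argument pairs visible.

```python
import itertools as it
import multiprocessing


def request_solr(limit=10, offset=10):
    # Stub standing in for the real Solr request (assumption for illustration).
    return {"limit": limit, "offset": offset}


def build_arguments(limit, nb_unique_user):
    # Pair the constant limit with every offset: (limit, 0), (limit, limit), ...
    offsets = range(0, nb_unique_user, limit)
    return list(zip(it.repeat(limit), offsets))


def get_list_event_per_user_per_mpm(limit=100, nb_unique_user=250):
    # starmap unpacks each (limit, offset) tuple into two positional arguments.
    with multiprocessing.Pool(4) as pool:
        return pool.starmap(request_solr, build_arguments(limit, nb_unique_user))


if __name__ == "__main__":
    for page in get_list_event_per_user_per_mpm():
        print(page)
```

zip stops when the shortest iterable is exhausted, so the endless repeat(limit) is cut off at the number of offsets.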
This works for any number of arguments:
from multiprocessing import Pool

def my_function(a, b, c, d, e):
    return a + b + c + d + e

pool = Pool()
pool.starmap(my_function, [(1, 2, 3, 4, 5)])  # calls my_function(1, 2, 3, 4, 5)
Since you are using an old version of Python (Pool.starmap was added in Python 3.3), you have to work around this by either modifying your function or using a wrapper function:
def wrapper(arguments):
    # unpack the (limit, offset) tuple into positional arguments
    return request_solr(*arguments)

# later:
pool.map(wrapper, zip(it.repeat(limit), offset))
The way you're doing it right now, map receives the result of request_solr(limit) and tries to call it as a function with offset as the argument. What you want instead is a callable that binds limit and takes only the offset, which in its simplest form is a lambda:
processor_pool.map(lambda x: request_solr(limit, x), offset)
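Note, however, that Pool.map pickles the callable in order to send it to the worker processes, and the standard pickle module cannot serialize a lambda, so the lambda version fails with a pickling error. You need a picklable function object defined at module level instead. For example: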
class RequestSolrCaller:
    def __init__(self, limit):
        self.limit = limit

    def __call__(self, offset):
        return request_solr(self.limit, offset)

processor_pool.map(RequestSolrCaller(limit), offset)
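The callable class above does by hand what functools.partial from the standard library does. The sketch below swaps in partial, which is not part of the original answers. It assumes Python 3 and a stub request_solr, and make_caller is a hypothetical helper name.

```python
import functools
import multiprocessing


def request_solr(limit=10, offset=10):
    # Stub standing in for the real Solr request (assumption for illustration).
    return {"limit": limit, "offset": offset}


def make_caller(limit):
    # Bind limit as the first positional argument; the value that map supplies
    # then fills the next positional parameter, offset.
    return functools.partial(request_solr, limit)


if __name__ == "__main__":
    caller = make_caller(100)
    with multiprocessing.Pool(4) as pool:
        print(pool.map(caller, range(0, 250, 100)))
```

This relies on request_solr being defined at module level, because the partial object is pickled together with a reference to the function it wraps.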
I have used a generator to produce the keyword arguments. This is the content of my simple_multiproc.py.
Note the importance of having request_solr at module level: the worker processes locate the function by its importable name, so it cannot be nested inside another function.
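import multiprocessing

MAX = 5

def _get_pool_args(**kw):
    # yield one dict of keyword arguments per task
    for _ in range(MAX):
        r = {"limit": 10, "offset": 10}
        r.update(kw)
        yield r

def request_solr(limit=10, offset=10):
    # build my facets here using limit and offset
    # request solr
    print(locals())
    # return response.json() once the request above is filled in

def _request_solr_kw(kw):
    # map passes each dict as one positional argument, so unpack it here
    return request_solr(**kw)

if __name__ == "__main__":
    pool = multiprocessing.Pool(MAX)
    pool.map(_request_solr_kw, _get_pool_args())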