
Multiprocess a function in Python that takes multiple parameters

I'm trying to use the multiprocessing library in Python, but I ran into some difficulties:

def request_solr(limit=10, offset=10):
    # build my facets here using limit and offset
    # request solr
    return response.json()

def get_list_event_per_user_per_mpm(limit=100):
    nb_unique_user = get_unique_user()
    print "Unique user: ", nb_unique_user
    processor_pool = multiprocessing.Pool(4)
    offset = range(0, nb_unique_user, limit)
    list_event_per_user = processor_pool.map(request_solr(limit), offset)
    return list_event_per_user

I'm not sure how to pass the second parameter into the function. How can I make it work? I get the following error:

TypeError: 'dict' object is not callable

You see that error because you are calling the function before passing it to multiprocessing: request_solr(limit) runs immediately and returns a dict, and map then tries to call that dict as if it were the function.
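The failure can be reproduced without any pool at all. The sketch below uses a stand-in request_solr that simply echoes its arguments (an assumption for illustration; the real function queries Solr) and shows that the value handed to map is a dict, not a function:

```python
def request_solr(limit=10, offset=10):
    # Stand-in for the real Solr request: just echo the arguments.
    return {"limit": limit, "offset": offset}

# request_solr(100) is evaluated right away, so this is a dict, not a callable.
not_a_function = request_solr(100)

try:
    # This is what map ends up doing for each offset.
    not_a_function(0)
except TypeError as exc:
    message = str(exc)

print(message)
```

Passing request_solr itself (without parentheses) is what lets the pool do the calling.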

I suggest you use starmap in combination with itertools.repeat:

import itertools as it

# rest of your code

processor_pool = multiprocessing.Pool(4)
offset = range(0, nb_unique_user, limit)
list_event_per_user = processor_pool.starmap(request_solr, zip(it.repeat(limit), offset))

starmap calls your function with each pair of values expanded into two arguments. repeat(limit) simply produces an iterable whose elements are all equal to limit.
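Put together as a self-contained script, the approach might look like the sketch below. It assumes Python 3.3 or later; request_solr is again a stand-in that echoes its arguments, and build_args and the numbers 100 and 350 are hypothetical values chosen for illustration:

```python
import itertools as it
import multiprocessing

def request_solr(limit=10, offset=10):
    # Stand-in for the real Solr request: just echo the arguments.
    return {"limit": limit, "offset": offset}

def build_args(limit, nb_unique_user):
    # One (limit, offset) pair per page; zip stops when the range runs out,
    # so the endless repeat(limit) is safe here.
    return list(zip(it.repeat(limit), range(0, nb_unique_user, limit)))

if __name__ == "__main__":
    with multiprocessing.Pool(4) as pool:
        # Each pair is unpacked into request_solr(limit, offset).
        pages = pool.starmap(request_solr, build_args(100, 350))
    print(pages)
```

Keeping the pool under the `if __name__ == "__main__":` guard matters on platforms that start workers by re-importing the main module.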

This works for any number of arguments:

from multiprocessing import Pool

def my_function(a, b, c, d, e):
    return a+b+c+d+e

pool = Pool()
pool.starmap(my_function, [(1,2,3,4,5)])   # calls my_function(1,2,3,4,5)

Since you are using an old version of Python (Pool.starmap was added in Python 3.3), you have to work around this by either modifying your function or using a wrapper function:

def wrapper(arguments):
    return request_solr(*arguments)

# later (with itertools imported as it, as above):

pool.map(wrapper, zip(it.repeat(limit), offset))
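A further way to fix the first argument, offered here as a sketch rather than as part of the answer above, is functools.partial. A partial object wrapping a module-level function can be pickled, so the same callable can be handed to pool.map. request_solr is a stand-in that echoes its arguments, and request_page is a hypothetical name:

```python
import functools

def request_solr(limit=10, offset=10):
    # Stand-in for the real Solr request: just echo the arguments.
    return {"limit": limit, "offset": offset}

# Bind limit once; the resulting callable takes only the offset.
request_page = functools.partial(request_solr, 100)

# The built-in map shows the call pattern; pool.map(request_page, offsets)
# would call the function the same way, one offset per task.
pages = list(map(request_page, range(0, 300, 100)))
print(pages)
```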

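You need to pass map a callable that takes only the offset. The way you're doing it right now, it's trying to map the result of request_solr as a function with offset as the argument.

A lambda is the obvious way to write that callable:

processor_pool.map(lambda x: request_solr(limit, x), offset)

Note, however, that multiprocessing pickles the callable in order to send it to the worker processes, and the standard pickle module cannot serialize a lambda, so this line fails with a pickling error on a multiprocessing.Pool (it is fine with a thread-based pool such as multiprocessing.pool.ThreadPool). For a process pool you need to create a picklable function object defined at module level. For example: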

class RequestSolrCaller(object):
    def __init__(self, limit):
        self.limit = limit
    def __call__(self, offset):
        return request_solr(self.limit, offset)

processor_pool.map(RequestSolrCaller(limit), offset)
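One thing worth checking before handing any callable to a process pool is whether it can be pickled, since that is how it reaches the workers. The sketch below compares a lambda with the function object, using a stand-in request_solr and a hypothetical helper named can_pickle:

```python
import pickle

def request_solr(limit=10, offset=10):
    # Stand-in for the real Solr request: just echo the arguments.
    return {"limit": limit, "offset": offset}

class RequestSolrCaller(object):
    def __init__(self, limit):
        self.limit = limit
    def __call__(self, offset):
        return request_solr(self.limit, offset)

def can_pickle(obj):
    # Report whether the standard pickle module can serialize obj.
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        return False

lambda_ok = can_pickle(lambda x: request_solr(100, x))
caller_ok = can_pickle(RequestSolrCaller(100))
print(lambda_ok, caller_ok)
```

The instance pickles because its class can be looked up by name in its module, which a lambda cannot.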

I used a generator to produce the keyword arguments. This is the content of my simple_multiproc.py.

Note the importance of having request_solr at module level: the worker processes have to be able to look it up by name.

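import multiprocessing

MAX = 5

def _get_pool_args(**kw):
    # Yield one keyword-argument dict per task; kw overrides the defaults.
    for _ in range(MAX):
        r = {"limit": 10, "offset": 10}
        r.update(kw)
        yield r


def request_solr(limit=10, offset=10):
    # build my facets here using limit and offset
    # request solr
    print(locals())
    # return response.json()  # once the Solr request above is filled in


def _request_solr_kwargs(kwargs):
    # pool.map passes each item as a single positional argument,
    # so unpack the dict into keyword arguments here.
    return request_solr(**kwargs)


if __name__ == "__main__":
    pool = multiprocessing.Pool(MAX)
    pool.map(_request_solr_kwargs, _get_pool_args())
    pool.close()
    pool.join()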
