
multiple output returned from python multiprocessing function

I am trying to use multiprocessing to return a list, but instead of waiting until all processes are done, I get several returns from one return statement in mp_factorizer, like this:

None
None
(returns list)

In this example I used 2 worker processes. If I used 5, there would be 5 None returns before the list is printed. Here is the code:

def mp_factorizer(nums, nprocs, objecttouse):
    if __name__ == '__main__':
        out_q = multiprocessing.Queue()
        chunksize = int(math.ceil(len(nums) / float(nprocs)))
        procs = []
        for i in range(nprocs):
            p = multiprocessing.Process(
                    target=worker,                   
                    args=(nums[chunksize * i:chunksize * (i + 1)],
                          out_q,
                    objecttouse))
            procs.append(p)
            p.start()

        # Collect all results into a single result dict. We know how many dicts
        # with results to expect.
        resultlist = []
        for i in range(nprocs):
            temp=out_q.get()
            index =0
            for i in temp:
                resultlist.append(temp[index][0][0:])
                index +=1

        # Wait for all worker processes to finish
        for p in procs:
            p.join()
            resultlist2 = [x for x in resultlist if x != []]
        return resultlist2

def worker(nums, out_q, objecttouse):
    """ The worker function, invoked in a process. 'nums' is a
        list of numbers to factor. The results are placed in
        a dictionary that's pushed to a queue.
    """
    outlist = []
    for n in nums:        
        outputlist=objecttouse.getevents(n)
        if outputlist:
            outlist.append(outputlist)   
    out_q.put(outlist)

mp_factorizer gets a list of items, the number of workers, and an object the workers should use. It splits the list so every worker gets an equal share, then starts the workers. Each worker uses the object to compute something from its sublist and adds the result to the queue. mp_factorizer is supposed to collect all results from the queue, merge them into one large list, and return that list. However, I get multiple returns.
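As an aside, the ceil-based chunking used in mp_factorizer can produce uneven (or even empty) chunks. A quick sketch of the split arithmetic, using a throwaway list of 10 numbers across 4 workers (my illustration, not from the original post):

```python
import math

# Hypothetical inputs: 10 items split across 4 workers
nums = list(range(10))
nprocs = 4

# Same arithmetic as in mp_factorizer above
chunksize = int(math.ceil(len(nums) / float(nprocs)))
chunks = [nums[chunksize * i:chunksize * (i + 1)] for i in range(nprocs)]

print(chunksize)  # 3
print(chunks)     # [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]] -- the last chunk is short
```

With 9 items and 4 workers the last chunk would be empty, which is why the original code filters out empty lists at the end.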

What am I doing wrong? Or is this expected behavior due to the strange way Windows handles multiprocessing? (Python 2.7.3, Windows 7 64-bit)

EDIT: The problem was the wrong placement of if __name__ == '__main__':. I found this out while working on another problem; see using multiprocessing in a sub process for a complete explanation.

if __name__ == '__main__' is in the wrong place. A quick fix would be to protect only the call to mp_factorizer, as Janne Karila suggested:

if __name__ == '__main__':
    print mp_factorizer(list, 2, someobject)

However, on Windows the main file is executed once at startup plus once for every worker process, in this case 2 more times. So the main module runs 3 times in total, excluding the part protected by if __name__ == '__main__'.
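This re-execution is easy to observe even without Windows. The sketch below (an illustration I added, Python 3 syntax) writes a small script to a temporary file and runs it with the "spawn" start method, which is the only start method available on Windows; the unguarded module-level print fires once in the parent and once per child:

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# A small script whose module level is NOT fully guarded.
script = textwrap.dedent("""
    import multiprocessing as mp

    print("module level")  # runs in the parent AND in each spawned child

    def worker(q):
        q.put("done")

    if __name__ == "__main__":
        mp.set_start_method("spawn")  # mimic Windows behaviour on any OS
        q = mp.Queue()
        procs = [mp.Process(target=worker, args=(q,)) for _ in range(2)]
        for p in procs:
            p.start()
        results = [q.get() for _ in procs]  # drain the queue before joining
        for p in procs:
            p.join()
        print("results:", results)
""")

with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
    f.write(script)
    path = f.name
try:
    out = subprocess.run([sys.executable, path],
                         capture_output=True, text=True, check=True).stdout
finally:
    os.remove(path)

count = out.count("module level")
print(count)  # 3: once for the parent plus once per worker process
```

The guarded block runs only in the parent, which is exactly why moving everything under the guard fixes the multiple-returns symptom.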

This can cause problems as soon as other computations are made in the same main module, and at the very least it unnecessarily slows things down. Even though only the worker function should be executed several times, on Windows everything that is not protected by if __name__ == '__main__' will be executed in every process.

So the solution is to protect the whole main process by executing all code only inside if __name__ == '__main__'.

If the worker function is in the same file, however, it needs to be excluded from this if statement, because otherwise the worker processes cannot find and call it when they re-import the module.

Pseudocode for the main module:

# Import stuff
if __name__ == '__main__':
    #execute whatever you want, it will only be executed 
    #as often as you intend it to
    #execute the function that starts multiprocessing, 
    #in this case mp_factorizer()
    #there is no worker function code here, it's in another file.

Even though the whole main process is protected, the worker function can still be started, as long as it is in another file.

Pseudocode for the main module, with the worker function in the same file:

# Import stuff
#If the worker code is in the main thread, exclude it from the if statement:
def worker():
    #worker code
if __name__ == '__main__':
    #execute whatever you want, it will only be executed 
    #as often as you intend it to
    #execute the function that starts multiprocessing, 
    #in this case mp_factorizer()
#All code outside of the if statement will be executed multiple times,
#once per assigned worker process.

For a longer explanation with runnable code, see using multiprocessing in a sub process.
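Putting this together, a corrected version of the original code might look like the sketch below (Python 3 syntax; since objecttouse is not shown in the question, its getevents call is replaced by a hypothetical stand-in that squares each number). The guard stays out of mp_factorizer entirely, and the queue is drained before joining:

```python
import math
import multiprocessing

def worker(nums, out_q):
    # Stand-in for objecttouse.getevents(n) from the question.
    out_q.put([n * n for n in nums])

def mp_factorizer(nums, nprocs):
    # No __name__ guard in here: the guard belongs at the call site.
    out_q = multiprocessing.Queue()
    chunksize = int(math.ceil(len(nums) / float(nprocs)))
    procs = []
    for i in range(nprocs):
        p = multiprocessing.Process(
            target=worker,
            args=(nums[chunksize * i:chunksize * (i + 1)], out_q))
        procs.append(p)
        p.start()

    # Drain the queue BEFORE joining, so large results cannot deadlock.
    resultlist = []
    for _ in range(nprocs):
        resultlist.extend(out_q.get())

    for p in procs:
        p.join()
    return resultlist

if __name__ == '__main__':
    print(mp_factorizer(list(range(10)), 2))
```

Because results arrive on the queue in whatever order the workers finish, the merged list is not guaranteed to preserve the input order.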

Your if __name__ == '__main__' statement is in the wrong place. Put it around the print statement to prevent the subprocesses from executing that line:

if __name__ == '__main__':
    print mp_factorizer(list, 2, someobject)

Now you have the if inside mp_factorizer, which makes the function return None when it is called inside a subprocess.
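The mechanics can be seen in miniature without any multiprocessing at all: a function whose return statement sits behind a false condition falls off the end of its body and returns None. A toy example I added for illustration:

```python
def guarded(flag):
    # Mirrors mp_factorizer: the return is reachable only when flag is true,
    # just as the original return was reachable only when __name__ == '__main__'.
    if flag:
        return [1, 2, 3]
    # no else branch: falling off the end returns None

print(guarded(True))   # [1, 2, 3]
print(guarded(False))  # None -- what each subprocess's call produced
```

On Windows each subprocess re-runs the unguarded print mp_factorizer(...) line with __name__ set to something other than '__main__', so each one prints None before the parent prints the real list.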

