How can I append to class variables using multiprocessing in python?

I have a program where everything is built into a class. One method runs 50 computations of another method, each with a different input, so I decided to use multiprocessing to speed it up. However, the list that should be returned at the end always comes back empty. Any ideas? Here is a simplified version of my problem. The output of main_function() should be a list containing the numbers 0-9, but the list comes back empty.

import multiprocessing

class MyClass(object):
    def __init__(self):
        self.arr = list()

    def helper_function(self, n):
        self.arr.append(n)

    def main_function(self):
        jobs = []

        for i in range(0,10):
            p = multiprocessing.Process(target=self.helper_function, args=(i,))
            jobs.append(p)
            p.start()

        # wait for every child process to finish
        for job in jobs:
            job.join()

        print(self.arr)  # expected the numbers 0-9, but prints []

arr is a plain list, and it is not shared across subprocesses: each child process works on its own copy of the object, so the appends never reach the list in the parent.

To share it, you have to use a Manager object to create a managed list that is aware of the fact that it's shared between processes.

The key is:

self.arr = multiprocessing.Manager().list()

Full working example:

import multiprocessing

class MyClass(object):
    def __init__(self):
        self.arr = multiprocessing.Manager().list()

    def helper_function(self, n):
        self.arr.append(n)

    def main_function(self):
        jobs = []

        for i in range(0,10):
            p = multiprocessing.Process(target=self.helper_function, args=(i,))
            jobs.append(p)
            p.start()

        for job in jobs:
            job.join()

        print(self.arr)

if __name__ == "__main__":
    a = MyClass()
    a.main_function()

This code now prints, for example: [7, 9, 2, 8, 6, 0, 4, 3, 1, 5] (the order varies from run to run, since it depends on which process appends first).
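To see the difference side by side, here is a minimal sketch that appends from child processes to both a plain list and a managed list. The names append_value and compare_lists are made up for this illustration and are not part of the answer above.

```python
import multiprocessing


def append_value(target, n):
    # Runs in a child process; `target` is whatever list-like object was passed in.
    target.append(n)


def compare_lists(count):
    """Append 0..count-1 from child processes to a plain and a managed list."""
    plain = []
    with multiprocessing.Manager() as manager:
        managed = manager.list()
        jobs = []
        for i in range(count):
            for target in (plain, managed):
                p = multiprocessing.Process(target=append_value, args=(target, i))
                jobs.append(p)
                p.start()
        for job in jobs:
            job.join()
        # Copy the proxy's contents before the manager shuts down.
        return plain, list(managed)


if __name__ == "__main__":
    plain, managed = compare_lists(5)
    print(plain)            # the parent's plain list was never touched
    print(sorted(managed))  # sorted, because arrival order varies
```

The plain list stays empty in the parent because every child appended to its own copy, while the managed list collects all the values through the manager's server process.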

multiprocessing is touchy.

For simple multiprocessing tasks, I would recommend multiprocessing.dummy, which offers the same Pool API backed by threads, so a plain list is shared between the workers:

from multiprocessing.dummy import Pool as ThreadPool


class MyClass(object):
    def __init__(self):
        self.arr = list()

    def helper_function(self, n):
        self.arr.append(n)

    def main_function(self):
        pool = ThreadPool(4)
        pool.map(self.helper_function, range(10))
        print(self.arr)


if __name__ == '__main__':
    c = MyClass()
    c.main_function()

The idea of using map instead of complicated multithreading calls is from one of my favorite blog posts: https://chriskiehl.com/article/parallelism-in-one-line
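One caveat: multiprocessing.dummy runs threads, and in CPython threads generally do not speed up CPU-bound work because of the global interpreter lock. If the 50 computations are CPU-bound, the same map pattern should also work with a real process pool, as long as the helper returns its result instead of appending to shared state. A rough sketch, where compute and run_all are hypothetical stand-ins for the real code:

```python
import multiprocessing


def compute(n):
    # Stand-in for the real computation; it returns its result
    # instead of appending to a shared list.
    return n * n


def run_all(inputs, workers=4):
    # Pool.map hands the inputs to worker processes and returns
    # the results in the same order as the inputs.
    with multiprocessing.Pool(workers) as pool:
        return pool.map(compute, inputs)


if __name__ == "__main__":
    print(run_all(range(10)))
```

Because the results travel back as return values, no Manager is needed, and the output order matches the input order.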
