
RawArray not modified by processes as shared memory for Python multiprocessing

I am working with Python multiprocessing, using Pool to start concurrent processes and RawArray to share an array between them. I do not need to synchronize access to the RawArray; that is, the array can be modified by any process at any time.

The test code for RawArray is (never mind what the program computes; it is just a test):

from multiprocessing.sharedctypes import RawArray
import multiprocessing as mp
import time

sieve = RawArray('i', (10 + 1) * [1])   # shared memory between processes

def foo_pool(x):
    time.sleep(0.2)
    sieve[x] = x * x  # modify the shared memory array -- seems not to work?
    return x * x

result_list = []

def log_result(result):
    result_list.append(result)

def apply_async_with_callback():
    pool = mp.Pool(processes=4)
    for i in range(10):
        pool.apply_async(foo_pool, args=(i,), callback=log_result)
    pool.close()
    pool.join()
    print(result_list)

    for x in sieve:
        print(x)  # !!! sieve is [1, 1, ..., 1]

if __name__ == '__main__':
    apply_async_with_callback()

The code does not work as expected; I have marked the key statements with comments. I have been stuck on this for a whole day. Any help or constructive advice would be much appreciated.

  • time.sleep fails because you did not import time
  • use sieve[x] = x*x to modify the array instead of sieve[x].value = x*x
  • on Windows, your code creates a new sieve in each subprocess. You need to pass a reference to the shared array, for example like this:

     def foo_init(s):
         global sieve
         sieve = s

     def apply_async_with_callback():
         pool = mp.Pool(processes=4, initializer=foo_init, initargs=(sieve,))

     if __name__ == '__main__':
         sieve = RawArray('i', (10 + 1)*[1])

You could use multithreading instead of multiprocessing, as threads natively share the memory of the main process.
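As a minimal sketch of that idea (the array size and worker function here are illustrative, not taken from the answer above), a plain list updated from several threads needs no shared-memory machinery at all:

```python
import threading

size = 11
sieve = [1] * size  # a plain list: threads share the main process's memory

def foo_thread(x):
    sieve[x] = x * x  # no copy is made; every thread sees the same list

threads = [threading.Thread(target=foo_thread, args=(i,)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sieve)  # indices 0-9 now hold their squares; index 10 is untouched
```

Each assignment here targets a distinct index, so no lock is needed; if threads wrote to the same slot, you would need a `threading.Lock` around the update.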

If you are worried about Python's GIL mechanism, you could look into Numba's nogil option.

Working version:

from multiprocessing import Pool, RawArray
import time


def foo_pool(x):
    sieve[x] = x * x  # modify the shared memory array.


def foo_init(s):
    global sieve
    sieve = s


def apply_async_with_callback(loc_size):
    with Pool(processes=4, initializer=foo_init, initargs=(sieve,)) as pool:
        pool.map(foo_pool, range(loc_size))

    for x in sieve:
        print(x)


if __name__ == '__main__':
    size = 50
    sieve = RawArray('i', size * [1])  # shared memory between processes
    apply_async_with_callback(size)
