
How to share array of objects in Python

I have a function in which I create a pool of processes. Moreover, I use multiprocessing.Value() and multiprocessing.Lock() to manage some shared values between the processes.

I want to do the same thing with an array of objects, so that it is shared between the processes, but I don't know how to do it. I will only read from that array.

This is the function:

import glob
import os
from multiprocessing import Value, Pool, Lock, cpu_count

def predict(matches_path, unknown_path, files_path, imtodetect_path, num_query_photos,
            use_top3, uid, workbook, excel_file_path, modelspath, email_address):

    shared_correct_matched_imgs = Value('i', 0)
    shared_unknown_matched_imgs = Value('i', 0)
    shared_tot_imgs = Value('i', 0)
    counter = Value('i', 0)
    shared_lock = Lock()
    num_workers = cpu_count()

    feature = load_feature(modelspath)

    pool = Pool(initializer=init_globals,
                initargs=[counter, shared_tot_imgs, shared_correct_matched_imgs,
                          shared_unknown_matched_imgs, shared_lock],
                processes=num_workers)

    # index and increment are initialised elsewhere (not shown here)
    for img in glob.glob(os.path.join(imtodetect_path, '*g')):
        pool.apply_async(predict_single_img,
                         (img, imtodetect_path, excel_file_path, files_path, use_top3, uid,
                          matches_path, unknown_path, num_query_photos, index, modelspath))
        index += increment
    
    pool.close()
    pool.join()

The array is created by the line feature = load_feature(modelspath). This is the array that I want to share.

In init_globals I initialize the shared values:

def init_globals(counter, shared_tot_imgs, shared_correct_matched_imgs, shared_unknown_matched_imgs, shared_lock):
    global cnt, tot_imgs, correct_matched_imgs, unknown_matched_imgs, lock
    cnt = counter
    tot_imgs = shared_tot_imgs
    correct_matched_imgs = shared_correct_matched_imgs
    unknown_matched_imgs = shared_unknown_matched_imgs
    lock = shared_lock

The easy way to provide shared static data is simply to make it a global variable accessible to the function you want to call. If you're using an operating system that supports "fork", it is very straightforward to use global variables in child processes as long as they're treated as constant (if you modify them, the changes won't be reflected in the other processes).

import multiprocessing as mp
from random import randint

shared = ['some', 'shared', 'data', f'{randint(0, 10**6)}']

def foo():
    print(' '.join(shared))

if __name__ == "__main__":
    mp.set_start_method("fork")
    #defining "shared" here would be valid also
    p = mp.Process(target=foo)
    p.start()
    p.join()
    
    print(' '.join(shared)) #same random number means "shared" is same object
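
Applied to the question's predict() function, with "fork" this means you can simply promote feature to a module-level global (or assign it as a global before the Pool is created) and read it directly inside predict_single_img. Here is a minimal, self-contained sketch of that idea; load_feature() and the body of predict_single_img below are stand-ins for illustration, not the asker's actual code:

import multiprocessing as mp

feature = None  # module-level global; the parent fills it in before forking

def load_feature():
    # stand-in for load_feature(modelspath); assumed to return a plain Python list
    return ['vec0', 'vec1', 'vec2']

def predict_single_img(img):
    # with "fork", the worker inherits the parent's memory, so "feature" is
    # already populated here; purely read-only access needs no Lock
    return f'{img} matched against {len(feature)} features'

if __name__ == "__main__":
    mp.set_start_method("fork")      # available on Linux/macOS, not Windows
    feature = load_feature()         # build the array BEFORE the pool forks

    with mp.Pool(processes=2) as pool:
        print(pool.map(predict_single_img, ['img1.jpg', 'img2.jpg']))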

This won't work when using "spawn" as the start method (the only one available on Windows), because the parent's memory is not shared with the child in any way; the child must "import" the main file to gain access to whatever the target function is (this is also why you can run into problems with decorators). If you define your data outside the if __name__ == "__main__": block, it will kind of work, but each process will end up with its own separate copy of the data, which can be undesirable if it's big, slow to create, or can change each time it's created.

import multiprocessing as mp
from random import randint

shared = ['some', 'shared', 'data', f'{randint(0, 10**6)}']

def foo():
    print(' '.join(shared))

if __name__ == "__main__":
    mp.set_start_method("spawn")

    p = mp.Process(target=foo)
    p.start()
    p.join()
    
    print(' '.join(shared)) #a different number means a different copy of "shared" (a one-in-a-million chance of matching, I guess...)
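
If you want something that works regardless of the start method, you can hand the array to the workers through the same initializer/initargs mechanism the question already uses for its Value and Lock objects: the array is pickled once per worker and stored in a worker-side global. A rough sketch along those lines (init_worker and the sample data are made up for illustration):

import multiprocessing as mp

def init_worker(shared_feature):
    # runs once in every worker: stash the read-only array in a worker-side global
    global feature
    feature = shared_feature

def predict_single_img(img):
    # each worker reads its own copy of "feature"; no Lock is needed for reads
    return f'{img}: {len(feature)} features loaded'

if __name__ == "__main__":
    mp.set_start_method("spawn")           # works the same with "fork"
    feature = ['vec0', 'vec1', 'vec2']     # stand-in for load_feature(modelspath)

    with mp.Pool(processes=2,
                 initializer=init_worker,
                 initargs=(feature,)) as pool:
        print(pool.map(predict_single_img, ['img1.jpg', 'img2.jpg']))

The cost is one pickled copy of the data per worker, which is usually fine for read-only data of moderate size; for very large arrays, multiprocessing.shared_memory or a multiprocessing.Array would avoid the extra copies.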
