
How to loop in parallel to alter objects inside a dictionary in Python?

Here is a minimal example of my problem:

import concurrent.futures
from functools import partial

# Object class
class obj:
    def __init__(self,tup):
        self.tup = tup

# Function includes attributes in objects of class above
def new(fi,fdic):
    fdic[fi].new = 'work_'+ str(fdic[fi].tup)
    
# Dictionary full of instances of obj above    
dic = {'a':obj(1),
       'b':obj(2),
       'c':obj(3),
       'd':obj(4),
       'e':obj(5),
       'f':obj(6),
      }

partial_new = partial(new, fdic=dic)

Now I want to process all the objects in the dictionary in parallel (in reality I have far too many of them). The code below runs, but it does not do what I need, because I think I actually need a ProcessPoolExecutor, since I want the work to happen in parallel.

with concurrent.futures.ThreadPoolExecutor() as executor:
    for _ in executor.map(partial_new, dic.keys()):
        pass
print(dic['b'].new)

This one does not run:

with concurrent.futures.ProcessPoolExecutor() as executor:
    for _ in executor.map(partial_new, dic.keys()):
        pass
print(dic['b'].new)

My question is: How do I make this work?

I just need to use the function to modify all the objects inside the dictionary in parallel. Later I will save the full dictionary, but the function that I apply does not return anything (if this makes things easier).

You can use ThreadPool from the multiprocessing.pool module in the following way:

  1. Create a list of the dict keys ( ls = list(dic.keys()) )
  2. Define a function that, given a reference to the dict and a key, makes the alteration you want
  3. Use ThreadPool's starmap() method to run that function on the list you created and the dict
  4. Close and then join the thread pool
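The steps above could be sketched roughly as follows. The names Obj, add_new and keys are made up for this illustration, and the pool size of 4 is arbitrary. Note that threads in standard CPython share the GIL, so this helps mainly when the work is I/O-bound rather than CPU-bound.

```python
from multiprocessing.pool import ThreadPool


class Obj:
    """Stand-in for the question's `obj` class (hypothetical name)."""
    def __init__(self, tup):
        self.tup = tup


def add_new(key, objects):
    # Threads share memory, so this mutation is visible to the caller.
    objects[key].new = 'work_' + str(objects[key].tup)


dic = {'a': Obj(1), 'b': Obj(2), 'c': Obj(3)}

# Step 1: a list of the dict keys.
keys = list(dic.keys())

# Steps 3 and 4: run the function over (key, dict) pairs, then shut down.
pool = ThreadPool(4)  # pool size chosen arbitrarily for the sketch
try:
    pool.starmap(add_new, [(key, dic) for key in keys])
finally:
    pool.close()  # no more tasks will be submitted
    pool.join()   # wait for the worker threads to exit

print(dic['b'].new)
```

Because every thread works on the same dictionary object, nothing has to be returned from add_new.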

Is the issue that it takes a long time to calculate the new value?问题是计算新值需要很长时间吗?

def get_new_value(dictionary_item):
    key, value = dictionary_item
    return key, 'work_' + str(value.tup)

with concurrent.futures.ProcessPoolExecutor() as executor:
    for key, new_value in executor.map(get_new_value, dic.items()):
        dic[key].new = new_value

Only one process, the original one, can modify dic. But you can pass the key and value to a worker, have the worker return the key and the new value, and then have the original process do the work of updating the dictionary.
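One way to see why a change made inside a worker process never reaches the original dictionary: arguments sent to a worker are pickled, so the worker only ever holds a copy. The sketch below imitates that with pickle directly and starts no pool at all. SimpleNamespace stands in for the question's obj class, and worker_copy is a name invented for this illustration.

```python
import pickle
from types import SimpleNamespace

# SimpleNamespace stands in for the question's `obj` class (an assumption
# made only to keep the sketch short).
dic = {'a': SimpleNamespace(tup=1), 'b': SimpleNamespace(tup=2)}

# Roughly what a worker process receives: a pickled-then-unpickled copy.
worker_copy = pickle.loads(pickle.dumps(dic))
worker_copy['b'].new = 'work_' + str(worker_copy['b'].tup)

print(hasattr(worker_copy['b'], 'new'))  # True: the copy was modified
print(hasattr(dic['b'], 'new'))          # False: the original was not
```

This is why the worker has to return its result and the original process has to apply it.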

You'll probably want to specify a chunksize for map.
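A minimal sketch of passing chunksize, assuming a dictionary with many small items. The value 100 is only a placeholder to tune for your own data, and Obj and update_all are hypothetical names. With ProcessPoolExecutor, chunksize sends items to the workers in batches instead of one at a time, which usually cuts the inter-process overhead when each item is cheap to compute.

```python
import concurrent.futures


class Obj:
    """Stand-in for the question's `obj` class (hypothetical name)."""
    def __init__(self, tup):
        self.tup = tup


def get_new_value(dictionary_item):
    key, value = dictionary_item
    return key, 'work_' + str(value.tup)


def update_all(objects, chunksize=100):
    # chunksize=100 is an arbitrary starting point, not a recommendation.
    with concurrent.futures.ProcessPoolExecutor() as executor:
        results = executor.map(get_new_value, objects.items(),
                               chunksize=chunksize)
        # Only the parent process touches the dictionary.
        for key, new_value in results:
            objects[key].new = new_value


if __name__ == '__main__':
    dic = {str(i): Obj(i) for i in range(1000)}
    update_all(dic, chunksize=100)
    print(dic['42'].new)
```

The chunksize argument has no effect with ThreadPoolExecutor, so it only matters for the process-based version.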

=== edited ===

As promised, my complete file.

import concurrent.futures

# Object class
class obj:
    def __init__(self, tup):
        self.tup = tup


# Dictionary full of instances of obj above
dic = {'a': obj(1),
       'b': obj(2),
       'c': obj(3),
       'd': obj(4),
       'e': obj(5),
       'f': obj(6),
       }


def get_new_value(dictionary_item):
    key, value = dictionary_item
    return key, 'work_' + str(value.tup)

def go():
    with concurrent.futures.ProcessPoolExecutor() as executor:
        for key, new_value in executor.map(get_new_value, dic.items()):
            dic[key].new = new_value
    # Make sure it really worked!
    for key, value in dic.items():
        print(key, value.new)


if __name__ == '__main__':
    go()


 