I have a function that I map multiple worker threads to execute. I would like each thread to maintain its own dictionary to write results from the function and finally, individually write the contents of each dictionary to seperate files. I would like to know how to accomplish this in Python. I found no clear way to assign an object to a single thread. The documentation is only covering information related to sharing memory between threads (with Manager objects). Following is the code I am using for the job (except for the current_thread_dict in worker, currently I use a Manager.dict which every thread use):
from multiprocessing import Pool
def worker(row):
#add items to current_thread_dict
with Pool(processes=16) as p:
results = p.map(worker, rows, chunksize=1)
Additionally to the row you could pass an index to each worker so that it knows in which file it has to save the dict. I implemented a simple example as follows:
from multiprocessing import Pool
import json
def worker(args):
i, row = args
# create object
current_thread_dict = {
'row': row,
'ind': i,
}
# add stuff to dict
if not row.startswith('data'):
current_thread_dict['special'] = row.upper()
else:
current_thread_dict['special'] = None
current_thread_dict['summary'] = '{row} - {ind} (special: {special})'\
.format(**current_thread_dict)
# save dict
with open(f'output{i}.json', 'w') as f:
json.dump(current_thread_dict, f, indent=2)
def main():
rows = ['data1', 'wow', 'data3', 'banana']
with Pool(processes=16) as p:
results = p.map(worker, enumerate(rows), chunksize=1)
print('results:', results)
if __name__ == '__main__':
main()
I hope this was helpful!
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.