
Multiprocessing in Python - how to use it in these loops?

The code below handles a huge amount of data, and I would like to know how I can use Python's multiprocessing module to process it in parallel and speed things up. Any help is appreciated.

import itertools
import operator

def code_counter(patients, codes):
    # Group consecutive records by patient ID, then count how often
    # each code appears within that patient's records.
    for key, group in itertools.groupby(patients, key=operator.itemgetter('ID')):
        group_codes = [item['CODE'] for item in group]
        yield [group_codes.count(code) for code in codes]

pats = []
for chunk in code_counter(patients, codes):
    pats.append(chunk)

I think your problem resides in the use of yield: you can't yield data directly from different processes. As I understand it, you use yield because loading all of the data at once would overload the RAM.

Maybe you can take a look at the multiprocessing Queue: http://docs.python.org/2/library/multiprocessing.html#exchanging-objects-between-processes
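A minimal sketch of the producer/consumer pattern that page describes (Python 3; the worker function and the doubling task are just placeholders for illustration): the parent puts items on one Queue, a worker process consumes them until it sees a None sentinel, and sends results back on a second Queue.

    from multiprocessing import Process, Queue

    def double_worker(in_q, out_q):
        # Pull items until the None sentinel arrives, process each one
        # (here: double it), and push the result back to the parent.
        while True:
            item = in_q.get()
            if item is None:
                break
            out_q.put(item * 2)

    if __name__ == '__main__':
        in_q, out_q = Queue(), Queue()
        p = Process(target=double_worker, args=(in_q, out_q))
        p.start()
        for i in [1, 2, 3]:
            in_q.put(i)
        in_q.put(None)  # sentinel: tell the worker to stop
        p.join()
        print([out_q.get() for _ in range(3)])  # [2, 4, 6]

Because items stream through the Queue one at a time, memory stays bounded, which fits the yield-based approach in the question.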

I didn't fully understand what you are trying to do with your code, so I can't give a precise example.
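That said, one common way to parallelize this kind of per-group counting is a multiprocessing.Pool: group the records sequentially in the parent (groupby needs ordered input), then fan the counting work out to worker processes with imap, which streams results back without materializing everything at once. A sketch under those assumptions (the names code_counter_parallel and count_codes are made up for illustration, not from the question):

    import itertools
    import operator
    from multiprocessing import Pool

    def count_codes(args):
        # Worker: count occurrences of each code within one patient's records.
        group_codes, codes = args
        return [group_codes.count(code) for code in codes]

    def code_counter_parallel(patients, codes, processes=4):
        # Grouping stays in the parent; only the counting is parallelized.
        grouped = (
            ([item['CODE'] for item in group], codes)
            for _, group in itertools.groupby(patients, key=operator.itemgetter('ID'))
        )
        with Pool(processes) as pool:
            # imap keeps memory bounded: results arrive as workers finish.
            yield from pool.imap(count_codes, grouped, chunksize=64)

    if __name__ == '__main__':
        patients = [
            {'ID': 1, 'CODE': 'A'}, {'ID': 1, 'CODE': 'B'},
            {'ID': 2, 'CODE': 'A'}, {'ID': 2, 'CODE': 'A'},
        ]
        codes = ['A', 'B']
        print(list(code_counter_parallel(patients, codes, processes=2)))

Note that this only pays off when the per-patient work is heavy enough to outweigh the cost of pickling each group to a worker; for trivially small groups the serial version may well be faster.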

