
Multiprocessing in Python - how to use it in these loops?

The code below handles a huge amount of data, and I would like to know how I can use Python's multiprocessing module to process it in parallel and speed things up. Any help is appreciated.

import itertools
import operator

def code_counter(patients, codes):
    # Group consecutive records by patient ID, then count how often
    # each code appears within that patient's records.
    for key, group in itertools.groupby(patients, key=operator.itemgetter('ID')):
        group_codes = [item['CODE'] for item in group]
        yield [group_codes.count(code) for code in codes]

pats = []
for chunk in code_counter(patients, codes):
    pats.append(chunk)

I think your problem resides in the use of yield: you can't yield data directly from different processes. As I understand it, you use yield because loading all of the data at once would overload the RAM.

Maybe you can take a look at the multiprocessing Queue: http://docs.python.org/2/library/multiprocessing.html#exchanging-objects-between-processes
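A minimal sketch of the producer/consumer pattern that page describes (Python 3; the worker function and the doubling task are just placeholders for illustration): the parent puts items on one Queue, a worker process consumes them until it sees a None sentinel, and sends results back on a second Queue.

    from multiprocessing import Process, Queue

    def double_worker(in_q, out_q):
        # Pull items until the None sentinel arrives, process each one
        # (here: double it), and push the result back to the parent.
        while True:
            item = in_q.get()
            if item is None:
                break
            out_q.put(item * 2)

    if __name__ == '__main__':
        in_q, out_q = Queue(), Queue()
        p = Process(target=double_worker, args=(in_q, out_q))
        p.start()
        for i in [1, 2, 3]:
            in_q.put(i)
        in_q.put(None)  # sentinel: tell the worker to stop
        p.join()
        print([out_q.get() for _ in range(3)])  # [2, 4, 6]

Because items stream through the Queue one at a time, memory stays bounded, which fits the yield-based approach in the question.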

I didn't fully understand what you are trying to do with your code, so I can't give a precise example.
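That said, one common way to parallelize this kind of per-group counting is a multiprocessing.Pool: group the records sequentially in the parent (groupby needs ordered input), then fan the counting work out to worker processes with imap, which streams results back without materializing everything at once. A sketch under those assumptions (the names code_counter_parallel and count_codes are made up for illustration, not from the question):

    import itertools
    import operator
    from multiprocessing import Pool

    def count_codes(args):
        # Worker: count occurrences of each code within one patient's records.
        group_codes, codes = args
        return [group_codes.count(code) for code in codes]

    def code_counter_parallel(patients, codes, processes=4):
        # Grouping stays in the parent; only the counting is parallelized.
        grouped = (
            ([item['CODE'] for item in group], codes)
            for _, group in itertools.groupby(patients, key=operator.itemgetter('ID'))
        )
        with Pool(processes) as pool:
            # imap keeps memory bounded: results arrive as workers finish.
            yield from pool.imap(count_codes, grouped, chunksize=64)

    if __name__ == '__main__':
        patients = [
            {'ID': 1, 'CODE': 'A'}, {'ID': 1, 'CODE': 'B'},
            {'ID': 2, 'CODE': 'A'}, {'ID': 2, 'CODE': 'A'},
        ]
        codes = ['A', 'B']
        print(list(code_counter_parallel(patients, codes, processes=2)))

Note that this only pays off when the per-patient work is heavy enough to outweigh the cost of pickling each group to a worker; for trivially small groups the serial version may well be faster.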

