Use multiprocessing to implement a function in Python

Question

I am using a function that takes too long to finish because it receives a large input and uses two nested for loops.

The code of the function:

def transform(self, X):
    global brands
    result = []
    for x in X:
        index = 0
        count = 0
        for brand in brands:
            all_matches = re.findall(re.escape(brand), x, flags=re.I)
            count_all_match = len(all_matches)
            if count_all_match > count:
                count = count_all_match
                index = brands.index(brand)

        result.append([index])
    return np.array(result)

So how can I change this function to use multiprocessing and reduce the running time?

Answer 1

I don't see any use of self in the transform method, so I turned it into a plain function.

import re
import numpy as np

from concurrent.futures import ProcessPoolExecutor

def transformer(x):
    # brands must be defined at module level so each worker process can see it.
    index = 0
    count = 0
    for i, brand in enumerate(brands):
        # Count case-insensitive occurrences of this brand in x.
        count_all_match = len(re.findall(re.escape(brand), x, flags=re.I))
        if count_all_match > count:
            count = count_all_match
            index = i
    return [index]

def transform(X):
    # Distribute the elements of X across a pool of worker processes.
    with ProcessPoolExecutor() as executor:
        result = list(executor.map(transformer, X))
    return np.array(result)
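
For reference, here is a self-contained sketch of the same idea that can be run as a script. It rests on a few assumptions that are not in the original post: the BRANDS list is made up, the names best_brand_index and transform_parallel are only illustrative, and the chunksize default is an arbitrary starting point. The patterns are compiled once at import time, and the entry point sits behind a __main__ guard, which platforms that start workers with "spawn" (such as Windows) need. To stay within the standard library it returns a plain list; wrapping the result in np.array(...) gives the same return type as the code above.

```python
import re
from concurrent.futures import ProcessPoolExecutor

# Hypothetical brand list; the original post does not show its contents.
BRANDS = ["acme", "globex", "initech"]

# Compile each escaped, case-insensitive pattern once at import time,
# so every worker process reuses them instead of rebuilding them per call.
PATTERNS = [re.compile(re.escape(brand), flags=re.I) for brand in BRANDS]


def best_brand_index(text):
    """Return [i], where i is the index of the brand found most often in text."""
    best_index, best_count = 0, 0
    for i, pattern in enumerate(PATTERNS):
        n = len(pattern.findall(text))
        if n > best_count:  # strict '>' keeps the earliest brand on ties
            best_index, best_count = i, n
    return [best_index]


def transform_parallel(X, chunksize=256):
    """Apply best_brand_index to every item of X using worker processes."""
    with ProcessPoolExecutor() as executor:
        # chunksize batches items per task to reduce inter-process overhead.
        return list(executor.map(best_brand_index, X, chunksize=chunksize))


if __name__ == "__main__":
    sample = ["Acme and ACME again", "nothing here", "initech globex initech"]
    print(transform_parallel(sample, chunksize=1))
```

Whether this beats the single-process loop depends on the input size: for short lists the cost of starting processes and pickling the strings can outweigh the gain, so it is worth timing both versions on real data.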
