
Use multiprocessing to speed up a function in Python

I have a function that takes too long to finish, because it receives a large input and runs two nested for loops.

The code of the function:

def transform(self, X):
    global brands
    result = []
    for x in X:
        index = 0
        count = 0
        for brand in brands:
            all_matches = re.findall(re.escape(brand), x, flags=re.I)
            count_all_match = len(all_matches)
            if count_all_match > count:
                count = count_all_match
                index = brands.index(brand)

        result.append([index])
    return np.array(result)

How can I change this function so that it uses multiprocessing to reduce the running time?

I don't see self being used in the transform method, so I turned the inner loop into a plain module-level function that handles a single input string.

import re
import numpy as np

from concurrent.futures import ProcessPoolExecutor

def transformer(x):
    # brands must be a module-level list so that every worker process can see it
    global brands
    index = 0
    count = 0
    for i, brand in enumerate(brands):
        # count the case-insensitive occurrences of this brand in x
        count_all_match = len(re.findall(re.escape(brand), x, flags=re.I))
        if count_all_match > count:
            count = count_all_match
            index = i
    return [index]

def transform(X):
    # each element of X is handled by a worker process; results keep the order of X
    with ProcessPoolExecutor() as executor:
        result = list(executor.map(transformer, X))
    return np.array(result)
