
Python sklearn and multiprocessing

I'm trying to parallelise the training of sklearn classifiers (a Gaussian mixture model in this case) using multiprocessing, and I get much worse classifiers than when I train them sequentially. In addition, the results differ after every training run, as if the code were not thread safe. Can anyone explain what is going on? Here is the code, with the process function at the end:

from multiprocessing import Manager, Process, Semaphore
from sklearn import mixture

nrProc = 8
semaphore = Semaphore(nrProc)
m = Manager()
models = m.list()
modelsOut = m.list()
processes = []   

cnt = 0                
for event_label in data_positive:                        
    models.append(mixture.GMM(**classifier_params))  
    models.append(mixture.GMM(**classifier_params))

for event_label in data_positive:
    if classifier_method == 'gmm':                        
        processes.append(Process(target=trainProcess, args=(models[cnt], data_positive[event_label], semaphore, modelsOut)))
        cnt = cnt + 1                        
        processes.append(Process(target=trainProcess, args=(models[cnt], data_negative[event_label], semaphore, modelsOut)))
        cnt = cnt + 1
    else:
        raise ValueError("Unknown classifier method ["+classifier_method+"]")

for proc in processes:
    proc.start()

for proc in processes:
    proc.join()


cnt = 0                
for event_label in data_positive:
    model_container['models'][event_label] = {}
    model_container['models'][event_label]['positive'] = modelsOut[cnt]
    cnt = cnt + 1
    model_container['models'][event_label]['negative'] = modelsOut[cnt]
    cnt = cnt + 1

def trainProcess(model, data, semaphore, modelsOut):
    semaphore.acquire()    
    modelsOut.append(model.fit(data))
    semaphore.release()
    return 0

So the solution is to use sklearn's clone function, which makes a deep copy of the estimator.
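A minimal sketch of how that could look, not the original poster's final code. It assumes a recent scikit-learn, where mixture.GMM has been replaced by GaussianMixture, and it swaps the Process/Semaphore/Manager setup for a multiprocessing.Pool. The names fit_one and train_all, the (event_label, 'positive'/'negative') keys and the random data are hypothetical, chosen only for illustration. Each worker clones the template so it fits a fresh, unfitted estimator, and it returns its key together with the model, so the result does not depend on the order in which processes finish (the modelsOut.append in the question appears to rely on that order).

```python
from multiprocessing import Pool

import numpy as np
from sklearn.base import clone
from sklearn.mixture import GaussianMixture


def fit_one(task):
    """Fit a fresh copy of the template estimator on one dataset."""
    key, template, data = task
    model = clone(template)  # new unfitted estimator with the same parameters
    model.fit(data)
    return key, model        # keep the key attached to the fitted model


def train_all(datasets, template, n_procs=8):
    """Fit one model per entry of `datasets` (hypothetical: key -> 2-D array)."""
    tasks = [(key, template, data) for key, data in datasets.items()]
    # The pool size plays the role of the semaphore in the question.
    with Pool(processes=n_procs) as pool:
        return dict(pool.map(fit_one, tasks))


if __name__ == "__main__":
    # Synthetic data, for illustration only.
    rng = np.random.default_rng(0)
    datasets = {
        ("event_a", "positive"): rng.normal(0.0, 1.0, size=(200, 2)),
        ("event_a", "negative"): rng.normal(3.0, 1.0, size=(200, 2)),
    }
    template = GaussianMixture(n_components=2, random_state=0)
    models = train_all(datasets, template, n_procs=2)
    print(sorted(models))
```

The returned dictionary can then be copied into model_container['models'][event_label]['positive'] and ['negative'] by key instead of by a running counter.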
