简体   繁体   中英

How to parallelise .predict() method of a scikit-learn SVM (SVC) Classifier?

I recently came across a requirement that I have a .fit() trained scikit-learn SVC Classifier instance and need to .predict() lots of instances.

Is there a way to parallelise only this .predict() method by any scikit-learn built-in tools?

from sklearn import svm

data_train = [[0,2,3],[1,2,3],[4,2,3]]
targets_train = [0,1,0]

clf = svm.SVC(kernel='rbf', degree=3, C=10, gamma=0.3, probability=True)
clf.fit(data_train, targets_train)

# this can be very large (~ a million records)
to_be_predicted = [[1,3,4]]
clf.predict(to_be_predicted)

If somebody does know a solution, I will be more than happy if you could share it.

This may be buggy, but something like this should do the trick. Basically, break your data into blocks and run your model on each block separately in a joblib.Parallel loop.

from sklearn.externals.joblib import Parallel, delayed

n_cores = 2
n_samples = to_be_predicted.shape[0]
slices = [
    (n_samples*i/n_cores, n_samples*(i+1)/n_cores))
    for i in range(n_cores)
    ]

results = np.vstack( Parallel( n_jobs = n_cores )( 
    delayed(clf.predict)( to_be_predicted[slices[i_core][0]:slices[i_core][1]
    for i_core in range(n_cores)
    ))

Working example from above...

from joblib import Parallel, delayed
from sklearn import svm

data_train = [[0,2,3],[1,2,3],[4,2,3]]
targets_train = [0,1,0]

clf = svm.SVC(kernel='rbf', degree=3, C=10, gamma=0.3, probability=True)
clf.fit(data_train, targets_train)

to_be_predicted = np.array([[1,3,4], [1,3,4], [1,3,5]])
clf.predict(to_be_predicted)

n_cores = 3

parallel = Parallel(n_jobs=n_cores)
results = parallel(delayed(clf.predict)(to_be_predicted[i].reshape(-1,3))
    for i in range(n_cores))

np.vstack(results).flatten()
array([1, 1, 0])

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM