简体   繁体   中英

How to deal with convergence warning when using LinearSVC in sklearn?

I got a convergence warning using linear support vector machine in Scikit learn with breast cancer data.

Below is the code:

from sklearn.svm import LinearSVC
from sklearn.datasets import load_breast_cancer

cancer = load_breast_cancer()
(X_cancer, y_cancer) = load_breast_cancer(return_X_y = True)
X_train, X_test, y_train, y_test = train_test_split(X_cancer, y_cancer, random_state = 0)

clf = LinearSVC(max_iter=700000).fit(X_train, y_train)
print('Breast cancer dataset')
print('Accuracy of Linear SVC classifier on training set: {:.2f}'
     .format(clf.score(X_train, y_train)))
print('Accuracy of Linear SVC classifier on test set: {:.2f}'
     .format(clf.score(X_test, y_test)))

Even with ultra high number of iterations, I still got Convergence Warning:

ConvergenceWarning: Liblinear failed to converge, increase the number of iterations. warnings.warn("Liblinear failed to converge, increase "

Can any one explain why it cannot converge? And, generally, can I just ignore Convergence warning, or do I need to further tune the model?

Thank you very much!

svm methods are distance based, and your columns are on different scales. so it makes sense to scale the data first before fitting the model. See more at post such as this or this

So if we do it again with scaling:

from sklearn.svm import LinearSVC
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

cancer = load_breast_cancer()
(X_cancer, y_cancer) = load_breast_cancer(return_X_y = True)
X_cancer = StandardScaler().fit_transform(X_cancer)

X_train, X_test, y_train, y_test = train_test_split(X_cancer, y_cancer, random_state = 0)

clf = LinearSVC().fit(X_train, y_train)

You get a pretty good accuracy without the convergence issue:

print('Accuracy of Linear SVC classifier on training set: {:.2f}'
     .format(clf.score(X_train, y_train)))
print('Accuracy of Linear SVC classifier on test set: {:.2f}'
     .format(clf.score(X_test, y_test)))

Accuracy of Linear SVC classifier on training set: 0.99
Accuracy of Linear SVC classifier on test set: 0.94

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM