Train an SVM (Support Vector Machine) classifier with Scikit-learn
I want to train several different classifiers with Scikit-learn on a multi-label classification problem, using the following code:
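    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.svm import SVC
    from sklearn.gaussian_process import GaussianProcessClassifier
    from sklearn.gaussian_process.kernels import RBF
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
    from sklearn.neural_network import MLPClassifier
    from sklearn.naive_bayes import GaussianNB
    from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

    names = [
        "Nearest Neighbors",
        "Linear SVM", "RBF SVM", "Gaussian Process",
        "Decision Tree", "Random Forest", "Neural Net", "AdaBoost",
        "Naive Bayes", "QDA"]

    classifiers = [
        KNeighborsClassifier(3),
        SVC(C=0.025),
        SVC(gamma=2, C=1),
        GaussianProcessClassifier(1.0 * RBF(1.0)),
        DecisionTreeClassifier(max_depth=5),
        RandomForestClassifier(max_depth=5),
        MLPClassifier(alpha=0.5),
        AdaBoostClassifier(),
        GaussianNB(),
        QuadraticDiscriminantAnalysis()]

    # Y_train is a 2-D label matrix with shape (9280, 39)
    for name, clf in zip(names, classifiers):
        clf.fit(X_train, Y_train)
        # Evaluate on the held-out split (X_test is assumed to exist alongside Y_test)
        score = clf.score(X_test, Y_test)
        print(name, score)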
The KNeighbors classifier works properly, but when the loop reaches the SVM classifier it throws the following exception:
    Traceback (most recent call last):
      File "/Users/mac/PycharmProjects/GraphLstm/classifier.py", line 87, in <module>
        clf.fit(X_train, Y_train)
      File "/Library/Python/2.7/site-packages/sklearn/svm/base.py", line 151, in fit
        X, y = check_X_y(X, y, dtype=np.float64, order='C', accept_sparse='csr')
      File "/Library/Python/2.7/site-packages/sklearn/utils/validation.py", line 526, in check_X_y
        y = column_or_1d(y, warn=True)
      File "/Library/Python/2.7/site-packages/sklearn/utils/validation.py", line 562, in column_or_1d
        raise ValueError("bad input shape {0}".format(shape))
    ValueError: bad input shape (9280, 39)
What is the reason, and how can I fix it?
Edit: As @Vivek commented, only the following classifiers support multi-label classification:
sklearn.tree.DecisionTreeClassifier
sklearn.tree.ExtraTreeClassifier
sklearn.ensemble.ExtraTreesClassifier
sklearn.neighbors.KNeighborsClassifier
sklearn.neural_network.MLPClassifier
sklearn.neighbors.RadiusNeighborsClassifier
sklearn.ensemble.RandomForestClassifier
sklearn.linear_model.RidgeClassifierCV
The fit function of the KNN classifier accepts a matrix as the y value. The SVM does not allow this. The error message is hinting at the disallowed shape of y: (9280, 39) is a 2-D array, whereas SVC expects a 1-D array of labels.
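To make the difference concrete, here is a minimal sketch on random toy data (the array sizes and variable names are made up for illustration and are not the asker's data). It fits both estimators on a 2-D label matrix and records what happens:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Toy data (assumption): 20 samples, 4 features, 3 binary labels per sample
rng = np.random.RandomState(0)
X = rng.rand(20, 4)
Y = rng.randint(0, 2, size=(20, 3))

# KNN accepts the 2-D label matrix and predicts one column per label
knn = KNeighborsClassifier(n_neighbors=3).fit(X, Y)
knn_pred_shape = knn.predict(X).shape

# SVC validates y as a 1-D array, so the same call raises ValueError
try:
    SVC().fit(X, Y)
    svc_error = None
except ValueError as exc:
    svc_error = str(exc)

print("KNN prediction shape:", knn_pred_shape)
print("SVC error:", svc_error)
```

The exact wording of the SVC error depends on the scikit-learn version, but the cause is the same 2-D shape of y.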
Since this is a multi-label classification problem, not all estimators in scikit-learn can handle it natively. The documentation provides a list of estimators that support multi-label out of the box, such as various tree-based estimators and others:
sklearn.tree.DecisionTreeClassifier
sklearn.tree.ExtraTreeClassifier
sklearn.ensemble.ExtraTreesClassifier
sklearn.neighbors.KNeighborsClassifier
...
...
However, there are strategies such as one-vs-all (one-vs-rest) that can be used to train an estimator that doesn't support multi-label directly. Scikit-learn's OneVsRestClassifier is made for this. See the OneVsRestClassifier documentation for more details.
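As a rough sketch of the one-vs-rest approach, the snippet below wraps an SVC in OneVsRestClassifier and fits it on a random binary label matrix. The toy data and its sizes are assumptions for illustration only; the SVC parameters are copied from the "RBF SVM" entry in the question.

```python
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

# Toy multi-label data (assumption): 60 samples, 5 features, 4 binary labels
rng = np.random.RandomState(0)
X_train = rng.rand(60, 5)
Y_train = rng.randint(0, 2, size=(60, 4))

# One independent binary SVC is trained per label column
clf = OneVsRestClassifier(SVC(gamma=2, C=1))
clf.fit(X_train, Y_train)

# Predictions come back as a binary indicator matrix, one column per label
Y_pred = clf.predict(X_train)
print(Y_pred.shape)
```

In the question's loop, the same wrapper could be applied to each estimator that rejects a 2-D y. Note that for multi-label targets, score reports subset accuracy, which counts a sample as correct only when every one of its labels is predicted correctly.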