
How can I raise the accuracy of a support vector machine in Python?

I've been trying to fit some data and predict on it, using the SVC class in sklearn for training. My problem is that the data is complicated and I don't know how to classify it. I'm uploading a 3D figure here. The dataset has about 800 rows and 3 columns. I used gamma=100 and C=10.0, and after splitting the dataset and testing I got accuracies between 61.0 and 64.0 percent, but I think I can do better than that. I set the kernel to 'rbf', and after some tests I concluded that 'rbf' is a good choice. But after reading the SVM documentation here and the kernel functions here, I got confused. Here are my questions:

1. Which kernel should I use (based on the dataset uploaded here)?
2. What other parameters should I change for the classification task?

Please help me get good accuracy. Here is my dataset:

import numpy as np
from sklearn import svm
from sklearn.model_selection import train_test_split

# X (features) and labels are assumed to be loaded already; the dataset is not shown.
model = svm.SVC(C=1.0, gamma=100, kernel='rbf')
X_train, X_test, y_train, y_test = train_test_split(X, labels)
model.fit(X_train, y_train)
predictions = model.predict(X_test)
print(predictions)
# Accuracy: fraction of test labels that were predicted correctly
print('\n\n\n', y_test, '\n\n\n', (np.array(y_test) == predictions).mean())

[Figure: 3D plot of the dataset]

Just a note: you did not actually provide any dataset, only the source code.

Trying a different kernel seems like a good idea. From that image alone it is really hard to say which kernel will perform better than the others. The choice of kernel usually requires some intuition or domain knowledge, so it's hard to say offhand.

Since scikit-learn's SVC has only 4 built-in kernels ('linear', 'poly', 'rbf', 'sigmoid'), I think you should just try all of them and compare them, perhaps using cross-validation, to see which performs best. Some of the kernels are parametrized; for the polynomial kernel you can try several degrees, up to about 10. Using a degree higher than 10 probably won't help, but that's just my guess.
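A minimal sketch of that comparison follows. Because the actual dataset was not posted, it uses synthetic stand-in data from `make_classification` with the same shape as described in the question (800 rows, 3 columns); that data, the 5-fold setting, and the degrees tried are assumptions for illustration, so the printed scores say nothing about the real data.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Stand-in data (assumption): 800 rows and 3 features, matching the shape in the question.
X, labels = make_classification(
    n_samples=800, n_features=3, n_informative=3, n_redundant=0, random_state=0
)

# One candidate per built-in kernel, plus a few polynomial degrees.
candidates = {
    "linear": SVC(kernel="linear"),
    "rbf": SVC(kernel="rbf"),
    "sigmoid": SVC(kernel="sigmoid"),
}
for degree in (2, 3, 4):
    candidates[f"poly(degree={degree})"] = SVC(kernel="poly", degree=degree)

# Mean 5-fold cross-validated accuracy for each candidate.
scores = {
    name: cross_val_score(model, X, labels, cv=5).mean()
    for name, model in candidates.items()
}

# Print the candidates from best to worst.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:16s} {score:.3f}")
```

Replacing `X` and `labels` with the real arrays should be enough to run the same comparison on the actual dataset.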

You should also try different values for the C parameter. In most machine learning algorithms, constants that weight the individual loss terms of a combined objective (which is also the case here) have a "multiplicative" impact (for lack of a better word), so I advise trying values on a logarithmic scale for C: [1e-3, 1e-2, 1e-1, 1, 10, 100]
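One way to try those C values is a cross-validated grid search. This is only a sketch: the data is a synthetic stand-in (the real dataset was not posted), and the 'rbf' kernel and 5-fold setting are assumptions carried over for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Stand-in data (assumption): 800 rows and 3 features, matching the shape in the question.
X, labels = make_classification(
    n_samples=800, n_features=3, n_informative=3, n_redundant=0, random_state=0
)

# The logarithmically spaced C values suggested above.
param_grid = {"C": [1e-3, 1e-2, 1e-1, 1, 10, 100]}

# Fit one SVC per C value with 5-fold cross-validation and keep the best one.
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5, scoring="accuracy")
search.fit(X, labels)

print("best C:", search.best_params_["C"])
print("mean cross-validated accuracy:", round(search.best_score_, 3))
```

Other parameters, such as `gamma` for the 'rbf' kernel, can be added to `param_grid` in the same way.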

