
100% accuracy from libsvm

Asked 2014-01-27 23:08:56. Tags: machine-learning, classification, svm, libsvm, cross-validation

I'm training and cross-validating (10-fold) data using libSVM (with a linear kernel).

Each datapoint consists of 1800 fMRI voxel intensities. There are around 88 datapoints in the training-set file for svm-train.

The training-set file looks as follows:

+1 1:0.9 2:-0.2 ... 1800:0.1

-1 1:0.6 2:0.9 ... 1800:-0.98

...

I should also mention that I'm using the svm-train script (which came with the libSVM package).

The problem is that when I run svm-train, it reports 100% accuracy!

This doesn't seem to reflect the true classification results! The data isn't unbalanced, since

#datapoints labeled +1 == #datapoints labeled -1
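A balance check like this can be sketched in a few lines of Python. This is only an illustration: the helper names `parse_libsvm_line` and `count_labels` are made up for this example, and the two sample lines are a shortened, hypothetical stand-in for the real training-set file.

```python
from collections import Counter


def parse_libsvm_line(line):
    """Parse one 'label index:value ...' line into (label, {index: value})."""
    parts = line.split()
    label = int(float(parts[0]))  # handles "+1" and "-1"
    features = {}
    for token in parts[1:]:
        index, value = token.split(":")
        features[int(index)] = float(value)
    return label, features


def count_labels(lines):
    """Count how many datapoints carry each label, skipping blank lines."""
    return Counter(parse_libsvm_line(line)[0] for line in lines if line.strip())


# Hypothetical two-line sample in the same layout as the training-set file.
sample = [
    "+1 1:0.9 2:-0.2 1800:0.1",
    "-1 1:0.6 2:0.9 1800:-0.98",
]
counts = count_labels(sample)
print("labeled +1:", counts[1], "labeled -1:", counts[-1])
```

For a real file, the lines would come from `open(path)` instead of the `sample` list.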

I've also checked the scaler (it scales correctly), and I tried changing the labels randomly to see how that affects the accuracy: it only decreases from 100% to 97.9%.

Could you please help me understand the problem? If so, what can I do to fix it?

Thanks,

Gal Star

1 Answer

Make sure you include '-v 10' in the svmtrain options. I'm not sure whether your 100% accuracy comes from the training sample or the validation sample. It is quite possible to get 100% training accuracy, since you have far fewer samples than features. But if your model suffers from overfitting, the validation accuracy may be low.
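To illustrate the point about having fewer samples than features, here is a rough sketch in plain Python. It does not use libsvm: a simple perceptron stands in for the linear SVM, and the data are random numbers with random balanced labels (88 points, 1800 features, chosen only to mirror the sizes in the question). All function names here are made up for the example. Because there is nothing to learn from random labels, any high accuracy on the training set is pure overfitting, while the 10-fold cross-validated accuracy should stay far below it.

```python
import random


def decision(w, b, x):
    """Linear decision value w.x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_perceptron(X, y, max_epochs=200):
    """Perceptron used as a stand-in for a linear SVM (not libsvm itself)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            if yi * decision(w, b, xi) <= 0:  # misclassified: update
                w = [wj + yi * xj for wj, xj in zip(w, xi)]
                b += yi
                mistakes += 1
        if mistakes == 0:  # every training point is separated
            break
    return w, b


def accuracy(w, b, X, y):
    """Fraction of points whose predicted sign matches the label."""
    hits = sum(1 for xi, yi in zip(X, y) if yi * decision(w, b, xi) > 0)
    return hits / len(y)


def cross_val_accuracy(X, y, folds=10):
    """k-fold cross-validation: each point is predicted by a model
    that never saw it during training."""
    n = len(y)
    hits = 0
    for f in range(folds):
        held_out = set(range(f, n, folds))
        X_train = [X[i] for i in range(n) if i not in held_out]
        y_train = [y[i] for i in range(n) if i not in held_out]
        w, b = train_perceptron(X_train, y_train)
        hits += sum(1 for i in held_out if y[i] * decision(w, b, X[i]) > 0)
    return hits / n


rng = random.Random(0)
n_samples, n_features = 88, 1800  # sizes taken from the question
X = [[rng.uniform(-1.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
y = [1] * (n_samples // 2) + [-1] * (n_samples // 2)
rng.shuffle(y)  # labels carry no information about X

w, b = train_perceptron(X, y)
train_acc = accuracy(w, b, X, y)
cv_acc = cross_val_accuracy(X, y)
print("training accuracy:", train_acc)
print("10-fold CV accuracy:", cv_acc)
```

With libsvm itself, the analogous check would be running svm-train with '-t 0 -v 10' on the training-set file, which reports the cross-validation accuracy instead of writing a model. If shuffled labels still give a score near 100%, the number is most likely a training accuracy rather than a validation accuracy.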