
How to choose C and gamma AFTER grid search using libSVM (RBF kernel) for best possible generalisation?

Original post: 2014-09-10 19:17:28. Tags: machine-learning, kernel, svm, libsvm, cross-validation

I am aware of the abundance of questions asking about choosing the 'best' C and gamma values for an SVM (RBF kernel). The standard answer is a grid search; however, my question starts after the results of the grid search. Let me explain:

I have a data set of 10 subjects on which I perform leave-one-subject-out cross-validation, meaning I perform a grid search for each left-out subject. In order not to optimise on this training data, I do not want to choose the best C and gamma by averaging the accuracy over all 10 models and searching for the maximum. Considering one model within the cross-validation, I could perform another cross-validation only on the training data within this model (not involving the left-out validation subject). But you can imagine the computational effort, and I do not have enough time for this at the moment.
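To give a feel for the computational effort mentioned above, here is a minimal sketch that only enumerates the splits of a nested leave-one-subject-out scheme, without training anything. The function names and the integer subject IDs are assumptions made for illustration; they are not part of libSVM.

```python
def leave_one_subject_out(subjects):
    """Yield (training_subjects, held_out_subject) once per subject."""
    for held_out in subjects:
        training = [s for s in subjects if s != held_out]
        yield training, held_out


def nested_splits(subjects):
    """For every outer held-out subject, list the inner splits that are
    built only from the remaining (outer training) subjects."""
    for outer_training, test_subject in leave_one_subject_out(subjects):
        inner = list(leave_one_subject_out(outer_training))
        yield test_subject, inner


subjects = list(range(10))  # hypothetical subject IDs 0..9
all_splits = list(nested_splits(subjects))

# Number of inner folds that would have to be trained per (C, gamma) grid point.
inner_fold_count = sum(len(inner) for _, inner in all_splits)
```

With 10 subjects this gives 10 outer folds with 9 inner folds each, so every point of the C-gamma grid would need 90 inner training runs on top of the 10 outer ones.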

Since the grid search for each of the 10 models resulted in a wide range of good C and gamma parameters (accuracies differing by only 2-4%, see Figure 1), I thought about a different approach.

I defined a region within the grid that contains only the accuracies within 2% of the maximum accuracy of that grid. All other accuracy values, which differ from the maximum by more than 2%, are set to zero (see Figure 2). I do this for every model and build the intersection of the regions of all models. This results in a much smaller region of C and gamma values that would produce accuracies within 2% of the maximum accuracy for each model. However, the range is still rather big. So I thought about choosing the C-gamma pair with the lowest C, as this would mean that I am furthest away from overfitting and closer to a good generalisation. Can I argue like that?
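The region-and-intersection procedure described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code used for the figures: each grid is assumed to be a dict mapping a (C, gamma) tuple to an accuracy between 0 and 1, the function names are hypothetical, and the accuracy values are made up.

```python
def near_optimal_region(grid, tolerance=0.02):
    """Return the (C, gamma) pairs whose accuracy lies within
    `tolerance` of the best accuracy in this grid."""
    best = max(grid.values())
    return {params for params, acc in grid.items() if best - acc <= tolerance}


def shared_region(grids, tolerance=0.02):
    """Intersect the near-optimal regions of all per-model grids."""
    regions = [near_optimal_region(grid, tolerance) for grid in grids]
    return set.intersection(*regions)


def pick_lowest_c(region):
    """Pick the pair with the smallest C; ties are broken by smaller gamma."""
    return min(region)


# Made-up accuracies for two of the left-out-subject models.
model_a = {(1, 0.01): 0.80, (1, 0.1): 0.85, (10, 0.01): 0.86, (10, 0.1): 0.83}
model_b = {(1, 0.01): 0.70, (1, 0.1): 0.78, (10, 0.01): 0.79, (10, 0.1): 0.75}

candidates = shared_region([model_a, model_b])
choice = pick_lowest_c(candidates)
```

Note that the intersection can be empty if the models disagree strongly, in which case `pick_lowest_c` raises a `ValueError`. Whether the lowest C is the right tie-breaker is exactly the open question here; the sketch only encodes that rule.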

How would I generally choose a C and gamma within this region of C-gamma pairs, which all proved to be reliable settings for my classifier in all 10 models? Should I focus on minimising the C parameter? Or should I focus on minimising both C and gamma?

I found a related answer here (Are high values for c or gamma problematic when using an RBF kernel SVM?) that says a combination of high C and high gamma would mean overfitting. I understand that the value of gamma changes the width of the Gaussian curve around the data points, but I still can't get my head around what it practically means within a data set.
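One way to get a feel for gamma is to evaluate the RBF kernel directly. LIBSVM defines it as exp(-gamma * |u - v|^2), so gamma controls how quickly the similarity between two points decays with their distance. The sketch below is plain Python with made-up points; the helper name `rbf_kernel` is an assumption for illustration.

```python
import math


def rbf_kernel(u, v, gamma):
    """RBF kernel value exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


point = [0.0, 0.0]
neighbour = [1.0, 0.0]  # at Euclidean distance 1 from `point`

# Small gamma: the two points still look almost identical to the SVM.
similarity_small_gamma = rbf_kernel(point, neighbour, gamma=0.01)

# Large gamma: the same two points look almost unrelated.
similarity_large_gamma = rbf_kernel(point, neighbour, gamma=10.0)
```

With a small gamma each training point influences a wide neighbourhood, which tends to give smooth decision boundaries. With a large gamma each point only influences its immediate surroundings, so the boundary can wrap tightly around individual training points.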

The post brought me to another idea. Could I use the number of SVs relative to the number of data points as a criterion to choose between all the C-gamma pairs? Would a low ratio (number of SVs / number of data points) mean better generalisation? I am willing to lose accuracy, as it shouldn't affect the outcome I am interested in, if in return I get a better generalisation (at least from a theoretical point of view).
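The ratio proposed above could be computed from a trained LIBSVM model roughly like this. A LIBSVM model file starts with a short header that includes a `total_sv` entry, so the sketch just reads that entry from the header text. The header below is made up, and the helper names are assumptions; the sketch computes the ratio only and does not settle whether a lower ratio implies better generalisation.

```python
def total_support_vectors(model_text):
    """Read the `total_sv` entry from the header of a LIBSVM model file."""
    for line in model_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "total_sv":
            return int(parts[1])
        if parts == ["SV"]:  # end of header, support vectors follow
            break
    raise ValueError("no total_sv entry found in the model header")


def sv_ratio(model_text, n_training_points):
    """Number of support vectors divided by number of training points."""
    return total_support_vectors(model_text) / n_training_points


# Made-up header of a two-class RBF model.
example_header = """svm_type c_svc
kernel_type rbf
gamma 0.1
nr_class 2
total_sv 30
rho 0.25
label 1 -1
nr_sv 15 15
SV
"""

ratio = sv_ratio(example_header, n_training_points=120)
```

The ratio could then be compared across the candidate C-gamma pairs, for example by averaging it over the 10 models for each pair.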

Figure 1: Balanced accuracy after grid search

Figure 2: Balanced accuracy after applying my region and intersection criteria

1 answer

Since the linear kernel is a special case of the RBF kernel, there is a method that first tunes C using a linear SVM and then does a bilinear tuning of the C-gamma pair to save time.

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.141.880&rep=rep1&type=pdf