
Does it make sense to have negative epsilon when tuning a linear-SVM model in R?

I'm using the following tuning code to find the best cost and epsilon for my SVM model.

tuneResult <- tune(
    svm, 
    labels ~ ., 
    data = dataset, 
    ranges = list(epsilon = seq(-5.0, 5, 0.1), cost = 2^(0:3)))

But surprisingly it suggests cost = 4 and epsilon = -5!

Then I trained the model using these parameters and tested with confusionMatrix. Unfortunately, the model is not as accurate as a model without these parameters.

model1 <-  svm(labels ~ ., data = dataset, kernel = "linear", cost = 4 , epsilon = -5)
model2 <-  svm(labels ~ ., data = dataset, kernel = "linear")

Am I missing something here?

tl;dr

The issue is in your tuneResult command, where you allow epsilon to vary over the range [-5, +5]. This makes no sense, as epsilon is only defined for values >= 0. The fact that tuneResult returns epsilon = -5 suggests a convergence failure/issue when trying to find an optimal set of (hyper)parameters. Unfortunately, without (sample) data it is hard to get a feeling for any (potential) computational challenges in the classification model.


The role/interpretation of epsilon

Just to recap: In SVMs, epsilon describes the tolerance margin (the "insensitivity zone") within which classification errors are not penalised (you should take a look at ?e1071::svm to find out about the default value for epsilon). In the limit of epsilon approaching zero from the right, all classification errors are penalised, resulting in a maximal number of support vectors (as a function of epsilon). See e.g. here for a lot more details on the interpretation/definition of the various SVM (hyper)parameters.
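A minimal sketch of this effect (not from the original post): in e1071, epsilon enters the eps-insensitive loss used by eps-regression, so the relationship is easiest to see there. Shrinking epsilon towards zero leaves more points outside the insensitivity tube, so more of them become support vectors; the data and the epsilon grid below are illustrative choices.

```r
# Sketch: count support vectors for eps-regression fits with
# shrinking epsilon (assumes the e1071 package is installed).
library(e1071)

set.seed(1)
x <- seq(0, 10, length.out = 100)
y <- sin(x) + rnorm(100, sd = 0.2)
df <- data.frame(x = x, y = y)

# Fit one model per epsilon value and record the total number of SVs
n_sv <- sapply(c(0.5, 0.1, 0.01), function(eps) {
    fit <- svm(y ~ x, data = df, epsilon = eps)
    fit$tot.nSV  # total number of support vectors
})
n_sv  # the SV count grows as epsilon shrinks towards zero
```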

Hyperparameter optimisation and convergence

Let's return to the question of why the optimisation failed to converge: I think the issue arises from trying to simultaneously optimise both the cost and epsilon parameters. As epsilon gets smaller and smaller, you penalise misclassifications more and more (reducing the number of support vectors); at the same time, by allowing larger and larger cost parameters you allow more and more support vectors to be included, counter-balancing the misclassifications from small epsilons. During cross-validation this essentially drives the model towards ever smaller epsilon and ever larger cost hyperparameters.

An example

We can reproduce this behaviour using some simulated data for an SVM classification problem.

  1. Let's generate some sample data

     # Sample data
     set.seed(1)
     x <- rbind(
         matrix(rnorm(10 * 2, mean = 0), ncol = 2),
         matrix(rnorm(10 * 2, mean = 2), ncol = 2))
     y <- c(rep(-1, 10), rep(1, 10))
     df <- data.frame(x = x, y = as.factor(y))
  2. Let's simultaneously tune the epsilon and cost hyperparameters. We use the same ranges as in your original post, including the nonsensical (i.e. negative) epsilon values.

     # tune epsilon and cost hyperparameters
     library(e1071)    # tune() and svm() come from e1071
     tuneResult <- tune(
         svm,
         y ~ .,
         data = df,
         ranges = list(epsilon = seq(-5, 5, 0.01), cost = 2^(0:3)))
     tuneResult
     #
     # Parameter tuning of 'svm':
     #
     # - sampling method: 10-fold cross validation
     #
     # - best parameters:
     #  epsilon cost
     #       -5    4
     #
     # - best performance: 0.1

    You can see how the epsilon and cost parameters tend to their respective minimal/maximal extremes.
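As a sanity check, one can re-run the tuning on the same simulated data with epsilon restricted to its valid, non-negative range. This is a sketch: the particular grid seq(0, 1, 0.1) is an illustrative choice, not taken from the original post.

```r
# Re-tune with a non-negative epsilon grid (assumes e1071 is installed)
library(e1071)

# Recreate the simulated data from step 1
set.seed(1)
x <- rbind(
    matrix(rnorm(10 * 2, mean = 0), ncol = 2),
    matrix(rnorm(10 * 2, mean = 2), ncol = 2))
y <- c(rep(-1, 10), rep(1, 10))
df <- data.frame(x = x, y = as.factor(y))

# Same tuning call, but with epsilon >= 0 (grid is an illustrative choice)
tuneResult <- tune(
    svm,
    y ~ .,
    data = df,
    ranges = list(epsilon = seq(0, 1, 0.1), cost = 2^(0:3)))
tuneResult$best.parameters  # now yields a valid, non-negative epsilon
```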
