
Limit the number of support vectors in R svm package e1071?

I use the svm function in the e1071 package. As far as I understand, the basic functionality of svm is to separate two linearly separable classes with a hyperplane (support vectors). More advanced features let you perform numeric regression and non-linear separation. When I started testing it on some data, I could not understand why the model's output contains a huge set of support vectors. When I ran it on some very simple test examples, I got the same result. Here is an example:

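library(e1071)  # provides svm()
X = rnorm(1000)
Y = rnorm(1000)
# Z is TRUE exactly when the point lies above the line X + Y = 0
data = data.frame(X, Y, Z = as.factor(X + Y > 0))
model = svm(formula = Z ~ X + Y, data = data, kernel = "linear")
model  # print the fitted model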

Here is the result:

Call:
svm(formula = Z ~ X + Y, data = data, kernel = "linear")


Parameters:
   SVM-Type:  C-classification
 SVM-Kernel:  linear
       cost:  1
      gamma:  0.5

Number of Support Vectors:  102

This example is clearly linearly separable, and I would expect a single support vector to be enough. If you are not convinced, you can run:

library(ggplot2)  # provides ggplot(), aes(), geom_point()
ggplot(data, aes(X, Y, col = Z)) + geom_point()

What is the meaning of the 102 Support Vectors? Why isn't the number of Support Vectors a parameter?

Thanks

There is a distinction between a soft-margin and a hard-margin SVM. It is controlled by the cost parameter, which penalises the degree to which training points are allowed to violate the margin constraint. With a small cost such as the default of 1, the margin is wide and every point that lies on or inside it becomes a support vector, which is why you see 102 of them. So you will get the minimal number of support vectors (which is three, not one) only when the cost parameter is very large. Try, for example, values of cost above 10000 and you should get three support vectors.
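
To see the effect of the cost parameter, here is a minimal sketch that refits the question's model at a few cost values and reports the support vector count for each. The seed and the particular cost values are arbitrary choices for illustration, not part of the original question, so the exact counts you get may differ from the 102 shown above.

```r
library(e1071)

set.seed(42)  # arbitrary seed, only so repeated runs give the same data
X <- rnorm(1000)
Y <- rnorm(1000)
data <- data.frame(X, Y, Z = as.factor(X + Y > 0))

# Refit the same linear SVM with increasing cost values (illustrative choices)
costs <- c(1, 100, 10000)
n_sv <- sapply(costs, function(cst) {
  fit <- svm(Z ~ X + Y, data = data, kernel = "linear", cost = cst)
  fit$tot.nSV  # total number of support vectors in the fitted model
})

print(data.frame(cost = costs, n_sv = n_sv))
```

The count should drop sharply as cost grows, because a larger cost narrows the margin and leaves fewer points on or inside it. Very large cost values can make the optimiser slow, so it may be worth starting with moderate values.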


 