Error in formula and no data argument in R

Question

I was writing code that fits an SVM with a polynomial kernel. The code is as follows.

library(ISLR)
library(e1071)
library(randomForest)
library(class)
library(ggplot2)
library(GGally)

train = subset(wifiLocDat, Loc3 == TRUE)
test  = subset(wifiLocDat, Loc3 == FALSE)
set.seed(4343)
tune.out <- tune(svm, wifiLocDat$Loc3~wifiLocDat$WiFi1 + wifiLocDat$WiFi2 + wifiLocDat$WiFi3 + wifiLocDat$WiFi4 + wifiLocDat$WiFi5 + wifiLocDat$WiFi6 + wifiLocDat$WiFi7, data=wifiLocDat,       kernel="polynomial", ranges=list(degree=c(1,2,3,4,5,6)))
summary(tune.out)
svmPoly <- svm(Train$Loc3~., data=Train, kernel="polynomial",coef0=1,degree = 3)

Output of dput(head(wifiLocDat,20)):

structure(list(WiFi1 = c(-64L, -68L, -63L, -61L, -63L, -64L, -65L, -61L, -65L, -62L, -67L, -65L, -63L, -66L, -61L, -67L, -63L, -60L, -60L, -62L), WiFi2 = c(-56L, -57L, -60L, -60L, -65L, -55L, -61L, -63L, -60L, -60L, -61L, -59L, -57L, -60L, -59L, -60L, -56L, -54L, -58L, -59L), WiFi3 = c(-61L, -61L, -60L, -68L, -60L, -63L, -65L, -58L, -59L, -66L, -62L, -61L, -61L, -65L, -65L, -59L, -60L, -59L, -60L, -63L), WiFi4 = c(-66L, -65L, -67L, -62L, -63L, -66L, -67L, -66L, -63L, -68L, -67L, -67L, -65L, -62L, -63L, -61L, -62L, -65L, -61L, -64L), WiFi5 = c(-71L, -71L, -76L, -77L, -77L, -76L, -69L, -74L, -76L, -80L, -77L, -72L, -73L, -70L, -74L, -71L, -70L, -73L, -73L, -70L), WiFi6 = c(-82L, -85L, -85L, -90L, -81L, -88L, -87L, -87L, -86L, -86L, -83L, -86L, -84L, -85L, -89L, -86L, -84L, -83L, -84L, -84L), WiFi7 = c(-81L, -85L, -84L, -80L, -87L, -83L, -84L, -82L, -82L, -91L, -91L, -81L, -84L, -83L, -87L, -91L, -91L, -84L, -88L, -84L), Loc3 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("FALSE", "TRUE"), class = "factor")), row.names = c(NA, 20L), class = "data.frame")

I got the error: Error in terms.formula(formula, data = data): '.' in formula and no 'data' argument

What am I doing wrong?

Answer 1

There are some issues with your code. First of all, I don't think it is a good idea to split your data into training and test sets based on the value of the outcome, as you do. That way the training set contains only the rows where Loc3 is TRUE and the test set only the rows where it is FALSE, so neither set contains all the levels of the outcome.
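To make the problem with that split concrete, here is a small sketch in base R. The data frame `d` and its values are made up for illustration and only stand in for `wifiLocDat`.

```r
# Hypothetical toy data standing in for wifiLocDat
d <- data.frame(
  WiFi1 = c(-64, -68, -63, -61, -50, -52, -49, -55),
  Loc3  = factor(c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE))
)

# Split on the outcome itself, as in the question
train <- subset(d, Loc3 == "TRUE")
test  <- subset(d, Loc3 == "FALSE")

# Each set now holds observations from a single class only,
# so a classifier fitted on train never sees a FALSE row
table(train$Loc3)
table(test$Loc3)
```

A classifier needs examples of every class in the training set, which is why the split should be made on row indices rather than on the outcome.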

Here is a version of your code that runs without problems (I had to add the missing level to the data because the first 20 rows of your data.frame do not contain any TRUE values in the outcome):

# e1071 provides svm() and tune()
library(e1071)
# caret provides the createDataPartition function
library(caret)

# add some TRUE values to the outcome
wifiLocDat$Loc3 <- c(rep(FALSE, 10), rep(TRUE, 10))
# transform back to factor
wifiLocDat$Loc3 <- as.factor(wifiLocDat$Loc3)
# create a stratified index for data splitting (70% training)
ind <- createDataPartition(wifiLocDat$Loc3, p = 0.7, list = FALSE)
train <- wifiLocDat[ind, ]
test  <- wifiLocDat[-ind, ]

set.seed(4343)
# refer to columns by bare name and let data= supply them
tune.out <- tune("svm", Loc3 ~ ., data = wifiLocDat, kernel = "polynomial", ranges = list(degree = c(1, 2, 3, 4, 5, 6)))
summary(tune.out)
svmPoly <- svm(Loc3 ~ ., data = train, kernel = "polynomial", coef0 = 1, degree = 3)

All of this runs without errors.
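The example above fits the model on the training rows but stops before using the test rows. A minimal sketch of that next step follows, assuming the e1071 package is installed. The data are simulated (the data frame `d`, its two `WiFi` columns and the rule that generates `Loc3` are invented for illustration), and the split uses a plain `sample()` rather than a stratified partition.

```r
library(e1071)

set.seed(4343)
# Simulated stand-in for wifiLocDat: two signal columns, two-level outcome
n <- 200
d <- data.frame(WiFi1 = rnorm(n, -60, 5), WiFi2 = rnorm(n, -60, 5))
d$Loc3 <- factor(d$WiFi1 > d$WiFi2)

# Simple random split on row indices, 70% for training
ind   <- sample(n, size = 0.7 * n)
train <- d[ind, ]
test  <- d[-ind, ]

# Same kind of model call as above, with bare column names in the formula
svmPoly <- svm(Loc3 ~ ., data = train, kernel = "polynomial",
               coef0 = 1, degree = 3)

# Predict the held-out rows and cross-tabulate against the true labels
pred <- predict(svmPoly, newdata = test)
table(predicted = pred, actual = test$Loc3)
```

The resulting table is a confusion matrix: the diagonal counts the test rows that were classified correctly.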

However, I cannot reproduce your error with the sample data you posted. When I run your code I get a different error:

Error in predict.svm(ret, xhold, decision.values = TRUE): Model is empty!

I think this happens because the training set does not contain all the possible values of the outcome.
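As for the message in your question: it is the one base R raises when a formula contains `.` but there is no data frame available to expand it. The sketch below reproduces the message outside of `svm` with a made-up data frame `df`. It only illustrates what the message means and does not show which call in your session triggered it.

```r
# Hypothetical toy data frame
df <- data.frame(y = c(0, 1, 0, 1), a = 1:4, b = 4:1)

# With a data frame supplied, '.' expands to the remaining columns
labels(terms(y ~ ., data = df))

# Without one, '.' cannot be expanded and R stops with an error;
# tryCatch captures the message instead of aborting the script
msg <- tryCatch(
  {
    terms(y ~ .)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

So whenever this message appears, it is worth checking that the object passed as `data` exists under exactly that name and really is a data frame at the point of the call.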

solution1
0 2021-04-29 13:52:34
