
Errors while performing caret tuning in R

I am building a predictive model with caret/R and I am running into the following problems:

  1. When trying to execute the training/tuning, I get this error:

Error in if (tmps < .Machine$double.eps^0.5) 0 else tmpm/tmps : missing value where TRUE/FALSE needed

After some research, it appears that this error occurs when there are missing values in the data, which is not the case in this example (I confirmed that the data set has no NAs). However, I also read somewhere that missing values may be introduced during caret's resampling routine, which I suspect is what is happening.

  2. In an attempt to solve problem 1, I tried "pre-processing" the data during the resampling in caret by removing zero-variance and near-zero-variance predictors and imputing missing values with caret's k-nearest-neighbour imputation, via preProcess(c('zv','nzv','knnImpute')), but now I get the following error:

Error: Matrices or data frames are required for preprocessing

Needless to say, I checked and confirmed that the input data sets are indeed matrices, so I don't understand why I get this second error.

The code follows:

x.train <- predict(dummyVars(class ~ ., data = train.transformed),train.transformed)
y.train <- as.matrix(select(train.transformed,class))
vbmp.grid <- expand.grid(estimateTheta = c(TRUE,FALSE))
adaptive_trctrl <- trainControl(method = 'adaptive_cv',
                   number = 10, 
                   repeats = 3,
                   search = 'random',
                   adaptive = list(min = 5, alpha = 0.05, 
                                   method = "gls", complete = TRUE),
                   allowParallel = TRUE)
fit.vbmp.01 <- train( 
                 x = (x.train),
                 y = (y.train),
                 method = 'vbmpRadial',
                 trControl = adaptive_trctrl,
                 preProcess(c('zv','nzv','knnImpute')),
                 tuneGrid = vbmp.grid)

The only difference between the code for problems (1) and (2) is that in (1), the pre-processing line in the train statement is commented out.
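A possible cause of the second error, offered as a reading of the code above rather than a confirmed diagnosis: inside train(), preProcess(c('zv','nzv','knnImpute')) is written as a function call, not as a named argument. caret's preProcess() function then receives the character vector as its data and rejects it, whatever the class of x.train. train() accepts the method names through its preProcess argument instead. The sketch below uses iris (three classes) and method = "knn" purely as stand-ins for the asker's data and 'vbmpRadial'; 'knnImpute' from the question can be added to the vector in the same way and is left out only to keep the sketch minimal.

```r
library(caret)

# What the train() call in the question effectively evaluates: preProcess()
# called with a character vector where it expects a matrix or data frame.
msg <- tryCatch(
  preProcess(c("zv", "nzv", "knnImpute")),
  error = function(e) conditionMessage(e)
)
print(msg)

# Passing the methods as a named argument instead.
# iris and method = "knn" are stand-ins, not the asker's setup.
set.seed(1)
fit <- train(
  x = iris[, 1:4],
  y = iris$Species,                 # factor outcome with 3 levels
  method = "knn",
  preProcess = c("zv", "nzv"),      # note the "=": an argument, not a call
  trControl = trainControl(method = "cv", number = 5)
)
print(fit)
```

If this reading is right, only the `preProcess = c(...)` line needs to change in the original call; the rest of the arguments can stay as they are.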

In summary:

- There are no missing values in the data

- Both x.train and y.train are definitely matrices

- I tried using the standard 'repeatedcv' method instead of 'adaptive_cv' in trainControl, with exactly the same outcome

- I forgot to mention that the outcome class has 3 levels

Does anyone have any suggestions as to what may be going wrong?

As always, thanks in advance.

reyemarr

I had the same problem with my data. After some digging, I found that one of the columns contained some Inf (infinite) values.
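To illustrate how an Inf could lead to the error in the question even though the data has no NAs: the failing expression compares something called tmps against a tolerance, and if tmps is a standard deviation (an assumption based on the name, not something I have verified in the package source), a column containing Inf produces NaN there, and `if` cannot branch on the result. A minimal base-R sketch with a made-up vector:

```r
# Hypothetical column: no NA values, but one infinite entry.
x <- c(1.5, 2.0, Inf)

anyNA(x)            # FALSE: a check for NAs passes
all(is.finite(x))   # FALSE: is.finite() also flags Inf, -Inf and NaN

# The standard deviation involves Inf - Inf, which is NaN.
s <- sd(x)
is.nan(s)           # TRUE

# Comparing NaN with a number gives NA, and `if` cannot branch on NA.
msg <- tryCatch(
  if (s < .Machine$double.eps^0.5) 0 else mean(x) / s,
  error = function(e) conditionMessage(e)
)
print(msg)
```

So a data set can pass an NA check and still trigger this error, which is why checking with is.finite() is more informative than checking with is.na() alone.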

After removing them (df <- df %>% filter(!is.infinite(variable))), the computation ran without error.
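The filter above checks a single column. As a rough sketch of the same idea applied to every numeric column at once, here is a base-R version using a made-up data frame (the column names are hypothetical). Dropping rows is only one option; depending on the data, replacing or capping the offending values may be more appropriate.

```r
# Hypothetical data frame with one infinite value and no NAs.
df <- data.frame(
  a = c(1, 2, Inf, 4),
  b = c(0.5, 0.1, 0.3, 0.9),
  class = factor(c("x", "y", "z", "x"))
)

# Names of the numeric columns only; the factor outcome is skipped.
num_cols <- names(df)[vapply(df, is.numeric, logical(1))]

# Count non-finite entries (Inf, -Inf, NaN, NA) per numeric column.
bad_counts <- vapply(df[num_cols], function(col) sum(!is.finite(col)), integer(1))
print(bad_counts)

# Keep only the rows where every numeric column is finite.
keep <- Reduce("&", lapply(df[num_cols], is.finite))
df_clean <- df[keep, , drop = FALSE]
print(nrow(df_clean))   # 3
```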
