
Random forest error: NA not permitted in predictors

Hi, I am using the following R script to build a random forest:

# load the necessary libraries
library(randomForest)

testPP <- numeric()

# load the dataset
QdataTrain <- read.csv('train.csv', header = FALSE)
QdataTest <- read.csv('test.csv', header = FALSE)

QdataTrainX <- subset(QdataTrain, select = -V1)
QdataTrainY <- as.factor(QdataTrain$V1)

QdataTestX <- subset(QdataTest, select = -V1)
QdataTestY <- as.factor(QdataTest$V1)
mdl <- randomForest(QdataTrainX, QdataTrainY)

This gives me the following error:

Error in randomForest.default(QdataTrainX, QdataTrainY) : 
  NA not permitted in predictors

However, I see no occurrence of NA in my data.

For reference, here is my data:

https://docs.google.com/file/d/0B0iDswLYaZ0zUFFsT01BYlRZU0E/edit

Does anyone know why this error is being thrown? I'll keep looking in the meantime. Thanks in advance for any help!

The given data does contain some missing values (7 in total):

sapply(QdataTrainX, function(x) sum(is.na(x)))

##  V2  V3  V4  V5  V6  V7  V8  V9 V10 V11 V12 V13 V14 V15 V16
##   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
## V17 V18 V19 V20 V21 V22 V23 V24 V25 V26 V27 V28 V29
##   0   0   0   0   0   0   1   1   1   1   1   1   1

Therefore, columns V23 to V29 have one missing value each.

which(is.na(QdataTrainX$V23))

## 318

This gives the row number of the missing value in V23.
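
Once the incomplete rows are located, one possible fix is to drop them before fitting. The following is only a minimal sketch using base R: the `train` data frame and its values are a made-up stand-in for `QdataTrain`, and the names `keep`, `dropped_rows`, `train_clean`, `train_x` and `train_y` are hypothetical. Whether dropping rows is appropriate for the real data is a judgment call.

```r
# Toy stand-in for QdataTrain (hypothetical values, not the asker's data)
train <- data.frame(
  V1 = c("a", "b", "a", "b"),
  V2 = c(1.0, 2.0, 3.0, 4.0),
  V3 = c(0.5, NA, 1.5, 2.5)
)

# TRUE for every row that has no NA in any column
keep <- complete.cases(train)

# Row numbers that are about to be dropped
dropped_rows <- which(!keep)

# Filter the full data frame first, then split into predictors and
# response, so that both stay aligned row for row
train_clean <- train[keep, ]
train_x <- subset(train_clean, select = -V1)
train_y <- as.factor(train_clean$V1)

# With the randomForest package loaded, the fit would then be:
# mdl <- randomForest(train_x, train_y)
```

Filtering before the split matters: removing rows from the predictors alone would leave the response one element longer. If losing rows is not acceptable, the randomForest package also provides `na.roughfix()`, which fills in missing values with column medians (or the most frequent level for factors).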

