
R: Random forest predict fails with NAs in predictors

The documentation (if I'm reading it correctly) says that the random forest predict function returns NA predictions for observations whose predictors contain NA.

NOTE: If the object inherits from randomForest.formula, then any data with NA are silently omitted from the prediction. The returned value will contain NA correspondingly in the aggregated and individual tree predictions (if requested), but not in the proximity or node matrices

However, when I call predict on a dataset with NAs in some predictors (7 of 2688 observations), the prediction fails with the following error:

Error in predict.randomForest(model, new.ds) : missing values in newdata

There is a slightly messy work-around that I would like to avoid if possible.

Am I doing or reading something wrong? Does it have something to do with the "inherits from randomForest.formula" clause?

Using some examples from the documentation:

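library(randomForest)

set.seed(1)
x <- data.frame(x1=gl(32, 5), x2=runif(160), y=rnorm(160))

# x/y interface: the fitted object does not inherit from randomForest.formula
rf1 <- randomForest(x[-3], x[[3]], ntree=10)
inherits(rf1, "randomForest.formula")
# [1] FALSE

# formula interface: the fitted object inherits from randomForest.formula
iris.rf <- randomForest(Species ~ ., data=iris, importance=TRUE,
                        proximity=TRUE)
inherits(iris.rf, "randomForest.formula")
# [1] TRUE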

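So you probably fitted your model with the x/y (non-formula) interface of randomForest. Only objects that inherit from randomForest.formula get the NA handling described in the note; for any other randomForest object, predict stops with the "missing values in newdata" error.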
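To make the difference concrete, here is a minimal sketch that fits the same model through both interfaces and predicts on data containing one NA. The data frame new.ds here is a made-up stand-in built from iris, and predict_with_na is a hypothetical helper, not part of the randomForest package; it is just one way to get NA-padded predictions from a model that was fitted without a formula.

```r
library(randomForest)

set.seed(1)
# Hypothetical new data: ten rows of iris predictors with one value set to NA.
new.ds <- iris[1:10, 1:4]
new.ds[3, "Sepal.Length"] <- NA

# Formula interface: the row with NA is omitted internally and returned as NA.
rf.formula <- randomForest(Species ~ ., data = iris, ntree = 50)
pred.formula <- predict(rf.formula, new.ds)

# x/y interface: predict() stops instead; capture the message for inspection.
rf.xy <- randomForest(iris[1:4], iris$Species, ntree = 50)
err <- tryCatch(predict(rf.xy, new.ds),
                error = function(e) conditionMessage(e))

# Hypothetical helper (an assumption, not a package function):
# predict on complete rows only and pad the remaining rows with NA.
predict_with_na <- function(model, newdata) {
  ok <- complete.cases(newdata)
  pred <- predict(model, newdata[ok, , drop = FALSE])
  idx <- rep(NA_integer_, nrow(newdata))
  idx[ok] <- seq_len(sum(ok))
  out <- unname(pred[idx])  # indexing with NA yields NA for factors and numerics
  names(out) <- rownames(newdata)
  out
}

pred.xy <- predict_with_na(rf.xy, new.ds)
```

If refitting is an option, switching to the formula interface is the simpler route; the helper is only useful when the model has to stay as it is.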


 