Received an error message using the train function with “rf” method

I tried the example posted on this site and followed the exact steps up to the train function.

library(dplyr)

data_train <- read.csv("https://raw.githubusercontent.com/guru99-edu/R-Programming/master/train.csv")

glimpse(data_train)

data_test <- read.csv("https://raw.githubusercontent.com/guru99-edu/R-Programming/master/test.csv")

glimpse(data_test)

library(randomForest)

library(caret)

library(e1071)

trControl <- trainControl(method = "cv",
    number = 10,
    search = "grid")

set.seed(1234)

rf_default <- train(Survived~., 
    data = data_train,
    method = "rf",
    metric = "Accuracy",
    trControl = trControl)

I used R versions 3.5.1 and 3.6.1 and got this error:

Error in na.fail.default(list(Survived = c(0L, 1L, 1L, 1L, 0L, 0L, 0L, : missing values in object

However, there are no missing values in the "Survived" variable.

Could someone please tell me what's wrong? Thank you.

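There are a few issues. The first is that the data contains NAs. The error comes from the default na.action = na.fail in train's formula interface, which checks every column used in the model, so missing values in the predictors trigger it even when Survived itself is complete. You can either impute these values or just omit the affected rows. For simplicity, I have omitted them.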
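To see where the missing values actually are, it can help to count them per column. Below is a minimal sketch on a small made-up data frame (toy_df and its values are assumptions for illustration, not the real train.csv); the same calls should work on data_train.

```r
# Toy stand-in for data_train; the values are made up for illustration
toy_df <- data.frame(
  Survived = c(0L, 1L, 1L, 0L, 1L),
  Age      = c(22, NA, 26, 35, NA),
  Fare     = c(7.25, 71.28, NA, 53.10, 8.05)
)

# Number of missing values in each column
na_counts <- colSums(is.na(toy_df))
print(na_counts)

# Number of rows that na.omit() would keep
complete_rows <- sum(complete.cases(toy_df))
print(complete_rows)
```

In this toy data the outcome column is complete while the predictors are not, which is the same kind of situation the error message points to.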

Second, the outcome needs to be a factor for classification. With a numeric Survived, caret treats the problem as regression.

set.seed(1234)

# Drop rows with missing values, then convert the outcome to a factor
new_data <- na.omit(data_train) %>%
  as_tibble() %>%
  mutate(Survived = as.factor(Survived))

rf_default <- train(Survived ~ .,
                    data = new_data,
                    method = "rf",
                    metric = "Accuracy",
                    trControl = trControl)
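If dropping rows loses too much data, imputing is the other option mentioned above. The following is only a rough sketch of median imputation in base R, again on a made-up data frame; toy_df and the helper impute_median are hypothetical names, and the sketch assumes every predictor is numeric.

```r
# Toy stand-in for data_train; the values are made up for illustration
toy_df <- data.frame(
  Survived = c(0L, 1L, 1L, 0L, 1L),
  Age      = c(22, NA, 26, 35, NA),
  Fare     = c(7.25, 71.28, NA, 53.10, 8.05)
)

# Hypothetical helper: replace NAs in a numeric vector with its median
impute_median <- function(x) {
  x[is.na(x)] <- median(x, na.rm = TRUE)
  x
}

# Impute every predictor column, leaving the outcome untouched
imputed <- toy_df
predictors <- setdiff(names(imputed), "Survived")
imputed[predictors] <- lapply(imputed[predictors], impute_median)

# The outcome still needs to be a factor for classification
imputed$Survived <- as.factor(imputed$Survived)

str(imputed)
```

All rows are kept this way, at the cost of replacing unknown values with a guess. Character or factor columns with missing values would need a different rule, and caret's preProcess also offers a "medianImpute" method if you would rather not hand-roll this.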
