
Random Forest Tree for classification

I am trying RF (random forest) for the first time. I am trying to predict the genre of a game based on the factors selected below.

library(dplyr)   # provides %>% and select()
library(tidyr)   # provides drop_na() and separate()

data <- read.csv("appstore_games.csv")
data <- data %>% drop_na()
data <- data %>% select(Average.User.Rating, User.Rating.Count, Price, Age.Rating, Genres)
data <- data %>% separate(Genres, c("Main Genre","Genre1","Genre2","Genre3"), extra = "drop" )
data1 <- data %>% select(Genre1 , Average.User.Rating, User.Rating.Count, Price )
str(data1)
data1$Genre1 <- as.factor(data1$Genre1)
set.seed(123)
sample <- sample(2 , nrow(data1),replace = TRUE, prob = c(0.7,0.3))
train_data <- data1[sample == 1,]
test_data <- data1[sample == 2,]
library(randomForest)
set.seed(1)
rf <- randomForest(train_data$Genre1 ~., data = train_data , proximity = TRUE, ntree = 200, importance = TRUE)

It shows an error at this point: Error in randomForest.default(m, y, ...): Can't have empty classes in y.

Can I know what is wrong here? Thanks. The genre has names such as Strategy, Entertainment, etc.

I am not completely sure, but I think this can happen if not all of the levels of your Y are represented in the training data. Maybe check this.
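One way to check this, sketched on made-up data rather than the App Store file (the data frame `toy`, its values, and the fixed row split are all assumptions for illustration): subsetting a factor keeps every original level, so a level with no rows in the training split shows up with a count of zero.

```r
# Toy data: "Puzzle" occurs only once, so it can easily be missing
# from a training subset. All values here are made up.
toy <- data.frame(
  Genre1 = factor(c("Strategy", "Strategy", "Entertainment",
                    "Entertainment", "Puzzle")),
  Price  = c(0, 1.99, 0, 0.99, 2.99)
)

# Use the first four rows as a stand-in for the training split.
toy_train <- toy[1:4, ]

# Subsetting keeps all original levels, including unused ones.
level_counts <- table(toy_train$Genre1)
print(level_counts)

# Levels with zero rows are candidates for the "empty classes" in the error.
empty_levels <- names(level_counts)[level_counts == 0]
print(empty_levels)
```

On the real data, the same check would be `table(train_data$Genre1)`, looking for any genre with a count of 0.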

My other idea is that one of your classes in Y is "None".

Try running train_data <- droplevels(train_data) before you pass the data to the model.
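A minimal sketch of what droplevels() does, using made-up data (the names `toy`, `toy_train`, and `toy_test` are hypothetical and not from the question): it removes factor levels that have no rows left after subsetting.

```r
# Made-up data: "Puzzle" appears only in the last row.
toy <- data.frame(
  Genre1 = factor(c("Strategy", "Strategy", "Entertainment",
                    "Entertainment", "Puzzle")),
  Price  = c(0, 1.99, 0, 0.99, 2.99)
)

# Stand-ins for the train/test split in the question.
toy_train <- toy[1:4, ]
toy_test  <- toy[5, , drop = FALSE]

# Before: the unused level "Puzzle" is still attached to the factor.
levels_before <- levels(toy_train$Genre1)

# droplevels() removes unused levels from every factor column.
toy_train <- droplevels(toy_train)
levels_after <- levels(toy_train$Genre1)

# Classes in the test rows that the training data no longer contains.
unseen <- setdiff(as.character(toy_test$Genre1), levels_after)

print(levels_before)
print(levels_after)
print(unseen)
```

In the question's code this would go right after train_data is created. Keep in mind that the test split may then contain genres the model never saw. Separately, as far as I can tell, writing the formula as Genre1 ~ . instead of train_data$Genre1 ~ . is safer, because with the latter the dot still expands to include the Genre1 column as a predictor.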
