
Running a randomForest loop and variable importance in R

I would like to run a randomForest regression 100 times in R, get the variable importance for each run, and write the results to a single CSV file (containing all 100 sets of variable importance). This is my code and the warning it produces:

result<-data.frame(IncMSE="%IncMSE", IncNodePurity="IncNodePurity")
for (i in 1:3){
imp[i]<- importance(randomForest(train[,1:11], train[,12], data = train,importance = TRUE, ntree =5000, proximity = TRUE, mtry=3))
results<-cbind(result,imp[i])  
}
write.csv(results,"D:/vari.csv")

Warning messages:
In imp[i] <- importance(randomForest(train[, 1:11], train[, 12],  :
number of items to replace is not a multiple of replacement length

How can I fix it? Many thanks.

There were a few small things: rbind instead of cbind, result vs. results (two names used for what should be one object), a names() conflict, indexing on the undefined object imp (assigning a whole importance matrix to the single element imp[i] is what triggers the warning), etc.:

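data("mtcars")
train <- mtcars
library(randomForest)

result <- data.frame()
for (i in 1:3) {
  # Fit a fresh forest on each pass; importance = TRUE is required for %IncMSE.
  # data = is dropped because it has no effect when x and y are passed directly.
  rf  <- randomForest(train[, 2:10], y = train[, 1], importance = TRUE, ntree = 5000, mtry = 3)
  imp <- importance(rf)  # matrix with one row per predictor
  # Stack this run's importance matrix under the previous runs
  result <- rbind(result, imp)
}
write.csv(result, "D:/vari.csv")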
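
Because rbind has to make the repeated row names unique (cyl, cyl1, cyl2, ...), the stacked output can be hard to read once there are 100 runs. One possible variation, sketched here under some assumptions, stores the run number and the variable name as ordinary columns. The helper name one_run, the output file name vari.csv, and the use of ntree = 500 (smaller than the 5000 above, chosen only to keep the sketch quick) are illustrative choices, not part of the original answer.

```r
library(randomForest)

# Hypothetical helper: fit one forest and return its importance as a data frame
# with explicit `run` and `variable` columns instead of row names.
one_run <- function(run, x, y, ntree = 500, mtry = 3) {
  rf  <- randomForest(x, y = y, importance = TRUE, ntree = ntree, mtry = mtry)
  imp <- importance(rf)  # matrix: one row per predictor
  out <- data.frame(run = run, variable = rownames(imp), imp,
                    check.names = FALSE)  # keep the "%IncMSE" column name as is
  rownames(out) <- NULL
  out
}

x <- mtcars[, 2:10]  # same predictors as in the answer above
y <- mtcars[, 1]     # response: mpg
n_runs <- 100

set.seed(1)  # assumption: fixed seed so the runs are reproducible
runs   <- lapply(seq_len(n_runs), one_run, x = x, y = y)
result <- do.call(rbind, runs)

# Assumed output path; adjust as needed (the question used "D:/vari.csv")
write.csv(result, "vari.csv", row.names = FALSE)
```

The resulting file has one row per variable per run, so the spread of each variable's importance across runs can be summarised afterwards, for example with aggregate(IncNodePurity ~ variable, data = result, FUN = mean).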
