
R: Efficient approach for tuning random forest hyperparameters

I have the following random forest (regression) model with the default parameters:

library(caret)  # provides trainControl() and train()

set.seed(42)

# Define train control
trControl <- trainControl(method = "cv",
    number = 10,
    search = "grid")

# Random Forest (regression) model
rf_reg <- train(Price.Gas~., data=data_train,
                    method = "rf",
                    metric = "RMSE",
                    trControl = trControl)

This is the output plot of the true values (black) and the predicted values (red).

I'd like the model to perform better by changing its tuning parameters (e.g. ntree, maxnodes, search, etc.).

I don't think changing them one by one is the most efficient way of doing this.

How could I efficiently test the parameters in R to obtain a better random forest (i.e. one that predicts the data well)?

You will need to perform some sort of hyperparameter search (grid or random), where you list all the values (or sequences of values) you want to test and then evaluate every combination to find the best configuration. This link explains the possible approaches with caret: https://rpubs.com/phamdinhkhanh/389752
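As a rough illustration of the grid search described above, here is a minimal sketch. It rests on a few assumptions: the built-in mtcars data (with mpg as the response) stands in for data_train and Price.Gas, which are not available here, and the candidate values for mtry, ntree and maxnodes are purely illustrative. With method = "rf", caret tunes only mtry through tuneGrid; ntree and maxnodes are passed through train() to randomForest(), so the sketch varies them in an outer loop.

```r
library(caret)          # train(), trainControl()
library(randomForest)   # backend used by method = "rf"

# Assumption: mtcars is a stand-in for the question's data_train
data_train <- mtcars

trControl <- trainControl(method = "cv", number = 5, search = "grid")

# caret's "rf" method exposes only mtry as a tuning parameter
# (the values below are illustrative, not recommendations)
tune_grid <- expand.grid(mtry = c(2, 4, 6, 8))

# ntree and maxnodes are forwarded to randomForest() via train()'s "...",
# so each combination gets its own train() call
outer_grid <- expand.grid(ntree = c(100, 300), maxnodes = c(5, 10))

results <- do.call(rbind, lapply(seq_len(nrow(outer_grid)), function(i) {
  set.seed(42)  # reuse the same CV folds for every configuration
  fit <- train(mpg ~ ., data = data_train,
               method = "rf",
               metric = "RMSE",
               tuneGrid = tune_grid,
               trControl = trControl,
               ntree = outer_grid$ntree[i],
               maxnodes = outer_grid$maxnodes[i])
  # Keep the cross-validated RMSE for each mtry, tagged with the outer settings
  out <- fit$results[, c("mtry", "RMSE")]
  out$ntree <- outer_grid$ntree[i]
  out$maxnodes <- outer_grid$maxnodes[i]
  out
}))

# Configuration with the lowest cross-validated RMSE
best <- results[which.min(results$RMSE), ]
print(best)
```

For a random search over mtry instead, one option is to set search = "random" in trainControl() and pass tuneLength to train() in place of tuneGrid. The number of model fits grows with the product of all candidate lists, so it is usually worth starting with a coarse grid.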
