How to predict the test set's confidence interval using a tuned model from tidymodels in R?

I am fitting a random forest model using tidymodels in R, and an error occurs when I try to predict on the test set using the tuned model: Each element of splits must be an rsplit object.

library(tidymodels)

# Data splitting
data(Sacramento, package = "modeldata")
set.seed(123)
data_split <- initial_split(Sacramento, prop = 0.75, strata = price)
Sac_train <- training(data_split)
Sac_test <- testing(data_split)

# Build the model
rf_mod <- rand_forest(mtry = tune(), min_n = tune(), trees = 1000) %>% 
          set_engine("ranger", importance = "permutation") %>% 
          set_mode("regression")

# Create the recipe
Sac_recipe <- recipe(price ~ ., data = Sac_train) %>% 
              step_rm(zip, latitude, longitude) %>% 
              step_corr(all_numeric_predictors(), threshold = 0.85) %>% 
              step_zv(all_numeric_predictors()) %>% 
              step_normalize(all_numeric_predictors()) %>%
              step_dummy(all_nominal_predictors())

# Create the workflow
rf_workflow <- workflow() %>% 
               add_model(rf_mod) %>% 
               add_recipe(Sac_recipe)

# Train and Tune the model
set.seed(123)
Sac_folds <- vfold_cv(Sac_train, v = 10, repeats = 2, strata = price)

rf_res <- rf_workflow %>% 
          tune_grid(grid = 2*2,
                    resamples = Sac_folds, 
                    control = control_grid(save_pred = TRUE),
                    metrics = metric_set(rmse))

# Extract the best model
rf_best <- rf_res %>%
           select_best(metric = "rmse")

# Last fit
last_rf_workflow <- rf_workflow %>% 
                    finalize_workflow(rf_best)

last_rf_fit <- last_rf_workflow %>% 
               last_fit(Sac_train)
# Error: Each element of `splits` must be an `rsplit` object.

predict(last_rf_fit, Sac_test, type = "conf_int")

The error comes from these lines:

last_rf_fit <- last_rf_workflow %>% 
               last_fit(Sac_train)

Now, from the documentation of last_fit:

# S3 method for workflow
last_fit(object, split, ..., metrics = NULL, control = control_last_fit())

So a workflow object is passed to last_fit as the first argument via %>%, and Sac_train is passed to the split parameter.

But according to the docs, the split argument needs to be:

An rsplit object created from rsample::initial_split().
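A quick way to see the difference is to check the class of each object. This is a minimal sketch that reuses the question's data_split and Sac_train; the names split_is_rsplit and train_is_rsplit are only illustrative.

```r
library(rsample)

data(Sacramento, package = "modeldata")
set.seed(123)
data_split <- initial_split(Sacramento, prop = 0.75, strata = price)
Sac_train <- training(data_split)

# The object returned by initial_split() carries the "rsplit" class
split_is_rsplit <- inherits(data_split, "rsplit")

# training() returns a plain data frame, which is why last_fit() rejects it
train_is_rsplit <- inherits(Sac_train, "rsplit")
```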

Instead, try this:

last_rf_fit <- last_rf_workflow %>% 
  last_fit(data_split)

Then, to collect the predictions, following the docs:

collect_predictions(last_rf_fit)
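The question also asks for confidence intervals on the test set. A last_fit() result is a table of results rather than a fitted model, so predict() is not expected to work on it directly. One possible approach, sketched here under assumptions, is to pull out the fitted workflow with extract_workflow() and predict from that. The sketch is deliberately reduced: it uses a formula instead of the recipe, and mtry = 3 and min_n = 5 are hypothetical stand-ins for the tuned values. It also assumes keep.inbag = TRUE is added to the engine, since, as far as I know, the ranger engine builds these intervals from ranger's standard-error estimates, which need the in-bag counts.

```r
library(tidymodels)

data(Sacramento, package = "modeldata")
set.seed(123)
data_split <- initial_split(Sacramento, prop = 0.75, strata = price)
Sac_test <- testing(data_split)

# Hypothetical fixed hyperparameters stand in for the tuned ones.
# Assumption: keep.inbag = TRUE lets ranger estimate standard errors.
rf_mod <- rand_forest(mtry = 3, min_n = 5, trees = 500) %>%
  set_engine("ranger", keep.inbag = TRUE) %>%
  set_mode("regression")

# A formula replaces the question's recipe to keep the sketch short
rf_workflow <- workflow() %>%
  add_model(rf_mod) %>%
  add_formula(price ~ beds + baths + sqft + type)

# last_fit() takes the rsplit object, fits on the training portion,
# and evaluates on the test portion
last_rf_fit <- last_fit(rf_workflow, data_split)

# Extract the fitted workflow, then predict intervals for the test set
fitted_wf <- extract_workflow(last_rf_fit)
test_ci <- predict(fitted_wf, new_data = Sac_test,
                   type = "conf_int", level = 0.95)
```

If this runs as intended, test_ci should hold one row per test observation with .pred_lower and .pred_upper columns. These are intervals around the predicted mean, not prediction intervals for individual sale prices.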
