How to use nest and mutate to create a model from a training set and then apply it to test data in R (tidymodels)

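library(tidymodels)

# Train is the asker's training data, with columns Groups, X, Y and Z
Train %>% nest(data = -Groups) %>% 
        mutate(fit = map(data, ~ lm(X ~ Y + Z, data = .x)),    # one lm per group
               augmented = map(fit, augment),                  # fitted values and residuals
               predicted = map2(fit, data, predict)) %>%       # predictions on the training rows
        unnest(augmented) %>% select(-data)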

This works perfectly with the Train data. I can get fitted values, model summaries, etc. by using different broom functions such as glance or augment. Each group has a model of its own, the way I wanted.
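As a rough illustration of the per-group summaries described above, here is a minimal sketch. The asker's Train data is not shown, so the built-in Orange dataset stands in for it, with Tree playing the role of Groups; the names per_tree, per_tree_summary and per_tree_coefs are made up for this example.

```r
library(tidyverse)
library(broom)

# Assumption: Orange stands in for the asker's Train data, Tree for Groups
per_tree <- as_tibble(Orange) %>%
  nest(data = -Tree) %>%
  mutate(
    fit     = map(data, ~ lm(age ~ circumference, data = .x)),  # one model per tree
    glanced = map(fit, glance),                                 # one-row model summary
    tidied  = map(fit, tidy)                                    # coefficient table
  )

# One row of fit statistics (r.squared, sigma, ...) per tree
per_tree_summary <- per_tree %>% select(Tree, glanced) %>% unnest(glanced)

# Two coefficient rows (intercept and slope) per tree
per_tree_coefs <- per_tree %>% select(Tree, tidied) %>% unnest(tidied)
```

Keeping the fitted models in a list column next to the grouping variable is what makes it possible to reuse them later on new data.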

The challenge is when I want to use this model on the test data.

It seems straightforward, but somehow the solution eludes me :(

When you fit to nested data like that, you end up with many models, not just one, so you will also need to set yourself up to predict on many models.

library(tidyverse)
library(broom)

data(Orange)

Orange <- as_tibble(Orange)

orange_fit <- Orange %>% 
  nest(data = c(-Tree)) %>%    ## this sets up five separate models
  mutate(
    fit = map(data, ~ lm(age ~ circumference, data = .x))
  ) 

## the "test data" here is `circumference = c(50, 100, 150)`
orange_fit %>%
  select(Tree, fit) %>%
  crossing(circumference = c(50, 100, 150)) %>%
  mutate(new_data = map(circumference, ~tibble(circumference = .)),
         predicted_age = map2_dbl(fit, new_data, predict))
#> # A tibble: 15 x 5
#>    Tree  fit    circumference new_data         predicted_age
#>    <ord> <list>         <dbl> <list>                   <dbl>
#>  1 3     <lm>              50 <tibble [1 × 1]>          392.
#>  2 3     <lm>             100 <tibble [1 × 1]>          994.
#>  3 3     <lm>             150 <tibble [1 × 1]>         1596.
#>  4 1     <lm>              50 <tibble [1 × 1]>          331.
#>  5 1     <lm>             100 <tibble [1 × 1]>          927.
#>  6 1     <lm>             150 <tibble [1 × 1]>         1523.
#>  7 5     <lm>              50 <tibble [1 × 1]>          385.
#>  8 5     <lm>             100 <tibble [1 × 1]>          824.
#>  9 5     <lm>             150 <tibble [1 × 1]>         1264.
#> 10 2     <lm>              50 <tibble [1 × 1]>          257.
#> 11 2     <lm>             100 <tibble [1 × 1]>          647.
#> 12 2     <lm>             150 <tibble [1 × 1]>         1037.
#> 13 4     <lm>              50 <tibble [1 × 1]>          282.
#> 14 4     <lm>             100 <tibble [1 × 1]>          640.
#> 15 4     <lm>             150 <tibble [1 × 1]>          999.

Created on 2021-01-25 by the reprex package (v0.3.0)

Notice at the end we have a prediction for each point in the test set (3) for each model (5).
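If the test data is itself grouped, so that each test row should only be scored by its own group's model, one possible variation is to nest the test set the same way and join it to the fitted models. The sketch below is not part of the answer above. It assumes a hypothetical train/test split of Orange (the last two measurements of each tree are held out), and the names orange_train, orange_test, train_fits and test_predictions are invented for illustration. It relies on the newdata argument of broom's augment() for lm objects.

```r
library(tidyverse)
library(broom)

orange <- as_tibble(Orange)

# Hypothetical split: hold out the last two measurements of each tree as test data
orange_test  <- orange %>% group_by(Tree) %>% slice_tail(n = 2) %>% ungroup()
orange_train <- anti_join(orange, orange_test,
                          by = c("Tree", "age", "circumference"))

# One model per tree, fitted on the training rows only
train_fits <- orange_train %>%
  nest(data = -Tree) %>%
  mutate(fit = map(data, ~ lm(age ~ circumference, data = .x))) %>%
  select(Tree, fit)

# Nest the test rows the same way, match each group to its model, then predict
test_predictions <- orange_test %>%
  nest(test_data = -Tree) %>%
  inner_join(train_fits, by = "Tree") %>%
  mutate(augmented = map2(fit, test_data, ~ augment(.x, newdata = .y))) %>%
  select(Tree, augmented) %>%
  unnest(augmented)
```

The result has one row per test observation, with the prediction from that tree's own model in the .fitted column. Using inner_join means any test group without a trained model is silently dropped, which may or may not be what you want.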

