How to use nest and mutate to create a model from a training set and then apply it to test data in R (tidymodels)
library(tidymodels)

Train %>%
  nest(data = -Groups) %>%
  mutate(
    fit = map(data, ~ lm(X ~ Y + Z, data = .x)),
    augmented = map(fit, augment),
    predict = map2(fit, Y, Z)
  ) %>%
  unnest(augmented) %>%
  select(-data)
This works perfectly with the Train data. I can get fitted values, model summaries, etc. by using broom functions such as glance or augment, and each group has a model of its own, the way I wanted.
The challenge comes when I want to use these models on the test data.
It seems straightforward, but somehow the solution eludes me :(
When you fit to nested data like that, you end up with many models, not just one, so you also need to set yourself up to predict with many models.
library(tidyverse)
library(broom)

data(Orange)
Orange <- as_tibble(Orange)

orange_fit <- Orange %>%
  nest(data = c(-Tree)) %>% ## this sets up five separate models
  mutate(
    fit = map(data, ~ lm(age ~ circumference, data = .x))
  )

## the "test data" here is `circumference = c(50, 100, 150)`
orange_fit %>%
  select(Tree, fit) %>%
  crossing(circumference = c(50, 100, 150)) %>%
  mutate(
    new_data = map(circumference, ~ tibble(circumference = .)),
    predicted_age = map2_dbl(fit, new_data, predict)
  )
#> # A tibble: 15 x 5
#>    Tree  fit    circumference new_data         predicted_age
#>    <ord> <list>         <dbl> <list>                   <dbl>
#>  1 3     <lm>              50 <tibble [1 × 1]>          392.
#>  2 3     <lm>             100 <tibble [1 × 1]>          994.
#>  3 3     <lm>             150 <tibble [1 × 1]>         1596.
#>  4 1     <lm>              50 <tibble [1 × 1]>          331.
#>  5 1     <lm>             100 <tibble [1 × 1]>          927.
#>  6 1     <lm>             150 <tibble [1 × 1]>         1523.
#>  7 5     <lm>              50 <tibble [1 × 1]>          385.
#>  8 5     <lm>             100 <tibble [1 × 1]>          824.
#>  9 5     <lm>             150 <tibble [1 × 1]>         1264.
#> 10 2     <lm>              50 <tibble [1 × 1]>          257.
#> 11 2     <lm>             100 <tibble [1 × 1]>          647.
#> 12 2     <lm>             150 <tibble [1 × 1]>         1037.
#> 13 4     <lm>              50 <tibble [1 × 1]>          282.
#> 14 4     <lm>             100 <tibble [1 × 1]>          640.
#> 15 4     <lm>             150 <tibble [1 × 1]>          999.
Created on 2021-01-25 by the reprex package (v0.3.0)
Notice that at the end we have a prediction for each point in the test set (3) from each model (5), giving 15 rows in total.
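If the test data has its own grouping column, as in the question, each group's test rows should probably go only to that group's model rather than to every model. Here is a minimal sketch of that idea, again with `Orange`. The train/test split (first five measurements per tree for training, last two for testing) and the names `orange_train`, `orange_test`, `train_fits` and `test_predictions` are assumptions made up for illustration, not part of the original question.

```r
library(dplyr)
library(tidyr)
library(purrr)

data(Orange)
orange <- as_tibble(Orange) %>%
  mutate(Tree = as.character(Tree))  # plain character key keeps the join simple

# Hypothetical split: first five rows per tree for training, last two for testing
orange_train <- orange %>% group_by(Tree) %>% slice_head(n = 5) %>% ungroup()
orange_test  <- orange %>% group_by(Tree) %>% slice_tail(n = 2) %>% ungroup()

# One model per tree, fitted on the training rows only
train_fits <- orange_train %>%
  nest(data = -Tree) %>%
  mutate(fit = map(data, ~ lm(age ~ circumference, data = .x))) %>%
  select(Tree, fit)

# Nest the test rows the same way, attach each tree's model, then predict
test_predictions <- orange_test %>%
  nest(test_data = -Tree) %>%
  inner_join(train_fits, by = "Tree") %>%
  mutate(predicted_age = map2(fit, test_data, ~ predict(.x, newdata = .y))) %>%
  select(Tree, test_data, predicted_age) %>%
  unnest(c(test_data, predicted_age))

test_predictions
```

The `inner_join()` silently drops any test group that has no fitted model, so it may be worth checking for unmatched groups first. If you prefer to stay with broom, replacing the `predict()` call with `augment(.x, newdata = .y)` should give the test rows back with a `.fitted` column instead.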