Using dplyr rowwise to create multiple linear models

Considering this post: https://www.tidyverse.org/blog/2020/06/dplyr-1-0-0/

I was trying to create multiple models for a data set, using multiple formulas. The example in the post says:

library(dplyr, warn.conflicts = FALSE)

models <- tibble::tribble(
  ~model_name,    ~ formula,
  "length-width", Sepal.Length ~ Petal.Width + Petal.Length,
  "interaction",  Sepal.Length ~ Petal.Width * Petal.Length
)

iris %>% 
  nest_by(Species) %>% 
  left_join(models, by = character()) %>% 
  rowwise(Species, model_name) %>% 
  mutate(model = list(lm(formula, data = data))) %>% 
  summarise(broom::glance(model))

You can see that the rowwise() function is used to get the answer, but when I don't use it, I still get the correct answer:

iris %>%
  nest_by(Species) %>% 
  left_join(models, by = character()) %>% 
  mutate(model = list(lm(formula, data = data))) %>% 
  summarise(broom::tidy(model))

I only lose the "model_name" column. But considering that the rowwise documentation says this function is for computing row by row, I don't understand why the models are still computed this way without it. Why does this happen?

Thanks in advance.

Considering https://cran.r-project.org/web/packages/dplyr/vignettes/rowwise.html

You can optionally supply "identifier" variables in your call to rowwise(). These variables are preserved when you call summarise(), so they behave somewhat similarly to the grouping variables passed to group_by().

I didn't understand how identifiers work. As far as I can tell, these "identifiers" (Species, model_name) don't affect how a value is computed, only which columns are kept when your tibble is presented.
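A minimal sketch of that point, using a small made-up tibble (the `name`, `x`, `y` columns and the `total` summary are hypothetical, chosen only for illustration). The per-row computation is the same with or without an identifier; the identifier only decides whether `name` survives summarise():

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical toy data, not from the original question
df <- tibble(name = c("a", "b"), x = c(1, 2), y = c(3, 4))

# No identifier: summarise() keeps only the computed column
no_id <- df %>%
  rowwise() %>%
  summarise(total = sum(c(x, y)))

# With an identifier: `name` is carried through to the result
with_id <- df %>%
  rowwise(name) %>%
  summarise(total = sum(c(x, y)))

names(no_id)
names(with_id)

# The computed values are identical in both cases
identical(no_id$total, with_id$total)
```

Comparing `names(no_id)` with `names(with_id)` should show that the only difference is the extra `name` column.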

So if you already have a rowwise tibble, such as the one created by nest_by(), you don't need to call rowwise() again to compute by row. In my example, rowwise(Species, model_name) only gives you an extra column of information (model_name); the linear models are still the same. It is just a more "elegant" way to present the result, and it doesn't change how it is computed.

Thanks to tmfmnk.
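One way to check this for yourself is to inspect the object that nest_by() returns. The sketch below assumes dplyr 1.0.0 or later; the variable name `nested` is only for illustration:

```r
library(dplyr, warn.conflicts = FALSE)

nested <- iris %>% nest_by(Species)

# nest_by() already returns a rowwise data frame
inherits(nested, "rowwise_df")

# Species is its only identifier, which is why model_name is dropped
# by summarise() unless rowwise(Species, model_name) is called again
group_vars(nested)
```

Because the tibble is already rowwise, mutate() evaluates `lm(formula, data = data)` one row at a time either way.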
