
Filtering through a list using purrr + map

I have a data frame on which I run multiple GLMs, using the whole data set.

  library(tidyverse)
  library(broom)  # provides tidy(); not attached by library(tidyverse)

  df <- data.frame(Var1 = sample(as.factor(0:1), size = 1000, replace = TRUE),
                   Var2 = runif(100),
                   Var3 = runif(100),
                   Var4 = runif(100),
                   Var5 = sample(as.factor(0:1), size = 1000, replace = TRUE),
                   Var6 = sample(as.factor(0:1), size = 1000, replace = TRUE))

  # one logistic regression per predictor (Var3, Var4), fitted on the full data
  df %>%
    pivot_longer(cols = c("Var3", "Var4")) %>%
    group_by(name) %>%
    nest() %>%
    mutate(model = map(data, ~ glm(Var1 ~ value, data = .x, family = binomial("logit")))) %>%
    mutate(tidy = map(model, tidy)) %>%
    unnest(tidy)

Now I would like to use Var5 and Var6 to filter my data set.

I would like a GLM for each of the 4 possible subsets (2 levels of Var5 × 2 levels of Var6).

With cross() I can get all combinations of the values of Var5 and Var6.

list <-  df %>% 
  expand(Var5,Var6) %>%
  cross()
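
As a side note on the step above: `tidyr::expand()` on its own should already return one row per combination of the factor levels, so the extra `cross()` call may not be needed (and `cross()` is deprecated in recent purrr releases). A minimal sketch, assuming tidyr is installed; `toy` and `combos` are hypothetical names used only for illustration:

```r
library(tidyr)

# Small hypothetical data frame with the same two factor columns as in the question
toy <- data.frame(Var5 = factor(c(0, 1, 1, 0)),
                  Var6 = factor(c(0, 0, 1, 1)))

# One row per combination of the levels of Var5 and Var6
combos <- expand(toy, Var5, Var6)
combos
```

Naming the result something other than `list` also avoids masking the base R function of that name.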

Now I would like to filter my data frame using the list, so that I run a GLM for each of the 4 possible subsets.

For example, done manually:

  df %>%
    filter(Var5 == 1 & Var6 == 1) %>%
    pivot_longer(cols = c("Var3", "Var4")) %>%
    group_by(name) %>%
    nest() %>%
    mutate(model = map(data, ~ glm(Var1 ~ value, data = .x, family = binomial("logit")))) %>%
    mutate(tidy = map(model, tidy)) %>%
    unnest(tidy)

  df %>%
    filter(Var5 == 1 & Var6 == 0) %>%
    pivot_longer(cols = c("Var3", "Var4")) %>%
    group_by(name) %>%
    nest() %>%
    mutate(model = map(data, ~ glm(Var1 ~ value, data = .x, family = binomial("logit")))) %>%
    mutate(tidy = map(model, tidy)) %>%
    unnest(tidy)

etc.

I appreciate any advice you can give me on achieving this.

Below is an approach using {dplyr}'s rowwise notation (instead of purrr::map).

library(tidyverse)

df <- data.frame(Var1 = sample(as.factor(0:1), replace = TRUE, 1000),
                 Var2 = runif(100),
                 Var3 = runif(100),
                 Var4 = runif(100),
                 Var5 = sample(as.factor(0:1), replace = TRUE, 1000),
                 Var6 = sample(as.factor(0:1), replace = TRUE, 1000))

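df %>%
  pivot_longer(cols = c("Var3","Var4")) %>%
  # one row per predictor, with its data in a list column
  nest_by(name) %>% 
  # repeat each row for every Var5 x Var6 combination
  crossing(Var5 = c(0,1), Var6 = c(0,1)) %>% 
  rowwise() %>% 
  # subset the nested data to the current combination, then fit the GLM on it;
  # .env$Var5 / .env$Var6 pick up the outer columns rather than those inside `data`
  mutate(data =  list(filter(data, Var5 == .env$Var5 & Var6 == .env$Var6)),
         model = list(glm(Var1 ~ value, data = data, family = binomial("logit")))) %>%
  mutate(tidy = list(broom::tidy(model))) %>%
  unnest(tidy)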
#> # A tibble: 16 x 10
#>    name  data       Var5  Var6 model term   estimate std.error statistic p.value
#>    <chr> <list>    <dbl> <dbl> <lis> <chr>     <dbl>     <dbl>     <dbl>   <dbl>
#>  1 Var3  <tibble ~     0     0 <glm> (Inte~  0.304       0.265    1.14     0.253
#>  2 Var3  <tibble ~     0     0 <glm> value  -0.321       0.432   -0.743    0.457
#>  3 Var3  <tibble ~     0     1 <glm> (Inte~  0.201       0.267    0.751    0.452
#>  4 Var3  <tibble ~     0     1 <glm> value  -0.357       0.436   -0.819    0.413
#>  5 Var3  <tibble ~     1     0 <glm> (Inte~ -0.197       0.252   -0.779    0.436
#>  6 Var3  <tibble ~     1     0 <glm> value   0.362       0.420    0.863    0.388
#>  7 Var3  <tibble ~     1     1 <glm> (Inte~ -0.428       0.261   -1.64     0.101
#>  8 Var3  <tibble ~     1     1 <glm> value   0.668       0.417    1.60     0.109
#>  9 Var4  <tibble ~     0     0 <glm> (Inte~  0.212       0.234    0.906    0.365
#> 10 Var4  <tibble ~     0     0 <glm> value  -0.166       0.403   -0.413    0.680
#> 11 Var4  <tibble ~     0     1 <glm> (Inte~ -0.00765     0.211   -0.0363   0.971
#> 12 Var4  <tibble ~     0     1 <glm> value   0.0328      0.357    0.0919   0.927
#> 13 Var4  <tibble ~     1     0 <glm> (Inte~  0.142       0.219    0.649    0.516
#> 14 Var4  <tibble ~     1     0 <glm> value  -0.316       0.378   -0.835    0.404
#> 15 Var4  <tibble ~     1     1 <glm> (Inte~  0.0682      0.222    0.307    0.759
#> 16 Var4  <tibble ~     1     1 <glm> value  -0.280       0.385   -0.728    0.467

Created on 2021-01-27 by the reprex package (v0.3.0)
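
Since the question asks about purrr, here is a sketch of a `map()`-based variant for comparison. It is not part of the answer above: it assumes that grouping by `name`, `Var5` and `Var6` before `nest()` yields one nested data set per combination, so no explicit filtering list is needed. The seed and the `result` name are arbitrary choices for illustration, and the predictors are drawn with `runif(1000)` rather than recycled from 100 values.

```r
library(tidyverse)

set.seed(1)  # arbitrary seed, only so the sketch is repeatable
df <- data.frame(Var1 = sample(as.factor(0:1), size = 1000, replace = TRUE),
                 Var3 = runif(1000),
                 Var4 = runif(1000),
                 Var5 = sample(as.factor(0:1), size = 1000, replace = TRUE),
                 Var6 = sample(as.factor(0:1), size = 1000, replace = TRUE))

result <- df %>%
  pivot_longer(cols = c("Var3", "Var4")) %>%
  # one group per predictor and per Var5 x Var6 combination
  group_by(name, Var5, Var6) %>%
  nest() %>%
  # fit one logistic regression per nested subset, then tidy each fit
  mutate(model = map(data, ~ glm(Var1 ~ value, data = .x, family = binomial("logit"))),
         tidy  = map(model, broom::tidy)) %>%
  unnest(tidy) %>%
  ungroup()

result
```

With two predictors, four combinations and two coefficients per model, `result` should have 16 rows, matching the shape of the rowwise output. Combinations that never occur in the data simply produce no group, whereas the `crossing()` approach would try to fit a model on an empty subset.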
