purrr::map_dfr binds by columns, not by rows as expected

Question

I'm new to tidyverse and thus still struggling a bit to make it do things I knew how to do with base R.

The issue: I want to loop through the columns of a data frame, pass each of them separately to an lm call, and get the output as a tidy data frame. I don't care about the intercept, so all I want to save into the tidy output are the coefficients for the independent variable. I want the final output to look as follows: a data frame where the columns are the coefficients and the rows are the variables from the original data frame. I can do it in base R using do.call("rbind", ...), but as I'm migrating to tidyverse, I wanted to see if there's a way to do it there. purrr::map_dfr doesn't work in this case; it's a known issue.

Some reproducible code:

> library(tidyverse)
> 
> set.seed(62442)
> 
> iv <- rnorm(100)
> dvs <- as_tibble(replicate(5, iv + rnorm(100)), .name_repair = "universal")
New names:
* `` -> ...1
* `` -> ...2
* `` -> ...3
* `` -> ...4
* `` -> ...5
> 
> # This doesn't work
> dvs %>% map_dfr(~ summary(lm(.x ~ iv))$coefficients[2, ]) 
# A tibble: 4 x 5
      ...1     ...2     ...3     ...4     ...5
     <dbl>    <dbl>    <dbl>    <dbl>    <dbl>
1 8.78e- 1 1.09e+ 0 9.11e- 1 1.19e+ 0 8.80e- 1
2 1.05e- 1 1.17e- 1 9.86e- 2 9.33e- 2 1.16e- 1
3 8.34e+ 0 9.29e+ 0 9.24e+ 0 1.27e+ 1 7.60e+ 0
4 4.78e-13 4.16e-15 5.40e-15 1.97e-22 1.80e-11
> 
> # It behaves exactly like:
> dvs %>% map_dfc(~ summary(lm(.x ~ iv))$coefficients[2, ])
# A tibble: 4 x 5
      ...1     ...2     ...3     ...4     ...5
     <dbl>    <dbl>    <dbl>    <dbl>    <dbl>
1 8.78e- 1 1.09e+ 0 9.11e- 1 1.19e+ 0 8.80e- 1
2 1.05e- 1 1.17e- 1 9.86e- 2 9.33e- 2 1.16e- 1
3 8.34e+ 0 9.29e+ 0 9.24e+ 0 1.27e+ 1 7.60e+ 0
4 4.78e-13 4.16e-15 5.40e-15 1.97e-22 1.80e-11
> 
> # All that is left for me to do is:
> res <- dvs %>% map(~ summary(lm(.x ~ iv))$coefficients[2, ])
> do.call("rbind", res)
      Estimate Std. Error   t value                       Pr(>|t|)
...1 0.8776895 0.10525549  8.338658 0.0000000000004779501411861117
...2 1.0911362 0.11742588  9.292127 0.0000000000000041631074216992
...3 0.9113473 0.09863111  9.239958 0.0000000000000054021858298938
...4 1.1852848 0.09330950 12.702724 0.0000000000000000000001970469
...5 0.8799633 0.11579113  7.599575 0.0000000000179548788283525966

Answer 1

Row-binding with map_dfr works when each element is a data.frame/tibble or a list. Here, each element is a named vector. One option is to convert it to a list with as.list:

library(dplyr)
library(purrr)
dvs %>% 
    map_dfr(~ summary(lm(.x ~ iv))$coefficients[2, ] %>% as.list)
# A tibble: 5 x 4
#  Estimate `Std. Error` `t value` `Pr(>|t|)`
#*    <dbl>        <dbl>     <dbl>      <dbl>
#1    0.878       0.105       8.34   4.78e-13
#2    1.09        0.117       9.29   4.16e-15
#3    0.911       0.0986      9.24   5.40e-15
#4    1.19        0.0933     12.7    1.97e-22
#5    0.880       0.116       7.60   1.80e-11
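
The output above loses the names of the original columns. As a minimal sketch of how they could be kept, the `.id` argument of map_dfr can store them in an extra column. The column name `variable` is an arbitrary choice, and a plain data frame (columns `V1` to `V5`) stands in for the question's tibble; both are assumptions made for illustration. map_dfr also needs dplyr to be installed.

```r
library(purrr)

set.seed(62442)
iv  <- rnorm(100)
# Assumption: a plain data frame replaces the question's tibble,
# so the columns are named V1..V5 instead of ...1 to ...5.
dvs <- as.data.frame(replicate(5, iv + rnorm(100)))

# coefficients[2, ] is a named numeric vector, not a list or data frame
slope <- summary(lm(dvs$V1 ~ iv))$coefficients[2, ]
is.list(slope)   # FALSE
names(slope)

# as.list() turns each vector into a one-row record; .id stores the
# name of the source column in a new column called "variable"
res <- map_dfr(
  dvs,
  ~ as.list(summary(lm(.x ~ iv))$coefficients[2, ]),
  .id = "variable"
)
res
```

The result should have one row per column of `dvs`, with the column name in `variable` followed by the four coefficient statistics.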

Answer 2

With the addition of broom, you can try:

library(broom)
map_dfr(.x = dvs, ~ tidy(lm(.x ~ iv)), .id = "ID")

   ID    term          estimate std.error statistic   p.value
   <chr> <chr>            <dbl>     <dbl>     <dbl>     <dbl>
 1 ...1  (Intercept) -0.260        0.0999 -2.61      1.05e- 2
 2 ...1  iv           0.878        0.105   8.34      4.78e-13
 3 ...2  (Intercept) -0.0000159    0.111  -0.000142 10.00e- 1
 4 ...2  iv           1.09         0.117   9.29      4.16e-15
 5 ...3  (Intercept) -0.0383       0.0936 -0.410     6.83e- 1
 6 ...3  iv           0.911        0.0986  9.24      5.40e-15
 7 ...4  (Intercept) -0.131        0.0885 -1.48      1.41e- 1
 8 ...4  iv           1.19         0.0933 12.7       1.97e-22
 9 ...5  (Intercept) -0.0132       0.110  -0.120     9.05e- 1
10 ...5  iv           0.880        0.116   7.60      1.80e-11

And if you don't need the intercept, with the addition of dplyr:

library(dplyr)
map_dfr(.x = dvs, ~ tidy(lm(.x ~ iv)), .id = "ID") %>%
 filter(term != "(Intercept)")

  ID    term  estimate std.error statistic  p.value
  <chr> <chr>    <dbl>     <dbl>     <dbl>    <dbl>
1 ...1  iv       0.878    0.105       8.34 4.78e-13
2 ...2  iv       1.09     0.117       9.29 4.16e-15
3 ...3  iv       0.911    0.0986      9.24 5.40e-15
4 ...4  iv       1.19     0.0933     12.7  1.97e-22
5 ...5  iv       0.880    0.116       7.60 1.80e-11
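
For readers on purrr 1.0.0 or later, where map_dfr is superseded, a similar result could probably be obtained with map() followed by list_rbind(). The sketch below is not from either answer: `slope_row` is a hypothetical helper, the data frame with columns `V1` to `V5` is a stand-in for the question's tibble, and base R is used instead of broom to keep the example small. list_rbind() expects each element to be a data frame, so the named vector is converted first.

```r
library(purrr)  # list_rbind() assumes purrr >= 1.0.0

set.seed(62442)
iv  <- rnorm(100)
dvs <- as.data.frame(replicate(5, iv + rnorm(100)))  # columns V1..V5

# Hypothetical helper: the slope row of one regression,
# returned as a one-row data frame with the original statistic names
slope_row <- function(y) {
  coefs <- summary(lm(y ~ iv))$coefficients[2, ]
  as.data.frame(as.list(coefs), check.names = FALSE)
}

# map() returns a named list of one-row data frames;
# list_rbind() stacks them and writes the list names into "ID"
res <- list_rbind(map(dvs, slope_row), names_to = "ID")
res
```

Because only the second coefficient row is extracted, no separate step is needed to drop the intercept.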

Solution 1
4 2020-02-25 21:03:56

Solution 2
2 accepted 2020-02-25 21:03:44