R: How to summarize and group by variables as column names

I have a wide data frame with about 200 columns and want to summarize it over various columns. I cannot figure out the syntax for this. I think it should work with .data$ and .env$, but I don't understand how. Here's an example:

> library(dplyr)
> df = data.frame('A'= c('X','X','X','Y','Y'), 'B'= 1:5, 'C' = 6:10)
> df
  A B  C
1 X 1  6
2 X 2  7
3 X 3  8
4 Y 4  9
5 Y 5 10
> df %>% group_by(A) %>% summarise(sum(B), sum(C))
`summarise()` ungrouping output (override with `.groups` argument)
# A tibble: 2 x 3
  A     `sum(B)` `sum(C)`
  <chr>    <int>    <int>
1 X            6       21
2 Y            9       19

But I want to be able to do something like this:

columns_to_sum = c('B','C')
columns_to_group = c('A')
df %>% group_by(columns_to_group) %>% summarise(sum(columns_to_sum))

We can use across() from dplyr 1.0.0 or later:

library(dplyr)
df %>%
    group_by(across(all_of(columns_to_group))) %>%
    summarise(across(all_of(columns_to_sum), ~ sum(.x, na.rm = TRUE)), .groups = 'drop')
# A tibble: 2 x 3
#  A         B     C
#  <chr> <int> <int>
#1 X         6    21
#2 Y         9    19

In earlier versions of dplyr, we could use group_by_at() along with summarise_at() (both are now superseded by across()):

df %>%
    group_by_at(columns_to_group) %>%
    summarise_at(columns_to_sum, sum, na.rm = TRUE)
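
With about 200 columns, it may be convenient to wrap the across() pattern in a small function so the grouping and summing columns can be passed in as character vectors. The following is only a sketch: sum_by() is a hypothetical helper name (not part of dplyr), and it assumes dplyr 1.0.0 or later and that all columns to be summed are numeric.

```r
library(dplyr)

# Hypothetical helper (not a dplyr function): sums the columns named in
# `sum_cols` within the groups defined by the columns named in `group_cols`.
# Both arguments are character vectors of column names.
sum_by <- function(data, group_cols, sum_cols) {
  data %>%
    # all_of() fails with an error if a requested column is missing
    group_by(across(all_of(group_cols))) %>%
    summarise(across(all_of(sum_cols), ~ sum(.x, na.rm = TRUE)),
              .groups = "drop")
}

# Same example data as in the question
df <- data.frame(A = c("X", "X", "X", "Y", "Y"), B = 1:5, C = 6:10)

columns_to_sum <- c("B", "C")
columns_to_group <- c("A")

result <- sum_by(df, columns_to_group, columns_to_sum)
print(result)
```

Because both arguments go through all_of(), a misspelled column name should raise an error rather than being silently ignored, which helps when the vectors of names are built programmatically.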
