Aggregating all numeric variables of a list of data frames by all character variables except one
I have a list of data frames/tibbles like the following:
library(tidyverse)
l <- list(
  capacity = tribble(
    ~plant, ~month,   ~max_capacity, ~min_capacity,
    "A",    "202001", 3000.0,        5000.0,
    "A",    "202002", 2000.0,        4500.0,
    "B",    "202001", 5000.0,        8000.0
  ),
  demand = tribble(
    ~region, ~month,   ~demand,
    "1",     "202001", 234.3,
    "1",     "202002", 159.9,
    "2",     "202001", 488
  )
)
How can I summarise every data frame in the list by summing all numeric variables, grouped by all character variables except "month"?
# desired result, but not computed dynamically
l$capacity %>%
  group_by(plant) %>% # group by all character variables except "month"
  # summarise all numeric variables
  summarise(max_capacity = sum(max_capacity), min_capacity = sum(min_capacity)) %>%
  ungroup()
l$demand %>%
  group_by(region) %>%
  summarise(demand = sum(demand)) %>%
  ungroup()
We can use Filter and setdiff to find the columns to group by, pass them to group_by_at (which accepts column names as strings), and sum the numeric columns with summarise_if.
library(dplyr)
purrr::map(l, ~{
  # character columns other than 'month' become the grouping columns
  cols <- setdiff(names(Filter(is.character, .x)), 'month')
  .x %>% group_by_at(cols) %>% summarise_if(is.numeric, sum)
})
#$capacity
# A tibble: 2 x 3
# plant max_capacity min_capacity
# <chr> <dbl> <dbl>
#1 A 5000 9500
#2 B 5000 8000
#$demand
# A tibble: 2 x 2
# region demand
# <chr> <dbl>
#1 1 394.
#2 2 488
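To see what the grouping step selects, the Filter/setdiff expression can be run on its own. The following is a minimal base-R sketch; the helper name group_cols and the plain data.frame stand-ins for the tibbles are assumptions made for illustration, not part of the original answer.

```r
# Plain data.frame stand-ins for the two tibbles (illustrative assumption)
l <- list(
  capacity = data.frame(
    plant = c("A", "A", "B"),
    month = c("202001", "202002", "202001"),
    max_capacity = c(3000, 2000, 5000),
    min_capacity = c(5000, 4500, 8000),
    stringsAsFactors = FALSE
  ),
  demand = data.frame(
    region = c("1", "1", "2"),
    month = c("202001", "202002", "202001"),
    demand = c(234.3, 159.9, 488),
    stringsAsFactors = FALSE
  )
)

# Hypothetical helper: names of character columns, minus the excluded ones
group_cols <- function(df, exclude = "month") {
  setdiff(names(Filter(is.character, df)), exclude)
}

cols <- lapply(l, group_cols)
print(cols)
```

For this data the helper returns "plant" for capacity and "region" for demand, which are the columns handed to group_by_at.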
Note that in newer versions of dplyr (1.0.0 and later), scoped verbs such as summarise_if and group_by_at are superseded by across.
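A possible across-based equivalent is sketched below, assuming dplyr 1.0.0 or later. The selection where(is.character) & !any_of("month") and the .groups = "drop" argument are one way to express the same grouping and are not taken from the original answer.

```r
library(dplyr)
library(purrr)
library(tibble)

l <- list(
  capacity = tribble(
    ~plant, ~month,   ~max_capacity, ~min_capacity,
    "A",    "202001", 3000,          5000,
    "A",    "202002", 2000,          4500,
    "B",    "202001", 5000,          8000
  ),
  demand = tribble(
    ~region, ~month,   ~demand,
    "1",     "202001", 234.3,
    "1",     "202002", 159.9,
    "2",     "202001", 488
  )
)

res <- map(l, function(df) {
  df %>%
    # group by every character column except "month"
    group_by(across(where(is.character) & !any_of("month"))) %>%
    # sum every numeric column and return an ungrouped tibble
    summarise(across(where(is.numeric), sum), .groups = "drop")
})

print(res)
```

Because month is neither a grouping column nor numeric, it is dropped from the result, which matches the output of the group_by_at/summarise_if version.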