
summarise and then summarise_at in one dplyr chain?

I have a data frame of clustered data that I'm aggregating by cluster to produce summary data.

I would like to create a new column based on the cluster count n() and then compute the mean and sum over a list of variables:

# works fine
library(dplyr)
nums <- c("mpg", "disp", "cyl")
mtcars %>% group_by(carb) %>% summarise(cnt = n())

Looks like this:

# A tibble: 6 x 2
   carb   cnt
  <dbl> <int>
1     1     7
2     2    10
3     3     3
4     4    10
5     6     1
6     8     1

# does not work
nums <- c("mpg", "disp", "cyl")
mtcars %>% group_by(carb) %>% summarise(cnt = n()) %>%
  summarise_at(.vars = nums, funs(mean, sum))

It returns this error message:

> Error in summarise_impl(.data, dots) :    Evaluation error: object
> 'disp' not found. In addition: Warning message: In mean.default(mpg) :
> argument is not numeric or logical: returning NA
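
A likely reason for the failure: summarise() returns only the grouping column and the columns it creates, so after summarise(cnt = n()) the variables listed in nums are no longer in the data. The sketch below inspects that intermediate result; step1 is a hypothetical name used only for illustration.

```r
library(dplyr)

# Intermediate result after the first summarise: one row per carb value
step1 <- mtcars %>% group_by(carb) %>% summarise(cnt = n())

# Only the grouping column and the new column remain
names(step1)

# Columns that a later summarise_at() would look for but cannot find
setdiff(c("mpg", "disp", "cyl"), names(step1))
```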

The goal is to have the tbl above but with a new column cnt holding the count of observations in each group.

We can use mutate to create 'cnt' by 'carb', then add 'cnt' as a grouping variable as well before doing the summarise_at:

mtcars %>% 
   group_by(carb) %>% 
   mutate(cnt = n()) %>%
   group_by(cnt, add = TRUE) %>% 
   summarise_at(.vars = nums, funs(mean, sum))
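
In more recent dplyr releases, funs() and the add argument of group_by() are deprecated. As a rough sketch, and assuming dplyr 1.0.0 or later is installed, the count and the per-variable summaries can also be computed in a single summarise() call with across(). The name res is hypothetical, and this is an alternative to the approach above rather than a replacement for it.

```r
library(dplyr)

nums <- c("mpg", "disp", "cyl")

res <- mtcars %>%
  group_by(carb) %>%
  summarise(
    cnt = n(),                                          # observations per carb group
    across(all_of(nums), list(mean = mean, sum = sum))  # e.g. mpg_mean, mpg_sum, ...
  )

res
```

Selecting columns with all_of(nums) keeps cnt out of the across() call, so the count itself is not averaged or summed. The resulting columns are ordered by variable (mpg_mean, mpg_sum, ...) rather than by function as with summarise_at().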
