Ordering group_by, mutate, and summarize in R

Question

Data frame df has three columns: x, y, and n. I want to create a new data frame that groups by x, counts the number of observations of y in each group, and sums the values of n in each group.

df <- structure(list(x = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 5, 
5, 5), y = c(1, 2, 3, 4, 1, 2, 3, 1, 2, 3, 4, 1, 2, 1, 2, 3), 
n = c(4L, 3L, 2L, 3L, 2L, 4L, 2L, 2L, 3L, 3L, 2L, 5L, 3L, 
3L, 2L, 3L)), class = "data.frame", row.names = c(NA, -16L))

The target data frame looks like this, where a holds the 5 groups from the original df:

> print(df2, row.names=FALSE)
 a b  c
 1 4 12
 2 3  8
 3 4 10
 4 2  8
 5 3  8

For some reason I'm not combining the group_by, mutate, or summarize statements in the pipe in the right order to make this happen. It feels like there is a simple solution I'm not seeing right now. If anyone could help, I would appreciate it.

Answer 1

Here is a data.table option (it needs library(data.table) loaded first):

> setDT(df)[, .(b = .N, c = sum(n)), x]
   x b  c
1: 1 4 12
2: 2 3  8
3: 3 4 10
4: 4 2  8
5: 5 3  8
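Note that setDT() converts df to a data.table by reference, and the key column above keeps the name x rather than a. A minimal sketch of a variant that leaves df untouched and names the columns as in the target; the names dt and df2 are only illustrative:

```r
library(data.table)

df <- data.frame(
  x = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 5, 5, 5),
  y = c(1, 2, 3, 4, 1, 2, 3, 1, 2, 3, 4, 1, 2, 1, 2, 3),
  n = c(4L, 3L, 2L, 3L, 2L, 4L, 2L, 2L, 3L, 3L, 2L, 5L, 3L, 3L, 2L, 3L)
)

# as.data.table() makes a copy, so df itself stays a plain data.frame
dt <- as.data.table(df)

# .N counts the rows in each group; by = .(a = x) groups by x
# and names the grouping column a in the result
df2 <- dt[, .(b = .N, c = sum(n)), by = .(a = x)]
print(df2)
```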

Answer 2

Try this:

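library(dplyr)

# Group by x, then collapse each group to one row:
# n() counts the rows in the group, sum(n) totals the n column
new <- df %>%
  group_by(x) %>%
  summarise(b = n(), c = sum(n, na.rm = TRUE))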

Output:

# A tibble: 5 x 3
      x     b     c
  <dbl> <int> <int>
1     1     4    12
2     2     3     8
3     3     4    10
4     4     2     8
5     5     3     8
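On the ordering question itself: group_by has to come first, and summarise rather than mutate is the verb that produces one row per group. The sketch below contrasts the two on the question's data; per_row and df2 are illustrative names, and group_by(a = x) is used only to get the target's column name.

```r
library(dplyr)

df <- data.frame(
  x = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 5, 5, 5),
  y = c(1, 2, 3, 4, 1, 2, 3, 1, 2, 3, 4, 1, 2, 1, 2, 3),
  n = c(4L, 3L, 2L, 3L, 2L, 4L, 2L, 2L, 3L, 3L, 2L, 5L, 3L, 3L, 2L, 3L)
)

# mutate() keeps all 16 rows and repeats the group values on each row
per_row <- df %>%
  group_by(x) %>%
  mutate(b = n(), c = sum(n)) %>%
  ungroup()

# summarise() collapses each group to a single row,
# which is the shape the target data frame has
df2 <- df %>%
  group_by(a = x) %>%
  summarise(b = n(), c = sum(n))

nrow(per_row)  # 16
nrow(df2)      # 5
```

A mutate step is not needed for this result; it would only matter if the per-group values had to stay attached to the original rows.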

Answer 3

With base R, we can do:

# For each group g (a sub-data-frame of df), count its rows and sum its n column
do.call(rbind, by(df, df$x, FUN = function(g)
     data.frame(b = nrow(g), c = sum(g$n, na.rm = TRUE))))
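Another base R route, shown here only as a sketch, builds the three target columns directly with table() and tapply(); both return results ordered by the sorted group values, so the columns line up. The name df2 follows the question.

```r
df <- data.frame(
  x = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 5, 5, 5),
  y = c(1, 2, 3, 4, 1, 2, 3, 1, 2, 3, 4, 1, 2, 1, 2, 3),
  n = c(4L, 3L, 2L, 3L, 2L, 4L, 2L, 2L, 3L, 3L, 2L, 5L, 3L, 3L, 2L, 3L)
)

df2 <- data.frame(
  a = sort(unique(df$x)),                  # one entry per group
  b = as.vector(table(df$x)),              # number of rows per group
  c = as.vector(tapply(df$n, df$x, sum))   # sum of n per group
)
print(df2, row.names = FALSE)
```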

3 answers

Answer 1: 3, 2020-12-24 00:52:16
Answer 2: 2, accepted, 2020-12-24 00:46:57
Answer 3: 1, 2020-12-24 17:02:00
