R tibble: group by column A, keep only distinct values in columns B and C, and sum the values in column C

Question

I want to group by column A and then sum the values in column C over the distinct combinations of columns B and C. Is it possible to do this inside the summarise clause? I know it's possible by calling distinct() before the aggregation. What about something like the following?

Data:

library(dplyr)
df <- tibble(A = c(1,1,1,2,2), B = c('a','b','b','a','a'), C=c(5,10,10,15,15))

My attempt, which doesn't work:

df %>% 
group_by(A) %>% 
summarise(sumC=sum(distinct(B,C) %>% select(C)))

Desired output:

A sumC
1 15
2 15

Answer 1

You could use duplicated (here it checks B only, which is enough because each B value has a single C value within a group):

df %>%
    group_by(A) %>%
    summarise(sumC = sum(C[!duplicated(B)]))
## A tibble: 2 x 2
#      A  sumC
#  <dbl> <dbl>
#1     1    15
#2     2    15
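Note that `!duplicated(B)` looks at B alone. If the same B value could appear with different C values inside a group, deduplicating on the (B, C) pair may be closer to what the question asks. A minimal sketch, assuming dplyr is installed; `df2` is made-up data used only to show the difference:

```r
library(dplyr)

# Hypothetical data: in group A == 1, B == 'b' appears with two different C values
df2 <- tibble(A = c(1, 1, 1, 1, 2, 2),
              B = c('a', 'b', 'b', 'b', 'a', 'a'),
              C = c(5, 10, 10, 20, 15, 15))

# duplicated() on a data frame flags rows whose (B, C) pair was already seen
res_pair <- df2 %>%
    group_by(A) %>%
    summarise(sumC = sum(C[!duplicated(data.frame(B, C))]))

# Deduplicating on B only drops the (b, 20) row as well
res_b_only <- df2 %>%
    group_by(A) %>%
    summarise(sumC = sum(C[!duplicated(B)]))
```

With this data the pair-based version should give 35 for group 1 (5 + 10 + 20), while the B-only version gives 15; both give 15 for group 2.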

Or with distinct:

df %>%
    group_by(A) %>%
    distinct(B, C) %>%
    summarise(sumC = sum(C))
## A tibble: 2 x 2
#      A  sumC
#  <dbl> <dbl>
#1     1    15
#2     2    15
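To keep everything inside summarise, as the question asks, one option is to rebuild a small tibble from the current group's B and C vectors and pass that to distinct(). The attempt in the question most likely fails because distinct() expects a data frame as its first argument, whereas B inside summarise is a plain vector. A sketch under that assumption:

```r
library(dplyr)

df <- tibble(A = c(1, 1, 1, 2, 2),
             B = c('a', 'b', 'b', 'a', 'a'),
             C = c(5, 10, 10, 15, 15))

res <- df %>%
    group_by(A) %>%
    # tibble(B, C) holds the current group's rows;
    # distinct() drops repeated (B, C) pairs before C is summed
    summarise(sumC = sum(distinct(tibble(B, C))$C))
```

This should return the same two rows as the versions above (sumC of 15 for both groups), at the cost of building a temporary tibble per group.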

Answer 2

A different possibility could be:

df %>%
 group_by(A, B, C) %>%
 slice(1) %>%
 group_by(A) %>%
 summarise(sumC = sum(C))

      A  sumC
  <dbl> <dbl>
1     1    15
2     2    15

Or a twist on @Maurits Evers' answer:

df %>%
 distinct(A, B, C) %>%
 group_by(A) %>%
 summarise(sumC = sum(C))

2 answers

Answer 1: score 1, accepted, 2019-07-02 09:48:57

Answer 2: score 0, 2019-07-02 09:50:05
