r - Sum based on distinct values in another column

Question

I'm looking for a tidyverse solution to sum one column over the unique values of an ID column, while still summing other columns over all rows.

Example data:

dat <- data.frame(
  manager = c("Adam", "Adam", "Adam", "Bill", "Bill", "Charlie", "Dan"),
  manager_age = c(30, 30, 30, 33, 33, 35, 35),
  sales = c(4, 12, 7, 4, 2, 15, 10))
dat

  manager manager_age sales
1    Adam          30     4
2    Adam          30    12
3    Adam          30     7
4    Bill          33     4
5    Bill          33     2
6 Charlie          35    15
7     Dan          35    10

I want to sum all values of sales, but sum only one manager_age value per manager.

Desired output:

  unique_managers total_sales total_age
               4          54      133

I'm most of the way there, but need help with the summed age:

library(dplyr)
results <- dat %>%
  summarize(unique_managers = n_distinct(manager), total_sales = sum(sales))
results

Thanks in advance!

Edit: Updated the example data to include two managers with the same age.

Answer 1

This should do it for you:

library(dplyr)

results <- dat %>%
  # totals computed over all rows
  summarize(unique_managers = n_distinct(manager),
            total_sales = sum(sales)) %>%
  # bind on the age total, computed from one row per manager
  cbind(dat %>%
          select(manager, manager_age) %>%
          group_by(manager) %>%
          unique() %>%
          ungroup() %>%
          summarize(total_age = sum(manager_age)))

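Which gives us (these totals match the six-row, three-manager data shown in Answer 2, apparently the example data before the question was edited):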

> results
  unique_managers total_sales total_age
1               3          44        98

Edit:

If you have two managers with the same age:

dat <- data.frame(
  manager = c("Adam", "Adam", "Adam", "Bill", "Bill", "Charlie", "Dante"),
  manager_age = c(30, 30, 30, 33, 33, 35, 30),
  sales = c(4, 12, 7, 4, 2, 15, 14))

Gives us:

  unique_managers total_sales total_age
1               4          58       128
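
The same totals can also be computed in a single `summarize()` call. The following is only a sketch, not the answerer's method: it uses base R's `duplicated()` to keep the first row seen for each manager, and it assumes every manager has exactly one age across all of their rows.

```r
library(dplyr)

dat <- data.frame(
  manager = c("Adam", "Adam", "Adam", "Bill", "Bill", "Charlie", "Dante"),
  manager_age = c(30, 30, 30, 33, 33, 35, 30),
  sales = c(4, 12, 7, 4, 2, 15, 14))

results <- dat %>%
  summarize(unique_managers = n_distinct(manager),
            total_sales = sum(sales),
            # keep the age from the first row of each manager, then add them up
            # (assumption: a manager's age is the same on all of their rows)
            total_age = sum(manager_age[!duplicated(manager)]))
results
```

Because the filter is on `manager` rather than on `manager_age`, Adam and Dante both contribute their age of 30 to the total.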

Answer 2

Note: different people can have the same age, so take the unique age within each manager's group first, and only then sum across managers.

library(data.table)
dat <- data.frame(
    manager = c("Adam", "Adam", "Adam", "Bill", "Bill", "Charlie"),
    manager_age = c(30, 30, 30, 33, 33, 35),
    sales = c(4, 12, 7, 4, 2, 15))

setDT(dat)[,.(managers=.NGRP,
       sales = sum(sales),
       age=unique(manager_age)),
       by=manager][,.(unique_managers = unique(managers),
                      total_sales = sum(sales),
                      total_age = sum(age))]
#>    unique_managers total_sales total_age
#> 1:               3          44        98

Created on 2021-05-04 by the reprex package (v2.0.0)
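
To illustrate why the de-duplication has to happen per manager, here is a minimal base R sketch using the seven-row data from the question, where Charlie and Dan are both 35. The variable names `naive_total`, `per_manager` and `grouped_total` are made up for this illustration.

```r
dat <- data.frame(
  manager = c("Adam", "Adam", "Adam", "Bill", "Bill", "Charlie", "Dan"),
  manager_age = c(30, 30, 30, 33, 33, 35, 35),
  sales = c(4, 12, 7, 4, 2, 15, 10))

# De-duplicating the age column on its own merges Charlie and Dan (both 35)
naive_total <- sum(unique(dat$manager_age))

# De-duplicating (manager, age) pairs keeps one age per manager
per_manager <- unique(dat[, c("manager", "manager_age")])
grouped_total <- sum(per_manager$manager_age)

naive_total    # 98: one 35 is lost
grouped_total  # 133: matches the desired total_age in the question
```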

Answer 1
0 2021-05-04 15:31:14

Answer 2
0 2021-05-04 15:34:44
