
How to sum values for each unique group in R

In the dataset below, I want to identify the top 3 most time-consuming projects.

library(dplyr)
TransID <-c(1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1011,1014,1018,1022,1023,1024)
EmpID<-c('M001','M001','M001','M001','B005','B005','B005','B005','X101','X101','X101','Z101','K501','K501','K501','K501')
ProjectID <- c(200,200,200,200,500,500,500,500,950,950,950,950,1050,1050,1050,1050)
Site<-c('X','X','X','Y','Y','Y','Z','Z','Z','G','G','G','G','K','K','K')
Region <-c('NE','NW','SE','SW','MW','NW','SW','NE','NC','MW','NE','SE','SW','NC','SW','SE')
hour_difference<-c(1.45,2.14,2.53,3.69,1.73,2.47,3.63,1.59,0.75,1.18,2.78,9.55,1.85,2.39,5.52,0.23)

df = data.frame(TransID,EmpID,ProjectID,Site,Region,hour_difference)
df

Simply put:

  1. For each unique ProjectID, I want to sum hour_difference and sort the totals in descending order.

My attempt:

df %>%
  group_by(ProjectID,hour_difference) %>%
  summarize(sum().sort_values())

Desired output:

For example, ProjectID = 950 should have a sum of 14.26.

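It isn't clear whether you want descending order by ProjectID or by the sum of hour_difference, so here are both. In either case, group by ProjectID only; grouping by hour_difference as well would put almost every row in its own group.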

Sorted by sum(hour_difference):

df %>%
  group_by(ProjectID) %>%
  summarise(res = sum(hour_difference)) %>%
  arrange(desc(res))

  ProjectID   res
      <dbl> <dbl>
1       950 14.3 
2      1050  9.99
3       200  9.81
4       500  9.42
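
Since the question asks for the top 3 only, one possible way to trim the result is dplyr's slice_max(), which keeps the rows with the largest values of a column. The following is a minimal sketch that assumes dplyr 1.0.0 or later and rebuilds only the two columns needed from the question's data; the name top3 is just illustrative.

```r
library(dplyr)

# Same ProjectID and hour_difference values as in the question
df <- data.frame(
  ProjectID = rep(c(200, 500, 950, 1050), each = 4),
  hour_difference = c(1.45, 2.14, 2.53, 3.69, 1.73, 2.47, 3.63, 1.59,
                      0.75, 1.18, 2.78, 9.55, 1.85, 2.39, 5.52, 0.23)
)

top3 <- df %>%
  group_by(ProjectID) %>%                     # one group per project
  summarise(res = sum(hour_difference)) %>%   # total hours per project
  slice_max(res, n = 3)                       # keep the 3 largest totals

top3
```

With this data the three projects kept should be 950, 1050 and 200. Note that slice_max() keeps ties by default, so it can return more than 3 rows when totals are equal; passing with_ties = FALSE avoids that.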

Sorted by ProjectID:

df %>%
  group_by(ProjectID) %>%
  summarise(res = sum(hour_difference)) %>%
  arrange(desc(ProjectID))

  ProjectID   res
      <dbl> <dbl>
1      1050  9.99
2       950 14.3 
3       500  9.42
4       200  9.81
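
If you would rather not depend on dplyr, the same grouped sum can probably be done in base R with aggregate() and order(). This is only a sketch of an alternative, not part of the approach above; the name totals is made up for illustration, and the data frame again contains just the two relevant columns from the question.

```r
# Same ProjectID and hour_difference values as in the question
df <- data.frame(
  ProjectID = rep(c(200, 500, 950, 1050), each = 4),
  hour_difference = c(1.45, 2.14, 2.53, 3.69, 1.73, 2.47, 3.63, 1.59,
                      0.75, 1.18, 2.78, 9.55, 1.85, 2.39, 5.52, 0.23)
)

# Sum hour_difference within each ProjectID
totals <- aggregate(hour_difference ~ ProjectID, data = df, FUN = sum)

# Sort by the summed hours, largest first
totals <- totals[order(-totals$hour_difference), ]

# First 3 rows are the most time-consuming projects
head(totals, 3)
```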
