How to sum values for each unique group in R
In the dataset below, I want to identify the top 3 most time-consuming projects.
library(dplyr)
TransID <-c(1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1011,1014,1018,1022,1023,1024)
EmpID<-c('M001','M001','M001','M001','B005','B005','B005','B005','X101','X101','X101','Z101','K501','K501','K501','K501')
ProjectID <- c(200,200,200,200,500,500,500,500,950,950,950,950,1050,1050,1050,1050)
Site<-c('X','X','X','Y','Y','Y','Z','Z','Z','G','G','G','G','K','K','K')
Region <-c('NE','NW','SE','SW','MW','NW','SW','NE','NC','MW','NE','SE','SW','NC','SW','SE')
hour_difference<-c(1.45,2.14,2.53,3.69,1.73,2.47,3.63,1.59,0.75,1.18,2.78,9.55,1.85,2.39,5.52,0.23)
df = data.frame(TransID,EmpID,ProjectID,Site,Region,hour_difference)
df
Put simply, for each ProjectID I want to sum hour_difference and sort the totals in descending order. My attempt:
df %>%
group_by(ProjectID,hour_difference) %>%
summarize(sum().sort_values())
Expected output: for example, the total for ProjectID = 950 is 14.26.
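I'm not sure whether you want descending order by ProjectID or by the sum of hour_difference, but you can try either of the following.
Sorted by sum(hour_difference):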
df %>%
  group_by(ProjectID) %>%
  summarise(res = sum(hour_difference)) %>%
  arrange(desc(res))
  ProjectID   res
      <dbl> <dbl>
1       950 14.3
2      1050  9.99
3       200  9.81
4       500  9.42
Sorted by ProjectID:
df %>%
  group_by(ProjectID) %>%
  summarise(res = sum(hour_difference)) %>%
  arrange(desc(ProjectID))
  ProjectID   res
      <dbl> <dbl>
1      1050  9.99
2       950 14.3
3       500  9.42
4       200  9.81
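Since the question asks for only the top 3 projects, here is a minimal sketch that sums hour_difference per ProjectID and keeps the three largest totals with slice_max(). It assumes dplyr 1.0.0 or later, where slice_max() is available. The name top3 is only illustrative, and the data frame is rebuilt with just the two columns the summary needs so that the snippet runs on its own.

```r
library(dplyr)

# Rebuild only the two columns the summary needs.
# rep(..., each = 4) reproduces the ProjectID vector from the question.
df <- data.frame(
  ProjectID = rep(c(200, 500, 950, 1050), each = 4),
  hour_difference = c(1.45, 2.14, 2.53, 3.69, 1.73, 2.47, 3.63, 1.59,
                      0.75, 1.18, 2.78, 9.55, 1.85, 2.39, 5.52, 0.23)
)

# Total hours per project, then keep the 3 largest totals.
# slice_max() returns rows ordered from largest to smallest;
# with the default with_ties = TRUE it may return more than 3 rows on ties.
top3 <- df %>%
  group_by(ProjectID) %>%
  summarise(res = sum(hour_difference)) %>%
  slice_max(res, n = 3)

# Print as a plain data frame, because a tibble shows
# only about 3 significant digits by default.
as.data.frame(top3)
```

With this data the three rows should be projects 950, 1050 and 200. The 14.3 shown in the tibble output is only display rounding; the stored total for ProjectID 950 is 14.26, which matches the expected output.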