R How to summarize two different groups after initial group_by
I have the following, which I would like to do in one go instead of producing two separate results and taking their union:
library(dplyr)

delivery_stats = data.frame(
  service = c("UberEats", "Seamless", "UberEats", "Seamless"),
  status = c("OnTime", "OnTime", "Late", "Late"),
  totals = c(235, 488, 32, 58))

ds1 = filter(delivery_stats, service == "UberEats") %>%
  group_by(service, status) %>%
  summarise(count_status = sum(totals)) %>%
  mutate(avg_of_status = count_status / sum(count_status))
# now do the same for Seamless, then union...
Provided I have understood you correctly, do you mean this?
delivery_stats %>%
  group_by(service) %>%
  mutate(n = sum(totals)) %>%
  transmute(
    status,
    count_status = totals,
    avg_of_status = count_status / n)
## A tibble: 4 x 4
## Groups: service, status [4]
#  service  status count_status avg_of_status
#  <fct>    <fct>         <dbl>         <dbl>
#1 UberEats OnTime          235         0.880
#2 Seamless OnTime          488         0.894
#3 UberEats Late             32         0.120
#4 Seamless Late             58         0.106
Explanation: First group by service to calculate the sum of totals per service; then group by service and status to calculate the mean (across service) of count_status = totals.
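Since each service/status pair appears exactly once in this data, the same proportions can also be obtained by dividing each row's totals by its per-service sum in a single grouped mutate. The following is a minimal sketch under that assumption, not the answer's own code; the name shares is made up for illustration.

```r
library(dplyr)

delivery_stats <- data.frame(
  service = c("UberEats", "Seamless", "UberEats", "Seamless"),
  status  = c("OnTime", "OnTime", "Late", "Late"),
  totals  = c(235, 488, 32, 58)
)

# Within each service, divide every row's total by the service-wide sum.
# mutate() keeps the original row order.
shares <- delivery_stats %>%
  group_by(service) %>%
  mutate(avg_of_status = totals / sum(totals)) %>%
  ungroup()

shares
```

If a service/status pair could occur on several rows, the rows would need to be summed per pair first.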
You can also try base R, using ave with the help of within.
res <- within(delivery_stats, {
  count_status <- ave(totals, service, status, FUN = mean)
  avg_of_status <- count_status / ave(totals, service, FUN = sum)
})
res
#    service status totals avg_of_status count_status
# 1 UberEats OnTime    235     0.8801498          235
# 2 Seamless OnTime    488     0.8937729          488
# 3 UberEats   Late     32     0.1198502           32
# 4 Seamless   Late     58     0.1062271           58
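To see what ave contributes here: it returns a vector as long as its input, with each element replaced by the result of FUN applied to that element's group. A small sketch on the same data, where svc_total and share are illustrative names only:

```r
delivery_stats <- data.frame(
  service = c("UberEats", "Seamless", "UberEats", "Seamless"),
  status  = c("OnTime", "OnTime", "Late", "Late"),
  totals  = c(235, 488, 32, 58)
)

# Per-service total, repeated on every row belonging to that service
svc_total <- ave(delivery_stats$totals, delivery_stats$service, FUN = sum)
svc_total
# 267 546 267 546

# Share of each row within its service, computed in one ave() call
share <- ave(delivery_stats$totals, delivery_stats$service,
             FUN = function(v) v / sum(v))
share
```

Because the result lines up row by row with the input, it can be assigned straight back as a new column, which is what within does in the answer above.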
As said above, I didn't need to filter; the same pipeline works fine for both groups:
delivery_stats = data.frame(
  service = c("UberEats", "Seamless", "UberEats", "Seamless"),
  status = c("OnTime", "OnTime", "Late", "Late"),
  totals = c(235, 488, 32, 58))
ds1 = group_by(delivery_stats, service, status) %>%
  summarise(count_status = sum(totals)) %>%
  mutate(avg_of_status = count_status / sum(count_status))
# A tibble: 4 x 4
# Groups: service [2]
  service  status count_status avg_of_status
  <fct>    <fct>         <dbl>         <dbl>
1 Seamless Late             58         0.106
2 Seamless OnTime          488         0.894
3 UberEats Late             32         0.120
4 UberEats OnTime          235         0.880
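The reason no filter is needed appears to be that summarise drops only the last grouping level (status), so the result is still grouped by service and sum(count_status) in the following mutate is computed per service. The sketch below makes that explicit; it assumes dplyr 1.0.0 or later for the .groups argument, and ds_all is an illustrative name.

```r
library(dplyr)

delivery_stats <- data.frame(
  service = c("UberEats", "Seamless", "UberEats", "Seamless"),
  status  = c("OnTime", "OnTime", "Late", "Late"),
  totals  = c(235, 488, 32, 58)
)

ds_all <- delivery_stats %>%
  group_by(service, status) %>%
  # "drop_last" is the default for multiple groups; spelling it out
  # documents the intent and silences the regrouping message
  summarise(count_status = sum(totals), .groups = "drop_last") %>%
  # still grouped by service here, so the denominator is per service
  mutate(avg_of_status = count_status / sum(count_status))

group_vars(ds_all)
# "service"
```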