Obtain the aggregated frequencies from a data frame in R
I have a data frame with three item columns and one column holding the frequency over different time intervals, as follows:
Col1 Col2 Col3 FREQUENCY INTERVAL
A item1 CLASS1 4 1
A item2 CLASS2 10 1
B item2 CLASS1 5 1
B item3 CLASS3 2 1
A item1 CLASS1 8 2
C item4 CLASS2 9 2
B item2 CLASS1 3 3
C item4 CLASS2 7 3
Now I want to aggregate the frequencies over the first three columns. I tried:
df<-%>% count(col1,col2,col3,sort =TRUE)
but it did not work in this situation. The expected result is:
Col1 Col2 Col3 TOTAL_FREQUENCY
A item1 CLASS1 12
A item2 CLASS2 10
B item2 CLASS1 8
B item3 CLASS3 2
C item4 CLASS2 16
Any suggestions?
A solution using dplyr. We can also replace group_by_at(vars(starts_with("Col"))) with group_by(Col1, Col2, Col3). The count function counts the number of occurrences, that is, the number of rows in each group. In this case, we need the sum function with summarise to add up FREQUENCY within each group.
library(dplyr)
df2 <- df %>%
  group_by_at(vars(starts_with("Col"))) %>%
  summarise(TOTAL_FREQUENCY = sum(FREQUENCY)) %>%
  ungroup()
df2
# # A tibble: 5 x 4
# Col1 Col2 Col3 TOTAL_FREQUENCY
# <chr> <chr> <chr> <int>
# 1 A item1 CLASS1 12
# 2 A item2 CLASS2 10
# 3 B item2 CLASS1 8
# 4 B item3 CLASS3 2
# 5 C item4 CLASS2 16
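To make the difference between counting rows and summing a column concrete, here is a minimal sketch. It assumes dplyr is installed and rebuilds the sample data inline so that it runs by itself; the names n_rows and weighted are only illustrative. Plain count() returns the number of rows per group, while count() with its wt argument sums the given column instead, which should give the same totals as the summarise() approach.

```r
library(dplyr)

# Sample data rebuilt inline so the sketch is self-contained
df <- data.frame(
  Col1 = c("A", "A", "B", "B", "A", "C", "B", "C"),
  Col2 = c("item1", "item2", "item2", "item3", "item1", "item4", "item2", "item4"),
  Col3 = c("CLASS1", "CLASS2", "CLASS1", "CLASS3", "CLASS1", "CLASS2", "CLASS1", "CLASS2"),
  FREQUENCY = c(4L, 10L, 5L, 2L, 8L, 9L, 3L, 7L),
  INTERVAL = c(1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L),
  stringsAsFactors = FALSE
)

# Unweighted count: number of rows in each Col1/Col2/Col3 group
# (n_rows is an illustrative name, not from the original answer)
n_rows <- df %>% count(Col1, Col2, Col3)

# Weighted count: sums FREQUENCY within each group instead of counting rows
# (weighted is an illustrative name, not from the original answer)
weighted <- df %>%
  count(Col1, Col2, Col3, wt = FREQUENCY, name = "TOTAL_FREQUENCY")

print(n_rows)
print(weighted)
```

Note that the column names are case-sensitive, so Col1 must be used rather than col1.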
DATA
df <- read.table(text = "Col1 Col2 Col3 FREQUENCY INTERVAL
A item1 CLASS1 4 1
A item2 CLASS2 10 1
B item2 CLASS1 5 1
B item3 CLASS3 2 1
A item1 CLASS1 8 2
C item4 CLASS2 9 2
B item2 CLASS1 3 3
C item4 CLASS2 7 3",
header = TRUE, stringsAsFactors = FALSE)
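As a side note, group_by_at() is superseded in newer dplyr releases in favour of across(). The following sketch shows the same aggregation in that style. It assumes dplyr 1.0.0 or later, and df3 is just an illustrative name. The .groups = "drop" argument returns an ungrouped result, so a separate ungroup() call should not be needed.

```r
library(dplyr)

# Same sample data as above, repeated so the sketch runs by itself
df <- read.table(text = "Col1 Col2 Col3 FREQUENCY INTERVAL
A item1 CLASS1 4 1
A item2 CLASS2 10 1
B item2 CLASS1 5 1
B item3 CLASS3 2 1
A item1 CLASS1 8 2
C item4 CLASS2 9 2
B item2 CLASS1 3 3
C item4 CLASS2 7 3",
                 header = TRUE, stringsAsFactors = FALSE)

# Group by every column whose name starts with "Col", then sum FREQUENCY
# (df3 is an illustrative name; assumes dplyr >= 1.0.0 for across() and .groups)
df3 <- df %>%
  group_by(across(starts_with("Col"))) %>%
  summarise(TOTAL_FREQUENCY = sum(FREQUENCY), .groups = "drop")

print(df3)
```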