Most common value (mode) by group in R
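I need to find the most common value for each unique name in column a, for each individual month. I know this topic already exists, and I found a solution here: Is there a built-in function for finding the mode?, but I have a problem with multiple modes (ties).
The solution is as follows: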
Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}
library(dplyr)  # needed for %>%, group_by() and summarise()
df <- data.frame(a = rep(c("a", "b"), each = 5), b = c(2, 1, 2, 2, 3, 3, 1, 1, 2, 3),
                 c = c("Feb", "Feb", "Jan", "Jan", "Mar", "Mar", "Jan", "Jan", "Feb", "Feb"))
df %>% group_by(a, c) %>% summarise(d = Mode(b))
# A tibble: 6 x 3
# Groups:   a [2]
  a     c         d
  <fct> <fct> <dbl>
1 a     Feb       2
2 a     Jan       2
3 a     Mar       3
4 b     Feb       2
5 b     Jan       1
6 b     Mar       3
# When I want to return all modes instead, I use:
Modes <- function(x) {
  ux <- unique(x)
  tab <- tabulate(match(x, ux))
  ux[tab == max(tab)]
}
df %>% group_by(a,c) %>% summarise(d=Modes(b))
#I get:
Error: Column `d` must be length 1 (a summary value), not 2
I expected:
1 a Feb 2
2 a Feb 1
3 a Jan 2
4 a Mar 3
5 b Feb 2
6 b Feb 3
7 b Jan 1
8 b Mar 3
You can do this:
library(dplyr)
df %>%
  count(a, b, c) %>%
  group_by(a, c) %>%
  filter(n == max(n)) %>%
  select(a, b, c)
This gives:
# A tibble: 8 x 3
# Groups:   a, c [6]
  a         b c
  <fct> <dbl> <fct>
1 a         2 Feb
2 a         1 Feb
3 a         2 Jan
4 a         3 Mar
5 b         3 Mar
6 b         1 Jan
7 b         2 Feb
8 b         3 Feb
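If you would rather keep the Modes() function from the question, one possible alternative is to wrap its result in a list column, so that summarise() still sees a single value per group, and then expand that column into rows. This is only a sketch and assumes the tidyr package is installed; the name res is just an illustrative variable.

```r
library(dplyr)
library(tidyr)  # assumption: tidyr is available for unnest()

# Returns every value tied for the highest count, in order of first appearance
Modes <- function(x) {
  ux <- unique(x)
  tab <- tabulate(match(x, ux))
  ux[tab == max(tab)]
}

df <- data.frame(
  a = rep(c("a", "b"), each = 5),
  b = c(2, 1, 2, 2, 3, 3, 1, 1, 2, 3),
  c = c("Feb", "Feb", "Jan", "Jan", "Mar", "Mar", "Jan", "Jan", "Feb", "Feb")
)

res <- df %>%
  group_by(a, c) %>%
  summarise(d = list(Modes(b))) %>%  # one list element per group, so length 1
  ungroup() %>%
  unnest(d)                          # one row per mode

print(res)
```

Groups with a tie, such as a/Feb and b/Feb here, end up with one row per tied value, while groups with a single mode keep one row.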