Re-arrange rows of dataframe based on number of occurrences of value in given column
Example first:
a <- cbind(1:10, c("a","b","a","b","b","d","a","b", "d", "c"))
a
[,1] [,2]
[1,] "1" "a"
[2,] "2" "b"
[3,] "3" "a"
[4,] "4" "b"
[5,] "5" "b"
[6,] "6" "d"
[7,] "7" "a"
[8,] "8" "b"
[9,] "9" "d"
[10,] "10" "c"
Here's what I need: I want the rows of this table to be rearranged so that the rows whose 2nd-column value is most frequent come first. I.e. the result I want is this:
[,1] [,2]
[1,] "2" "b"
[2,] "4" "b"
[3,] "5" "b"
[4,] "8" "b"
[5,] "1" "a"
[6,] "3" "a"
[7,] "7" "a"
[8,] "6" "d"
[9,] "9" "d"
[10,] "10" "c"
I'm currently using a pretty ugly for loop construction which basically runs through a sorted count(a, 2) dataframe and then re-composes a new dataframe. Any ideas how to do this more neatly?
You could use ave and order.
Use ave to calculate the length of each "group", and then order on that result. rank might also be useful if you care about ties.
> a[order(ave(a[, 2], a[, 2], FUN = length), decreasing = TRUE), ]
[,1] [,2]
[1,] "2" "b"
[2,] "4" "b"
[3,] "5" "b"
[4,] "8" "b"
[5,] "1" "a"
[6,] "3" "a"
[7,] "7" "a"
[8,] "6" "d"
[9,] "9" "d"
[10,] "10" "c"
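One caveat with the call above: because a is a character matrix, ave(a[, 2], ...) returns the group sizes as character strings, so they are ordered as text, and a count of 10 or more would sort before "2". Below is a minimal sketch of a numeric variant, assuming the same matrix a. The names n and res are only illustrative, and breaking ties between equally frequent values alphabetically is an assumption of this sketch, not part of the original answer.

```r
a <- cbind(1:10, c("a","b","a","b","b","d","a","b", "d", "c"))

# Count the rows per group on an integer vector so the counts stay numeric
n <- ave(seq_len(nrow(a)), a[, 2], FUN = length)

# Sort by count (descending); ties between equally frequent values are
# broken by the value itself, which keeps each group contiguous
res <- a[order(-n, a[, 2]), ]
res
```

Without the second key, two values that occur equally often would keep their original row order and could end up interleaved.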
The title refers to a data.frame. Using data.table and dplyr:
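a1 <- as.data.frame(a)   # columns are named V1 and V2
library(data.table)
# setDT() converts a1 to a data.table by reference; N := .N adds the per-group row count,
# order(-N) sorts by it (descending), and N := NULL drops the helper column from ans
ans <- setDT(a1)[, N := .N, by = V2][order(-N)][, N := NULL]
ans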
# V1 V2
# 1: 2 b
# 2: 4 b
# 3: 5 b
# 4: 8 b
# 5: 1 a
# 6: 3 a
# 7: 7 a
# 8: 6 d
# 9: 9 d
# 10: 10 c
Or
library(dplyr)
a1 %>%
  group_by(V2) %>%
  mutate(L = n()) %>%     # L = number of rows in each V2 group
  arrange(desc(L)) %>%    # most frequent values first
  select(-L)              # drop the helper column
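If neither package is available, the same idea can be written in base R directly on the data.frame. This is only a sketch: it looks up each row's frequency in a table() of the second column, and the names counts, n and res_df are illustrative rather than taken from the answers above.

```r
a <- cbind(1:10, c("a","b","a","b","b","d","a","b", "d", "c"))
a1 <- as.data.frame(a, stringsAsFactors = FALSE)  # columns V1 and V2, both character

counts <- table(a1$V2)            # frequency of each V2 value
n <- as.integer(counts[a1$V2])    # look up the count for each row by name

# order() is stable, so rows keep their original order within a group
res_df <- a1[order(-n), ]
res_df
```

As with the ave approach, values that occur equally often are not separated unless a second sort key such as a1$V2 is passed to order.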