Reorder the rows of a data frame by the number of occurrences of the values in a given column

Question

Example first:

a <- cbind(1:10, c("a","b","a","b","b","d","a","b", "d", "c"))
a
     [,1] [,2]
 [1,] "1"  "a" 
 [2,] "2"  "b" 
 [3,] "3"  "a" 
 [4,] "4"  "b" 
 [5,] "5"  "b" 
 [6,] "6"  "d" 
 [7,] "7"  "a" 
 [8,] "8"  "b" 
 [9,] "9"  "d" 
[10,] "10" "c"

Here's what I need: I want the rows of this table to be rearranged so that the rows with the most frequent 2nd-column value come first. I.e., the result I want is this:

     [,1] [,2]
 [1,] "2"  "b" 
 [2,] "4"  "b" 
 [3,] "5"  "b" 
 [4,] "8"  "b"
 [5,] "1"  "a" 
 [6,] "3"  "a" 
 [7,] "7"  "a"
 [8,] "6"  "d" 
 [9,] "9"  "d" 
[10,] "10" "c"

I'm currently using a pretty ugly for-loop construction, which basically runs through a sorted count(a, 2) data frame and then re-composes a new data frame. Any ideas how to do this more neatly?

Answer 1

You could use ave and order.

Use ave to calculate the length of each "group", and then order on that result. rank might also be useful if you care about ties.

> a[order(ave(a[, 2], a[, 2], FUN = length), decreasing = TRUE), ]
      [,1] [,2]
 [1,] "2"  "b" 
 [2,] "4"  "b" 
 [3,] "5"  "b" 
 [4,] "8"  "b" 
 [5,] "1"  "a" 
 [6,] "3"  "a" 
 [7,] "7"  "a" 
 [8,] "6"  "d" 
 [9,] "9"  "d" 
[10,] "10" "c"
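
One caveat worth sketching, as far as I can tell: since a is a character matrix, ave(a[, 2], ...) appears to hand the group lengths back as strings, so they would sort lexically once a count reaches 10 ("10" before "9"). The variant below is a minimal sketch, not the answerer's code: it looks the counts up from table() as integers instead. The names counts, n and sorted are my own, and the second sort key is an assumption about how you might want ties between equally frequent values handled.

```r
# Same example matrix as in the question
a <- cbind(1:10, c("a","b","a","b","b","d","a","b","d","c"))

# Tabulate column 2, then look up each row's count as an integer
counts <- table(a[, 2])
n <- as.integer(counts[a[, 2]])

# Sort by descending count; break ties between equally frequent
# values by the value itself so each group stays contiguous
sorted <- a[order(-n, a[, 2]), ]
sorted
```

On the example data this should give the same row order as the output above; the second key only matters when two different values occur equally often.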

Answer 2

The title refers to a data.frame. Using data.table and dplyr:

a1 <- as.data.frame(a)
library(data.table)
ans <- setDT(a1)[,N := .N, by = V2][order(-N)][, N := NULL]
#       V1 V2
#    1:  2  b
#    2:  4  b
#    3:  5  b
#    4:  8  b
#    5:  1  a
#    6:  3  a
#    7:  7  a
#    8:  6  d
#    9:  9  d
#   10: 10  c

Or

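library(dplyr)
a1 %>%
  group_by(V2) %>%
  mutate(L = n()) %>%     # add the per-group row count as a helper column
  ungroup() %>%           # drop the grouping so arrange() sorts the whole table
  arrange(desc(L)) %>%
  select(-L)              # remove the helper column again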
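
If you would rather stay in base R with an actual data frame, the ave/order idea carries over. This is just a sketch under my own assumptions: a1 is rebuilt here with an integer V1 column so the block runs on its own, and n and res are illustrative names. Passing a numeric vector (seq_along) as the first argument to ave keeps the counts numeric.

```r
# Data frame version of the example (V1 kept as integer here)
a1 <- data.frame(V1 = 1:10,
                 V2 = c("a","b","a","b","b","d","a","b","d","c"),
                 stringsAsFactors = FALSE)

# Per-row group size; the numeric first argument keeps the result numeric
n <- ave(seq_along(a1$V2), a1$V2, FUN = length)

# Most frequent V2 values first; rows within a group keep their original order
res <- a1[order(-n), ]
res
```

The row names of res still carry the original row positions, which is handy for checking the result against the input.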

Answer 1
2, accepted, 2014-07-26 16:11:32

Answer 2
2, 2014-07-26 18:31:45
