Re-arrange rows of a data frame based on the number of occurrences of a value in a given column

First, an example:

a <- cbind(1:10, c("a","b","a","b","b","d","a","b", "d", "c"))
a
     [,1] [,2]
 [1,] "1"  "a" 
 [2,] "2"  "b" 
 [3,] "3"  "a" 
 [4,] "4"  "b" 
 [5,] "5"  "b" 
 [6,] "6"  "d" 
 [7,] "7"  "a" 
 [8,] "8"  "b" 
 [9,] "9"  "d" 
[10,] "10" "c" 

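Here is what I need: I want to re-arrange the rows of this table so that the rows whose second-column value occurs most frequently are at the top. That is, the result I want looks like this: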

     [,1] [,2]
 [1,] "2"  "b" 
 [2,] "4"  "b" 
 [3,] "5"  "b" 
 [4,] "8"  "b"
 [5,] "1"  "a" 
 [6,] "3"  "a" 
 [7,] "7"  "a"
 [8,] "6"  "d" 
 [9,] "9"  "d" 
[10,] "10" "c"

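I am currently using a very ugly for-loop construct that basically runs through a sorted count(a, 2) data frame and then re-assembles a new data frame. Any ideas on how to do this more neatly?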

You can use ave and order.

Use ave to compute the length of each "group", then order on that result. If you care about ties, rank may also be useful.

> a[order(ave(a[, 2], a[, 2], FUN = length), decreasing = TRUE), ]
      [,1] [,2]
 [1,] "2"  "b" 
 [2,] "4"  "b" 
 [3,] "5"  "b" 
 [4,] "8"  "b" 
 [5,] "1"  "a" 
 [6,] "3"  "a" 
 [7,] "7"  "a" 
 [8,] "6"  "d" 
 [9,] "9"  "d" 
[10,] "10" "c"
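
One caveat with the call above: because a[, 2] is a character column, ave hands the group sizes back as strings, so order compares them as text. That is fine for this sample, but it may misorder rows once a group has 10 or more members ("10" sorts before "2"). Also, two values that occur equally often get the same key, so their rows stay interleaved in their original order. A minimal sketch of a variant that avoids both, assuming the same matrix a; the names n and res are illustrative, not from the original answer:

```r
a <- cbind(1:10, c("a","b","a","b","b","d","a","b","d","c"))

# Count on an integer vector so ave() returns numbers, not strings
n <- ave(seq_len(nrow(a)), a[, 2], FUN = length)

# Largest groups first; the group label breaks ties between equal-sized groups
res <- a[order(-n, a[, 2]), ]
res
```

For the sample data this gives the same row order as the output shown above.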

The title says data.frame. Using data.table or dplyr:

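a1 <- as.data.frame(a)
library(data.table)
# setDT() converts a1 to a data.table in place; N holds the size of each V2 group
# a1 itself keeps the helper column N; only the sorted copy in ans drops it
ans <- setDT(a1)[, N := .N, by = V2][order(-N)][, N := NULL]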
#       V1 V2
#    1:  2  b
#    2:  4  b
#    3:  5  b
#    4:  8  b
#    5:  1  a
#    6:  3  a
#    7:  7  a
#    8:  6  d
#    9:  9  d
#   10: 10  c

Or

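library(dplyr)
a1 %>%
  group_by(V2) %>%
  mutate(L = n()) %>%       # L = number of rows in each V2 group
  ungroup() %>%             # drop the grouping before sorting
  arrange(desc(L)) %>%      # most frequent values first
  select(-L)                # remove the helper column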
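
If neither package is available, the same count-then-sort idea can be sketched in base R directly on a data frame. This is only an illustration under a few assumptions: the data frame is rebuilt by hand with the column names V1 and V2 that as.data.frame(a) would assign, V1 is stored as integer rather than character, and counts, n and sorted are hypothetical names.

```r
# Same sample data as above, held in a plain data frame
a1 <- data.frame(V1 = 1:10,
                 V2 = c("a","b","a","b","b","d","a","b","d","c"),
                 stringsAsFactors = FALSE)

# table() counts each V2 value; indexing by name maps a count back to every row
counts <- table(a1$V2)
n <- as.vector(counts[a1$V2])

# Sort by descending count, then by V2 so equally sized groups stay together
sorted <- a1[order(-n, a1$V2), ]
sorted
```

The original row names are kept in sorted, which makes it easy to see where each row came from.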

