
R: Sort matrix based on amount of row values

I have a matrix with non-numeric values (missing values are blank, not NaN).

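# Fields are tab-separated; empty fields are the missing values
mat <- read.table(text = paste(
  "\ts1\ts2\ts3",
  "g1\ta;b\ta\tb",
  "g2\t\t\tb",
  "g3\ta\t\ta;b",
  sep = "\n"), row.names = 1, header = TRUE, sep = "\t", stringsAsFactors = FALSE)
mat <- as.matrix(mat)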

What I want to do is subset the matrix to keep the two rows that contain the most values.

So the result should be:

g1  a;b  a  b # with three values
g3  a       a;b # with two values
# g2 should be excluded because it only has one value

My approach would be:

  • sort the matrix by the number of values per row
  • subset the sorted matrix

But I do not understand how to sort a matrix by the number of entries per row.

Any ideas?

You can use apply over the rows to count how many elements in each row are not an empty string, then sort by that count. The sorted matrix looks like this:

mat[order(apply(mat, 1, function(row) sum(row != "")), decreasing = TRUE), ]
   s1    s2  s3   
g1 "a;b" "a" "b"  
g3 "a"   ""  "a;b"
g2 ""    ""  "b"  
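To finish the "sort, then subset" approach from the question, the sorted row order can be cut off after the first two entries. This is only a minimal sketch: the matrix is rebuilt by hand so the block runs on its own, and the names n_values, ranked and top2 are made up for illustration.

```r
# Rebuild the example matrix so this block is self-contained
mat <- matrix(c("a;b", "a", "b",
                "",    "",  "b",
                "a",   "",  "a;b"),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("g1", "g2", "g3"), c("s1", "s2", "s3")))

# Count the non-empty cells in each row
n_values <- apply(mat, 1, function(row) sum(row != ""))

# Row indices ordered from most to fewest values
ranked <- order(n_values, decreasing = TRUE)

# Keep the first two rows of that ordering; drop = FALSE keeps a matrix
top2 <- mat[head(ranked, 2), , drop = FALSE]
print(top2)
```

With ties, order() keeps tied rows in their original order, so which of several equally full rows makes the cut depends on their position in the matrix.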

If the threshold is 2, say, you can also apply it directly in the function, without sorting:

mat[apply(mat, 1, function(row) sum(row != "") >= 2), ]
   s1    s2  s3   
g1 "a;b" "a" "b"  
g3 "a"   ""  "a;b"

Another way, as suggested by @alexis_laz, is to use rowSums:

mat[rowSums(mat != "") >= 2, ]
   s1    s2  s3   
g1 "a;b" "a" "b"  
g3 "a"   ""  "a;b"
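One caveat with the threshold approach: if only a single row passes, plain `[` indexing drops the result to a character vector. The sketch below shows how that might be guarded against with drop = FALSE. It also adds an is.na() check, which is an assumption on my part for data where blanks were read in as NA rather than "". The threshold of 3 and the names keep and one_row are chosen only for illustration.

```r
# Rebuild the example matrix so this block is self-contained
mat <- matrix(c("a;b", "a", "b",
                "",    "",  "b",
                "a",   "",  "a;b"),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("g1", "g2", "g3"), c("s1", "s2", "s3")))

# Treat both NA and "" as missing; only g1 has three values
keep <- rowSums(!is.na(mat) & mat != "") >= 3

# drop = FALSE keeps the result a 1 x 3 matrix instead of a vector
one_row <- mat[keep, , drop = FALSE]
print(dim(one_row))
```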
