R: Sort matrix based on amount of row values
I have a matrix with non-numeric values (missing values are blank, not NaN).
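# Fields are tab-separated; the tabs are written as \t so they survive copy and paste
mat = read.table(textConnection(
"\ts1\ts2\ts3
g1\ta;b\ta\tb
g2\t\t\tb
g3\ta\t\ta;b"), row.names = 1, header = TRUE, sep = "\t", stringsAsFactors = FALSE)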
mat = as.matrix(mat)
I want to subset the matrix so that it keeps only the two rows with the most non-empty values.
So the result should be:
g1  a;b  a  b    # with three values
g3  a       a;b  # with two values
# g2 should be excluded because it only has one value
My approach would be to sort the matrix first, but I do not understand how to sort a matrix by the number of entries per row.
Any ideas?
You can use apply over the rows to count how many elements in each row are not an empty string, then sort by that count. The sorted matrix would look like:
mat[order(apply(mat, 1, function(row) sum(row != "")), decreasing = TRUE), ]
s1 s2 s3
g1 "a;b" "a" "b"
g3 "a" "" "a;b"
g2 "" "" "b"
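Since the question asks for only the two rows with the most entries, the sorted index can be cut down before subsetting. The following is a minimal sketch: it rebuilds the example matrix by hand with matrix() so it does not depend on tab-separated input, and the names counts and top2 are illustrative, not from the original post.

```r
# Rebuild the example matrix directly (assumed to match the data in the question).
mat <- matrix(
  c("a;b", "a", "b",
    "",    "",  "b",
    "a",   "",  "a;b"),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("g1", "g2", "g3"), c("s1", "s2", "s3"))
)

# Number of non-empty cells in each row.
counts <- apply(mat, 1, function(row) sum(row != ""))

# Order rows by count, largest first, and keep only the first two.
top2 <- mat[head(order(counts, decreasing = TRUE), 2), , drop = FALSE]
print(top2)
```

Here head(..., 2) fixes the number of rows kept, whereas a threshold fixes the minimum count; the two coincide for this data but need not in general.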
If the threshold is 2 non-empty values, you can also apply it directly in the function, without sorting:
mat[apply(mat, 1, function(row) sum(row != "") >= 2), ]
s1 s2 s3
g1 "a;b" "a" "b"
g3 "a" "" "a;b"
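One caveat with this kind of logical indexing: when only a single row passes the threshold, R simplifies the result to a plain character vector unless drop = FALSE is given. A small sketch, assuming a stricter threshold of 3 chosen purely for illustration (keep and one_row are hypothetical names):

```r
# Same example data as in the question, rebuilt by hand.
mat <- matrix(
  c("a;b", "a", "b",
    "",    "",  "b",
    "a",   "",  "a;b"),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("g1", "g2", "g3"), c("s1", "s2", "s3"))
)

# Logical vector: TRUE for rows with at least 3 non-empty cells (only g1 here).
keep <- apply(mat, 1, function(row) sum(row != "") >= 3)

# Without drop = FALSE the single matching row loses its matrix shape.
as_vector <- mat[keep, ]
# With drop = FALSE the result stays a 1 x 3 matrix with its row name.
one_row <- mat[keep, , drop = FALSE]
print(one_row)
```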
Another way, as suggested by @alexis_laz, is to use rowSums:
mat[rowSums(mat != "") >= 2, ]
s1 s2 s3
g1 "a;b" "a" "b"
g3 "a" "" "a;b"
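The rowSums version works because mat != "" is itself a logical matrix of the same shape, and summing logicals counts each TRUE as 1. A short sketch that spells out the intermediate steps (non_empty and counts are illustrative names, and the matrix is rebuilt by hand):

```r
# Same example data as in the question, rebuilt by hand.
mat <- matrix(
  c("a;b", "a", "b",
    "",    "",  "b",
    "a",   "",  "a;b"),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("g1", "g2", "g3"), c("s1", "s2", "s3"))
)

# Logical matrix: TRUE where a cell holds a non-empty string.
non_empty <- mat != ""

# Row-wise count of TRUE values; equivalent to the apply() version.
counts <- rowSums(non_empty)
print(counts)

# Keep rows with at least two non-empty cells.
result <- mat[counts >= 2, , drop = FALSE]
print(result)
```

Because rowSums is vectorised over the whole matrix, it avoids calling an R function once per row, which tends to matter for large matrices.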