
Extract max values from a matrix in R (random selection)

Given a matrix, extracting the row name of the maximum value in each column is a common problem.

mat <- matrix(list(20,0,0,80,80,0,
                   20,0,40,0,40,20,
                   40,0,40,20,20,0,
                   0,80,40,20,20,20), ncol=6, byrow=TRUE)
rownames(mat) <- c("A","C","G","T")

# row index of the (first) maximum in each column
apply(mat, 2, which.max)

But here, some columns have tied maximum values (in the example matrix, columns 3 and 6). By default, which.max returns the first of the tied rows ("C" in both columns). I am having trouble writing a script that selects at random among the tied row names (C, G and T in column 3; C and T in column 6). Any help with the scripting is appreciated.

The rank function comes in handy:

> apply(mat,2,function(x) which(rank(-unlist(x), ties.method="random") == 1))
[1] 3 4 4 1 1 2
> apply(mat,2,function(x) which(rank(-unlist(x), ties.method="random") == 1))
[1] 3 4 3 1 1 2
> apply(mat,2,function(x) which(rank(-unlist(x), ties.method="random") == 1))
[1] 3 4 4 1 1 4

The ties.method="random" part is crucial: it breaks ties at random, so exactly one of the tied entries in each column receives rank 1.
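To get row names rather than indices, the same idea can be wrapped in a small helper. This is only a sketch: the name random_which_max is made up for illustration, and the matrix is built from a plain numeric vector instead of a list, so no unlist() is needed.

```r
# Example matrix from the question, built from a numeric vector (assumption:
# the list() wrapper in the question is not required).
mat <- matrix(c(20,  0,  0, 80, 80,  0,
                20,  0, 40,  0, 40, 20,
                40,  0, 40, 20, 20,  0,
                 0, 80, 40, 20, 20, 20),
              ncol = 6, byrow = TRUE)
rownames(mat) <- c("A", "C", "G", "T")

# Hypothetical helper: position of the maximum, ties broken at random.
# Negating x makes the largest value rank first.
random_which_max <- function(x) {
  which(rank(-x, ties.method = "random") == 1)
}

set.seed(42)  # only to make the random tie-breaking reproducible
idx <- apply(mat, 2, random_which_max)

# Translate row indices into row names.
rownames(mat)[idx]
```

Columns without ties always give the same answer; only the tied columns vary between runs when no seed is set.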

Consider reading the documentation for which.max, which suggests using which.is.max from nnet. Either borrow that algorithm or use that package.

> library(nnet)
> which.is.max
function (x) 
{
    y <- seq_along(x)[x == max(x)]
    if (length(y) > 1L) 
        sample(y, 1L)
    else y
}
<bytecode: 0x0000000013fda7c8>
<environment: namespace:nnet>
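If you would rather not load nnet, the algorithm is short enough to borrow. The following is a sketch of that approach applied column by column; which_max_random is a hypothetical name, and the numeric matrix is an assumption (the question builds it from a list).

```r
# Local copy of the tie-breaking idea shown above, so this runs without nnet.
# which_max_random is a hypothetical name, not part of any package.
which_max_random <- function(x) {
  y <- which(x == max(x))   # every position that attains the maximum
  # Guard on length: sample(y, 1) with a single number y would draw from 1:y.
  if (length(y) > 1L) sample(y, 1L) else y
}

mat <- matrix(c(20,  0,  0, 80, 80,  0,
                20,  0, 40,  0, 40, 20,
                40,  0, 40, 20, 20,  0,
                 0, 80, 40, 20, 20, 20),
              ncol = 6, byrow = TRUE)
rownames(mat) <- c("A", "C", "G", "T")

# One randomly chosen winning row name per column.
winners <- rownames(mat)[apply(mat, 2, which_max_random)]
winners
```

With nnet loaded, apply(mat, 2, which.is.max) should give the same kind of result.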

You could sample from the row names whose values equal the maximum of that column:

mat<-matrix(c(20,0,0,80,80,0,
                 20,0,40,0,40,20,
                 40,0,40,20,20,0,
                 0,80,40,20,20,20),ncol=6,byrow=T)
rownames(mat)<-c("A","C","G","T")

set.seed(123)
apply( mat, 2 , function(x) sample( c( rownames(mat)[ which( x == max(x) ) ] ) , 1 ) )
#[1] "G" "T" "G" "A" "A" "C"

set.seed(1234)
apply( mat, 2 , function(x) sample( c( rownames(mat)[ which( x == max(x) ) ] ) , 1 ) )
#[1] "G" "T" "G" "A" "A" "T"

P.S. I'm not sure why you construct the matrix data using a list object - matrices are vectors.
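A different route from the answers above, offered only as a sketch: base R's max.col() does random tie-breaking itself. It works row-wise, so the matrix is transposed first. One caveat is that max.col() decides ties with a small relative tolerance rather than exact equality, which makes no difference for data like this but may matter for nearly equal values.

```r
mat <- matrix(c(20,  0,  0, 80, 80,  0,
                20,  0, 40,  0, 40, 20,
                40,  0, 40, 20, 20,  0,
                 0, 80, 40, 20, 20, 20),
              ncol = 6, byrow = TRUE)
rownames(mat) <- c("A", "C", "G", "T")

set.seed(1)  # only for reproducibility of the random tie-breaking

# max.col() returns, for each row of its argument, the column holding the
# maximum. Transposing makes the original columns into rows, so the result
# indexes the original rows.
idx <- max.col(t(mat), ties.method = "random")
picked <- rownames(mat)[idx]
picked
```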
