
Computing summary statistics over samples from a list of matrices

I have a list of matrices with identical dimensions, for example:

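# replicate() re-evaluates rnorm() each time; rep(list(...), 3) would repeat one matrix three times
mat.list <- replicate(3, matrix(rnorm(n = 12, mean = 1, sd = 1), nrow = 3, ncol = 4), simplify = FALSE)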

What I'd like to do is repeatedly sample one random column from each matrix in the list. For example, in a given sample the column indices to draw are:

set.seed(10) #for reproducibility
idx.vec = sample(1:ncol(mat.list[[1]]),length(mat.list))

This call then returns a matrix of the sampled columns, one column per matrix in the list:

sample.mat = mapply('[', mat.list, TRUE, idx.vec)

For each such sample matrix I'd like to compute the mean and variance of each row. The result would therefore be two matrices, one holding the means and one holding the variances, each with as many rows as the matrices in the list and as many columns as there are samples.

What would be the most efficient way to do this, in terms of both time and space?

I would use replicate, with rowMeans for the means and rowSds from the matrixStats package for the standard deviations (matrixStats also provides rowVars if you need the variance itself):

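library(matrixStats)  # provides rowSds() and rowVars()

ll <- length(mat.list)      # number of matrices in the list
nn <- ncol(mat.list[[1]])   # number of columns per matrix

# Draw 3 samples; each list element holds the row means and row SDs of one sample
replicate(3, {
  idx.vec <- sample(seq_len(nn), ll)
  sample.mat <- mapply('[', mat.list, TRUE, idx.vec)
  list(mm = rowMeans(sample.mat), sd = rowSds(sample.mat))
}, simplify = FALSE)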
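The call above returns a list with one element per sample, whereas the question asks for two matrices with one column per sample. A minimal sketch of one way to get that shape is shown here. The helper name `sample_row_stats` and its `n.samples` argument are made up for illustration, and it uses base R's `var` through `apply` instead of matrixStats so that it runs without extra packages. It assumes the matrices have at least two rows and that the list is no longer than the number of columns, since the indices are drawn without replacement as in the code above.

```r
# Hypothetical helper (not from the original answer): draws `n.samples` column
# samples and returns row means and row variances, one column per sample.
sample_row_stats <- function(mat.list, n.samples) {
  n.rows <- nrow(mat.list[[1]])
  n.cols <- ncol(mat.list[[1]])
  n.mats <- length(mat.list)

  # Each replicate returns the row means followed by the row variances, so
  # replicate() simplifies the result to a (2 * n.rows) x n.samples matrix.
  stats <- replicate(n.samples, {
    idx.vec <- sample(seq_len(n.cols), n.mats)
    sample.mat <- mapply(`[`, mat.list, TRUE, idx.vec)
    c(rowMeans(sample.mat), apply(sample.mat, 1, var))
  })

  # Split the stacked result back into the two requested matrices.
  list(means = stats[seq_len(n.rows), , drop = FALSE],
       vars  = stats[n.rows + seq_len(n.rows), , drop = FALSE])
}

# Usage: 3 matrices of 3 x 4, 5 samples
set.seed(10)
mat.list <- replicate(3, matrix(rnorm(12, mean = 1, sd = 1), nrow = 3, ncol = 4),
                      simplify = FALSE)
res <- sample_row_stats(mat.list, n.samples = 5)
dim(res$means)  # number of rows by number of samples
dim(res$vars)
```

Returning one numeric vector per sample lets replicate build the result matrix directly, which avoids keeping a list of per-sample lists and binding them together afterwards. For large inputs, swapping `apply(sample.mat, 1, var)` for `matrixStats::rowVars(sample.mat)` would likely be faster.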
