
Most efficient way to randomize a matrix in R or in Python

I'm working with a numeric matrix M in R that is quite big (11,000 rows by 20 columns). On this matrix, I'm performing a lot of correlation tests:

=> the function cor.test(M[i,], M[j,], method='spearman'), where i and j are two rows of the matrix (all possible combinations are tested).

The problem, as you know, is that I'm running too many tests for the p-value returned by each individual test to be very reliable.

My strategy to overcome this limitation would be to generate a new probability distribution by bootstrapping my matrix M: I would like to generate 100 random matrices from M, run the multiple correlations on these matrices, and choose the right cut-off for the p-value to get an FDR of 5%.

My questions are:

  1. What is the most efficient way to randomize my matrix?
  2. Since it's quite time-consuming (I suppose), it would be interesting if the solution could be parallelized.

Thank you in advance for all the useful answers you'll provide.

In Python, there is a function random.sample() in the random module. If you store M as a list of rows, randomly sampling n rows from matrix M without replacement looks like this:

import random
M_sample = random.sample(M, n)
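As a small, self-contained illustration of the call above, here is a sketch on toy data. The matrix values, the seed, and the choice to sample row indices rather than the rows themselves are assumptions made for the example; sampling indices is just a convenient way to keep track of which rows were drawn.

```python
import random

random.seed(0)  # fixed seed so the illustration is repeatable

# Toy matrix stored as a list of rows (hypothetical data)
M = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
n = 2

# Draw n distinct row indices (no replacement), then pick those rows
indices = random.sample(range(len(M)), n)
M_sample = [M[i] for i in indices]
```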

However, for bootstrapping, you might want to do random sampling with replacement. To do this, you can use numpy.random.choice(). It only accepts a 1-D array or an integer, so with M stored as a 2-D NumPy array, draw row indices and then index M with them:

import numpy
indices = numpy.random.choice(M.shape[0], n, replace=True)
M_sample = M[indices, :]
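If many resampled matrices are needed (the question mentions 100), the index draw can be done for all of them in one call. The following is a minimal sketch, assuming M is a 2-D NumPy array; the toy matrix, the seed, and the names n_boot, idx, and boot are illustrative and not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(42)  # seeded generator, for repeatability

M = rng.normal(size=(8, 5))      # toy stand-in for the real matrix
n_boot = 100                     # number of resampled matrices
n_rows = M.shape[0]

# One row of `idx` per resampled matrix: row indices drawn with replacement
idx = rng.integers(0, n_rows, size=(n_boot, n_rows))

# Integer-array indexing builds all resampled matrices at once:
# boot[b] is the b-th matrix and has the same shape as M
boot = M[idx]
```

For an 11,000 x 20 matrix of doubles, 100 such copies take about 176 MB, so holding them all in memory at once is usually feasible; otherwise, loop over the rows of idx and build one matrix at a time.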

In R, we use sample() to randomly choose the row indices, and then use row indexing to take those rows from the matrix. Randomly sampling n rows from matrix M without replacement is done as follows:

indices = sample(nrow(M), n, replace=FALSE)
M_sample = M[indices, ]

And for random sampling with replacement, replace the first line with this:

indices = sample(nrow(M), n, replace=TRUE)
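On the speed side of the question, the pairwise Spearman correlations themselves can be vectorized rather than looped over. The sketch below is in Python with NumPy and rests on several assumptions: the data have no tied values within a row (so a double argsort gives valid ranks), the toy matrix stands in for the real one, and the helper name spearman_all_pairs is hypothetical. It also uses a different randomization from the row sampling shown above: it shuffles the values inside each row independently, which is one common way to build a null distribution when the rows are the variables being correlated.

```python
import numpy as np

def spearman_all_pairs(M):
    """Spearman rho for every pair of rows, assuming no ties within a row."""
    # Double argsort gives each value's 0-based rank within its row
    # (ties are NOT averaged, hence the no-ties assumption).
    ranks = M.argsort(axis=1).argsort(axis=1).astype(float)
    # Pearson correlation of the ranks is Spearman's rho; rows are variables.
    return np.corrcoef(ranks)

rng = np.random.default_rng(0)
M = rng.normal(size=(50, 20))      # toy stand-in for the real matrix

rho_obs = spearman_all_pairs(M)

# One randomized matrix: shuffle the values inside each row independently
M_null = rng.permuted(M, axis=1)
rho_null = spearman_all_pairs(M_null)

# Keep each pair once (upper triangle, excluding the diagonal)
iu = np.triu_indices(M.shape[0], k=1)
obs = np.abs(rho_obs[iu])
null = np.abs(rho_null[iu])
```

Repeating the shuffle step for each randomized matrix gives the null values against which the observed correlations can be compared. Note that with 11,000 rows the full correlation matrix has 11,000 x 11,000 entries, roughly 1 GB as doubles, so it may be necessary to process the rows in blocks.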
