Find genes with good correlation from a correlation matrix

I have a matrix file that is basically a Spearman correlation matrix between genes across various cell types. I am trying to find which genes, or groups of genes, have a correlation value greater than a threshold, say 0.6. How can I do that? I'm posting a subset of my data; the full matrix is 502 x 502.

         ACTL6B   ACTR5   ACTR6
ACTL6B    1        0.6    -0.4
ACTR5     0.6      1      -0.3
ACTR6    -0.4     -0.3     1

I don't want the correlation of a gene with itself, which is always 1. I want the comparisons between different genes, for example ACTL6B and ACTR5, whose correlation is 0.6. I would like to keep those values and the gene names.

Here is an example:

mat <- cor(longley)  # example 7 x 7 correlation matrix

# Find indices of correlations greater than 0.6
idx <- which(mat > 0.6 & lower.tri(mat), arr.ind = TRUE)

# names of the resulting variables
cbind(rownames(idx), colnames(mat)[idx[, 2]])

Because of lower.tri, all values on the diagonal and in the upper triangle of the matrix are ignored, so self-correlations are excluded and each pair of variables is reported only once.
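
To make the role of lower.tri concrete, here is a minimal sketch on a small symmetric matrix. The names g1, g2, g3 and the values are made up for illustration only and are not from the question's data.

```r
# Small symmetric matrix with hypothetical names g1..g3 (illustrative values)
m <- matrix(c(1.0, 0.7, 0.2,
              0.7, 1.0, 0.9,
              0.2, 0.9, 1.0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("g1", "g2", "g3"), c("g1", "g2", "g3")))

# Logical mask that is TRUE only strictly below the diagonal
mask <- lower.tri(m)
print(mask)

# Combining the threshold with the mask drops the diagonal and the
# mirrored upper-triangle entries, so each pair appears once
hits <- which(m > 0.6 & mask, arr.ind = TRUE)
print(hits)
```

Without the mask, the same comparison would also return the three diagonal entries and a second, mirrored copy of every pair.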

The result for the longley example:

      [,1]         [,2]          
 [1,] "GNP"        "GNP.deflator"
 [2,] "Unemployed" "GNP.deflator"
 [3,] "Population" "GNP.deflator"
 [4,] "Year"       "GNP.deflator"
 [5,] "Employed"   "GNP.deflator"
 [6,] "Unemployed" "GNP"         
 [7,] "Population" "GNP"         
 [8,] "Year"       "GNP"         
 [9,] "Employed"   "GNP"         
[10,] "Population" "Unemployed"  
[11,] "Year"       "Unemployed"  
[12,] "Year"       "Population"  
[13,] "Employed"   "Population"  
[14,] "Employed"   "Year"    
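
Since the question also asks to keep the correlation values, the same indices can be used to pull them out of the matrix. The following is a sketch under a few assumptions: it uses a symmetric toy matrix built from the three genes in the question, the variable names threshold and pairs are chosen here for illustration, and >= is used instead of > so that a correlation of exactly 0.6 is kept.

```r
# Toy matrix with the three genes from the question; the real input
# would be the full 502 x 502 matrix
genes <- c("ACTL6B", "ACTR5", "ACTR6")
mat <- matrix(c( 1.0,  0.6, -0.4,
                 0.6,  1.0, -0.3,
                -0.4, -0.3,  1.0),
              nrow = 3, byrow = TRUE,
              dimnames = list(genes, genes))

threshold <- 0.6  # assumed cutoff; >= keeps a value of exactly 0.6

# Row/column positions of qualifying pairs below the diagonal
idx <- which(mat >= threshold & lower.tri(mat), arr.ind = TRUE)

# One row per pair: both gene names and the correlation value
pairs <- data.frame(
  gene1 = rownames(mat)[idx[, "row"]],
  gene2 = colnames(mat)[idx[, "col"]],
  cor   = mat[idx],
  stringsAsFactors = FALSE
)
print(pairs)
```

If strong negative correlations should count as well, comparing abs(mat) against the threshold instead of mat would be one way to include them.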
