Find genes with good correlation from a correlation matrix

I have a matrix file that is basically a Spearman correlation matrix between genes across various cell types. I'm trying to find which sets or groups of genes have a correlation value greater than some threshold, say 0.6. How can I do that? I'm posting a subset of my data; the full matrix is 502 x 502.

        ACTL6B  ACTR5  ACTR6
ACTL6B     1     0.6   -0.4
ACTR5      0.6   1     -0.3
ACTR6     -0.4  -0.3    1

I don't want the correlation of a gene with itself, which is always 1. I want the comparisons between different genes: for example, ACTL6B and ACTR5, whose correlation is 0.6. I would like to keep those values and gene names.

Here is an example:

mat <- cor(longley)  # example 7 x 7 correlation matrix

# Find indices of correlations greater than 0.6
idx <- which(mat > 0.6 & lower.tri(mat), arr.ind = TRUE)

# names of the resulting variables
cbind(rownames(idx), colnames(mat)[idx[, 2]])

Because of lower.tri, all values on the diagonal and in the upper triangle are ignored, so self-correlations are dropped and each pair appears only once.

The result:

      [,1]         [,2]          
 [1,] "GNP"        "GNP.deflator"
 [2,] "Unemployed" "GNP.deflator"
 [3,] "Population" "GNP.deflator"
 [4,] "Year"       "GNP.deflator"
 [5,] "Employed"   "GNP.deflator"
 [6,] "Unemployed" "GNP"         
 [7,] "Population" "GNP"         
 [8,] "Year"       "GNP"         
 [9,] "Employed"   "GNP"         
[10,] "Population" "Unemployed"  
[11,] "Year"       "Unemployed"  
[12,] "Year"       "Population"  
[13,] "Employed"   "Population"  
[14,] "Employed"   "Year"    
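
Since the question also asks to keep the correlation values, here is a minimal sketch of the same approach that returns the pairs together with their values. It assumes a small symmetric matrix modeled on the three genes in the question; the names threshold and gene_pairs are only illustrative, and >= is used (an assumption) so that a value of exactly 0.6 is kept.

```r
# Toy symmetric correlation matrix modeled on the genes in the question
genes <- c("ACTL6B", "ACTR5", "ACTR6")
mat <- matrix(c( 1.0,  0.6, -0.4,
                 0.6,  1.0, -0.3,
                -0.4, -0.3,  1.0),
              nrow = 3, byrow = TRUE,
              dimnames = list(genes, genes))

threshold <- 0.6  # assumed cutoff; ">=" keeps a value of exactly 0.6

# Lower triangle only: skips the diagonal and reports each pair once
idx <- which(mat >= threshold & lower.tri(mat), arr.ind = TRUE)

# One row per gene pair, together with its correlation value
gene_pairs <- data.frame(gene1 = rownames(mat)[idx[, "row"]],
                         gene2 = colnames(mat)[idx[, "col"]],
                         cor   = mat[idx],
                         stringsAsFactors = FALSE)
gene_pairs
```

For this toy matrix the only pair that passes is ACTR5 / ACTL6B. If strong negative correlations should count as well, comparing abs(mat) against the threshold instead of mat would be one way to include them.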
