How do I calculate the correlation between corresponding columns of two matrices without getting the other correlations as output?

I have these data:

> a
     a    b    c
1    1   -1    4
2    2   -2    6
3    3   -3    9
4    4   -4   12
5    5   -5    6

> b
     d    e    f
1    6   -5    7
2    7   -4    4
3    8   -3    3
4    9   -2    3
5   10   -1    9

> cor(a,b)
           d            e             f
a  1.0000000    1.0000000     0.1767767
b -1.0000000    -1.000000    -0.1767767
c  0.5050763    0.5050763    -0.6964286

The result I want is just:

cor(a,d) = 1
cor(b,e) = -1
cor(c,f) = -0.6964286

The first answer above calculates all pairwise correlations, which is fine unless the matrices are large, and the second one doesn't work. As far as I can tell, efficient computation must be done directly. This code, borrowed from the arrayMagic Bioconductor package, works efficiently for large matrices:

> colCors = function(x, y) { 
+   sqr = function(x) x*x
+   if(!is.matrix(x)||!is.matrix(y)||any(dim(x)!=dim(y)))
+     stop("Please supply two matrices of equal size.")
+   x   = sweep(x, 2, colMeans(x))
+   y   = sweep(y, 2, colMeans(y))
+   cor = colSums(x*y) /  sqrt(colSums(sqr(x))*colSums(sqr(y)))
+   return(cor)
+ }
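
For reference, the quantity colCors computes for each column j can be written out as follows. This is just the standard Pearson correlation restated in notation that is my own assumption (x_ij and y_ij are the entries of the two matrices, n is the number of rows), not something taken from the arrayMagic documentation:

```latex
r_j = \frac{\sum_{i=1}^{n} (x_{ij} - \bar{x}_j)(y_{ij} - \bar{y}_j)}
           {\sqrt{\sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2 \, \sum_{i=1}^{n} (y_{ij} - \bar{y}_j)^2}},
\qquad
\bar{x}_j = \frac{1}{n} \sum_{i=1}^{n} x_{ij}, \quad
\bar{y}_j = \frac{1}{n} \sum_{i=1}^{n} y_{ij}
```

The two sweep() calls do the centering, and the colSums() calls give the numerator and the two sums of squares, so only one value per column is ever computed.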

> set.seed(1)
> a=matrix(rnorm(15),nrow=5)
> b=matrix(rnorm(15),nrow=5)
> diag(cor(a,b))
[1]  0.2491625 -0.5313192  0.5594564
> mapply(cor,a,b)
 [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> colCors(a,b)
[1]  0.2491625 -0.5313192  0.5594564
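
To see the difference on larger inputs, something like the following rough sketch could be used. The matrix sizes and the name col_cors are my own choices for illustration (the function body repeats the colCors logic so the block runs on its own), and timings will vary by machine:

```r
# Column-wise correlation without building the full cross-correlation matrix.
# Same computation as colCors, restated here so this block is self-contained.
col_cors <- function(x, y) {
  stopifnot(is.matrix(x), is.matrix(y), all(dim(x) == dim(y)))
  x <- sweep(x, 2, colMeans(x))   # center each column of x
  y <- sweep(y, 2, colMeans(y))   # center each column of y
  colSums(x * y) / sqrt(colSums(x^2) * colSums(y^2))
}

set.seed(1)
n_rows <- 50      # hypothetical sizes, chosen only for illustration
n_cols <- 2000
a <- matrix(rnorm(n_rows * n_cols), nrow = n_rows)
b <- matrix(rnorm(n_rows * n_cols), nrow = n_rows)

# Both approaches should agree up to floating-point error.
full   <- diag(cor(a, b))   # computes n_cols x n_cols values, keeps n_cols
direct <- col_cors(a, b)    # computes only n_cols values
agree  <- isTRUE(all.equal(full, direct))

# Compare elapsed times; no particular numbers are implied here.
t_full   <- system.time(diag(cor(a, b)))
t_direct <- system.time(col_cors(a, b))
```

With p columns, diag(cor(a, b)) has to fill a p-by-p matrix before discarding everything off the diagonal, so the gap should widen as p grows.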

I would probably personally just use diag:

> diag(cor(a,b))
[1]  1.0000000 -1.0000000 -0.6964286

But you could also use mapply:

> mapply(cor,a,b)
         a          b          c 
 1.0000000 -1.0000000 -0.6964286

mapply works with data frames but not matrices. That is because in a data frame each column is an element, while in a matrix each entry is an element.
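
The difference is easy to check with length(). Here is a minimal sketch using the numbers from the question, entered as matrices; the vapply loop over column indices is an alternative I am adding for illustration, not something from the answers:

```r
# Data from the question, stored as matrices (filled column by column).
a <- matrix(c(1, 2, 3, 4, 5,  -1, -2, -3, -4, -5,  4, 6, 9, 12, 6), nrow = 5)
b <- matrix(c(6, 7, 8, 9, 10, -5, -4, -3, -2, -1,  7, 4, 3, 3, 9), nrow = 5)

length(a)                 # 15: a matrix is a vector of individual entries
length(as.data.frame(a))  # 3: a data frame is a list of columns

# Indexing the columns explicitly works on matrices without any conversion.
by_col <- vapply(seq_len(ncol(a)),
                 function(j) cor(a[, j], b[, j]),
                 numeric(1))
by_col                    # 1, -1, -0.6964286 (matches the desired result)
```

So mapply(cor, a, b) on matrices ends up calling cor on pairs of single numbers, which is why every result is NA.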

In the answer above, mapply(cor,as.data.frame(a),as.data.frame(b)) works just fine.

set.seed(1)
a=matrix(rnorm(15),nrow=5)
b=matrix(rnorm(15),nrow=5)
diag(cor(a,b))
[1]  0.2491625 -0.5313192  0.5594564
mapply(cor,as.data.frame(a),as.data.frame(b))
    V1         V2         V3 
 0.2491625 -0.5313192  0.5594564 

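This is much more efficient for large matrices than diag(cor(a,b)), because it computes only the correlations between corresponding columns instead of every pairwise correlation.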


