
Create a Correlation Matrix From a Correlation Vector in R

I want to create a correlation matrix from a correlation vector that holds the upper (or lower) triangle of the matrix.

The goal is to transform this vector

[image: enter image description here]

to this correlation matrix with 1s on the diagonal.

[image: enter image description here]

Is there a function that builds a matrix from the triangle above the diagonal and sets the diagonal to 1?

I don't know if there is an automatic way to do this, but expanding on my comment:

myvec <- c(-.55, -.48, .66, .47, -.38, -.46)
mempty <- matrix(0, nrow = 4, ncol = 4)
mindex <- matrix(1:16, nrow = 4, ncol = 4)
mempty[mindex[upper.tri(mindex)]] <- myvec
mempty[lower.tri(mempty)] <- t(mempty)[lower.tri(t(mempty))]
diag(mempty) <- 1
mempty
#       [,1]  [,2]  [,3]  [,4]
# [1,]  1.00 -0.55 -0.48  0.47
# [2,] -0.55  1.00  0.66 -0.38
# [3,] -0.48  0.66  1.00 -0.46
# [4,]  0.47 -0.38 -0.46  1.00

Here's a quickly hacked-together function. I hope all my mathematical steps are correct!

vec2symmat <- function(invec, diag = 1, byrow = TRUE) {
  Nrow <- ceiling(sqrt(2*length(invec)))

  if (!sqrt(length(invec)*2 + Nrow) %% 1 == 0) {
    stop("invec is wrong length to create a square symmetrical matrix")
  }

  mempty <- matrix(0, nrow = Nrow, ncol = Nrow)
  mindex <- matrix(sequence(Nrow^2), nrow = Nrow, ncol = Nrow, byrow = byrow)
  if (isTRUE(byrow)) {
    mempty[mindex[lower.tri(mindex)]] <- invec
    mempty[lower.tri(mempty)] <- t(mempty)[lower.tri(t(mempty))]
  } else {
    mempty[mindex[upper.tri(mindex)]] <- invec
    mempty[lower.tri(mempty)] <- t(mempty)[lower.tri(t(mempty))]
  }

  diag(mempty) <- diag
  mempty
}

Here it is with a different value for the diagonal.

vec2symmat(1:3, diag = NA)
#      [,1] [,2] [,3]
# [1,]   NA    1    2
# [2,]    1   NA    3
# [3,]    2    3   NA

Here's an error message if you try to provide data that can't create a square matrix.

vec2symmat(1:4)
# Error in vec2symmat(1:4) : 
#   invec is wrong length to create a square symmetrical matrix

And, with default settings.

vec2symmat(1:10)
#      [,1] [,2] [,3] [,4] [,5]
# [1,]    1    1    2    3    4
# [2,]    1    1    5    6    7
# [3,]    2    5    1    8    9
# [4,]    3    6    8    1   10
# [5,]    4    7    9   10    1
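
The length check in vec2symmat relies on the fact that one triangle of an n x n matrix, excluding the diagonal, has n(n-1)/2 entries. As a quick sanity check (a sketch that is not part of the original answer; `tri_len`, `nrow_guess` and `nrow_closed` are names made up here), the following confirms that ceiling(sqrt(2 * L)) recovers n for every valid length L in a small range, and that it agrees with the closed-form solution n = (1 + sqrt(1 + 8 * L)) / 2:

```r
# Matrix sizes to check; the range 2..200 is an arbitrary choice.
n <- 2:200
# Number of off-diagonal entries in one triangle of an n x n matrix.
tri_len <- n * (n - 1) / 2
# Row count as computed inside vec2symmat.
nrow_guess <- ceiling(sqrt(2 * tri_len))
# Closed-form solution of L = n * (n - 1) / 2 for n.
nrow_closed <- (1 + sqrt(1 + 8 * tri_len)) / 2
all(nrow_guess == n)   # TRUE
all(nrow_closed == n)  # TRUE
```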

You can convince R that your vector is a distance object then use as.matrix to convert it:

> myvec <- c(-.55, -.48, .66, .47, -.38, -.46)
> class(myvec) <- 'dist'
> attr(myvec,'Size') <- 4
> as.matrix(myvec)
      1     2     3     4
1  0.00 -0.55 -0.48  0.66
2 -0.55  0.00  0.47 -0.38
3 -0.48  0.47  0.00 -0.46
4  0.66 -0.38 -0.46  0.00
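
Note that as.matrix on a dist object leaves zeros on the diagonal, and that a dist object stores the lower triangle column by column. A minimal sketch of finishing the job, assuming that ordering is what you want (the names `d` and `mycor_dist` are made up for this example, and unname() is only there to drop the 1..4 dimnames):

```r
myvec <- c(-.55, -.48, .66, .47, -.38, -.46)
# Build the dist object in one step instead of patching attributes by hand.
d <- structure(myvec, Size = 4L, class = "dist")
mycor_dist <- unname(as.matrix(d))
# as.matrix() on a dist object puts 0 on the diagonal;
# a correlation matrix needs 1 there.
diag(mycor_dist) <- 1
mycor_dist
isSymmetric(mycor_dist)
```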

Or a variation on the answer by @AnandaMahto (similar to the internals used above):

> myvec <- c(-.55, -.48, .66, .47, -.38, -.46)
> mycor <- matrix(0,4,4)
> mycor[ col(mycor) < row(mycor) ] <- myvec
> mycor <- mycor + t(mycor)
> diag(mycor) <- 1
> mycor
      [,1]  [,2]  [,3]  [,4]
[1,]  1.00 -0.55 -0.48  0.66
[2,] -0.55  1.00  0.47 -0.38
[3,] -0.48  0.47  1.00 -0.46
[4,]  0.66 -0.38 -0.46  1.00

An answer with some helper functions that may be useful in other problems:

`lower.tri<-` <- function(x,value){
    x[lower.tri(x)] <- value
    x
}

`upper.tri<-` <- function(x,value){
    y <- t(x)
    lower.tri(y) <- value
    t(y)
}

vec2mat <- function(r){
    n <- (1+sqrt(1+8*length(r)))/2
    x <- diag(1,n)
    lower.tri(x) <- upper.tri(x) <- r
    x
}

EDIT: note that upper.tri<- is not simply obtained by replacing "lower" with "upper" in lower.tri<-. That would make the result asymmetric.

Result:

vec2mat(c(-0.55, -0.48, 0.66, 0.47, -0.38, -0.46))

      [,1]  [,2]  [,3]  [,4]
[1,]  1.00 -0.55 -0.48  0.66
[2,] -0.55  1.00  0.47 -0.38
[3,] -0.48  0.47  1.00 -0.46
[4,]  0.66 -0.38 -0.46  1.00
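
To see why the EDIT note matters, here is a small sketch (the names `r` and `naive` are just for illustration) in which the upper triangle is filled directly with x[upper.tri(x)] <- value. R fills both triangles in column-major order, so the same vector lands in positions that do not mirror each other:

```r
r <- c(-0.55, -0.48, 0.66, 0.47, -0.38, -0.46)
naive <- diag(1, 4)
# Lower triangle, column by column: (2,1), (3,1), (4,1), (3,2), (4,2), (4,3).
naive[lower.tri(naive)] <- r
# Upper triangle, column by column: (1,2), (1,3), (2,3), (1,4), (2,4), (3,4).
naive[upper.tri(naive)] <- r
isSymmetric(naive)  # FALSE: naive[4, 1] is 0.66 but naive[1, 4] is 0.47
```

Transposing before and after the assignment, as upper.tri<- does above, makes the upper triangle an exact mirror of the lower one.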

In any programming language I think you would just do it with a couple of nested for loops like this:

Given:

Vector R; // I'll write R(3) with round brackets for the one-based 3rd element
          // (stored at R[2] in languages with zero-based arrays).
int N;

Matrix M=Matrix(N,N); // a new instance of your matrix object, or you can just use arrays.

int i,j,k;

k=1;
for(i=1;i<=N;i++)
{
   M(i,i)=1;
   for(j=i+1;j<=N;j++)
   { 
      M(i,j)=M(j,i)=R(k);
      k=k+1;
   }
}

Here I assumed that you know what N is and that you have basic objects like vectors and matrices available. (If not, they make a great sample problem for writing your first 'objects'.) Complex data structures like vectors, matrices, complex numbers and histograms all make ideal objects. The right way to think about object-oriented programming for scientific work is that you use objects to teach your compiler the high-level data types you want to use in your real work. The objects create a custom programming language ideally suited to your type of work. Anything that is generically useful should go into the object, since those objects will grow and evolve into your reusable code base.

The top-level code can then be a very powerful, clean and easy-to-read application, since so much of the detail work is done in the objects. Alternatively, for quick and dirty coding, the top-level code is where you put all the brittle hacks, since it's not intended to be reusable.

Once you have something like this debugged, you would just make a matrix constructor that takes the correlation vector and N as arguments and initializes the matrix for you.
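
In R, such a constructor might look like the sketch below. The function name `corr_from_vector` is hypothetical, and it assumes the vector lists the upper triangle row by row, which is the order the nested loops above walk through:

```r
# Hypothetical constructor: builds an N x N correlation matrix from a vector
# that holds the upper triangle row by row.
corr_from_vector <- function(r, N) {
  # One triangle of an N x N matrix has N * (N - 1) / 2 entries.
  stopifnot(length(r) == N * (N - 1) / 2)
  M <- diag(1, N)  # ones on the diagonal, zeros elsewhere
  k <- 1
  for (i in seq_len(N - 1)) {
    for (j in (i + 1):N) {
      M[i, j] <- M[j, i] <- r[k]  # set both mirrored cells at once
      k <- k + 1
    }
  }
  M
}

corr_from_vector(c(-.55, -.48, .66, .47, -.38, -.46), 4)
```

In everyday R code the vectorised answers are preferable to explicit loops; this version only mirrors the pseudocode one-to-one.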

Of course, if you are using some high-level graphical math program that has strong opinions about what you can and can't do with matrices and vectors, then you'd have to multiply the vector by N matrices to generate each of the column vectors of the final matrix (or else read the manual).

At the very least, you'll have to tell us what the math program is called... :)


 