
Create a distance matrix in R (without using dist())

I need to build a distance matrix from a matrix, holding the distance between each pair of columns.

I know there is a function called dist(), but I cannot use it because I need distance functions that it does not provide.

I was thinking about using apply, but I don't know how to write it.

The loop I have created is:

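dista <- function(A, distance) {
  n <- ncol(A)
  dist_matrix <- matrix(0, n, n)
  if (n < 2) return(dist_matrix)  # no column pairs to compare
  for (i in 1:(n - 1)) {
    for (j in (i + 1):n) {
      if (distance == 'cosine') {
        # cosine distance: 1 - (x . y) / (|x| * |y|)
        dist_matrix[j, i] <- 1 - sum(A[, i] * A[, j]) /
          (sqrt(sum(A[, i]^2)) * sqrt(sum(A[, j]^2)))
      }
    }
  }
  dist_matrix  # only the lower triangle is filled
}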

Assuming you have some data frame like this:

df <- data.frame(x = rnorm(10, 5, 1), y = rnorm(10))

You can use apply as follows:

apply(df, 1, dist)

To use a custom distance function, you can replace the call to dist above with:

apply(df, 1, my_own_dist)

Of course, this loops through each row of data, so it will still be slower than a matrix-based computation. Knowing what your distance function actually does might help folks find an even more efficient way to approach the problem.
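As an illustration of what a matrix-based computation could look like, here is a minimal sketch for the cosine distance from the question, assuming the standard definition 1 - (x . y) / (|x| * |y|). The name cosine_dist_matrix is hypothetical and not part of the original post; crossprod() and outer() are base R.

```r
# Hypothetical helper (not from the original post): cosine distance
# between every pair of columns of A, computed without explicit loops.
cosine_dist_matrix <- function(A) {
  A <- as.matrix(A)
  dots <- crossprod(A)          # dots[i, j] = sum(A[, i] * A[, j])
  norms <- sqrt(diag(dots))     # Euclidean norm of each column
  1 - dots / outer(norms, norms)
}

# Small example matrix with three named columns
A <- cbind(a = c(1, 0), b = c(0, 1), c = c(1, 1))
D <- cosine_dist_matrix(A)
```

This only works because the cosine distance can be written in terms of dot products; an arbitrary distance function may not vectorize this way.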

EDIT based on comment below....

If you are trying to compute pair-wise distance between every pair of columns in your original matrix A, you can try something like this:

apply(combn(1:ncol(A), 2), 2, function(x) my_dist_function(A[, x]))

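This first generates all unique column pairs with combn(), then runs through them one at a time. Note that my_dist_function receives a two-column matrix, A[, x], for each pair.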
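The combn() call returns a plain vector of distances, one per pair. To arrange those values into a full symmetric matrix, something like the following sketch might work. The names pairwise_dist, dist_fun and manhattan are hypothetical, and here the distance function is assumed to take two vectors rather than a two-column matrix.

```r
# Hypothetical helper: build a symmetric distance matrix between the
# columns of A using any two-argument distance function dist_fun(x, y).
pairwise_dist <- function(A, dist_fun) {
  A <- as.matrix(A)
  n <- ncol(A)
  D <- matrix(0, n, n, dimnames = list(colnames(A), colnames(A)))
  if (n < 2) return(D)                  # no pairs to compare
  pairs <- combn(n, 2)                  # each column is one pair (i, j), i < j
  d <- apply(pairs, 2, function(p) dist_fun(A[, p[1]], A[, p[2]]))
  idx <- t(pairs)                       # one (row, col) index per matrix row
  D[idx] <- d                           # fill the upper triangle
  D[idx[, 2:1, drop = FALSE]] <- d      # mirror into the lower triangle
  D
}

# Example with an assumed custom distance (Manhattan)
manhattan <- function(x, y) sum(abs(x - y))
A <- cbind(c(1, 2), c(3, 5), c(0, 0))
D <- pairwise_dist(A, manhattan)
```

Indexing D with a two-column matrix assigns all pairs at once, so the only loop left is the apply() over column pairs.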

