How to apply a function for Spearman's rank correlation coefficient in R?

I want to write code that applies a function calculating Spearman's rank correlation to every combination of columns in a dataset. I have the following dataset:

library(openxlsx)
data <-read.xlsx("e:/LINGUISTICS/mydata.xlsx", 1);

A    B    C    D
go   see  get  eat
see  get  eat  go
get  go   go   get
eat  eat  see  see

The call cor(rank(x), rank(y), method = "spearman") measures the correlation between only two columns, e.g. between A and B:

cor(rank(data$A), rank(data$B), method = "spearman")

But I need to calculate correlation between all possible combinations of columns (AB, AC, AD, BC, BD, CD). I wrote the following function for that:

wert <- function(x, y) { cor(rank(x), rank(y), method = "spearman") }

I do not know how to apply my function to all possible combinations of columns (AB, AC, AD, BC, BD, CD) so that I get all results automatically, because my real data has many more columns. I would also like the output as a matrix of correlation scores, e.g. like the following table:

    A     B     C     D
A   1     0.3   0.4   0.8
B   0.3   1     0.6   0.5
C   0.4   0.6   1     0.1
D   0.8   0.5   0.1   1

Can somebody help me?

You do not need rank. cor already calculates the Spearman rank correlation with method = "spearman". If you want the correlation between all columns of a data.frame, just pass the data.frame to cor, i.e. cor(data, method = "spearman"). You should study help("cor").

If you want to do this manually, use the combn function.
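As a minimal sketch of the combn route: the block below types in the toy data from the question by hand (an assumption, instead of reading the Excel file) and reuses the asker's wert function. The names pairs and scores are only illustrative.

```r
# Toy data from the question, typed in by hand instead of read from Excel.
data <- data.frame(
  A = c("go", "see", "get", "eat"),
  B = c("see", "get", "go", "eat"),
  C = c("get", "eat", "go", "see"),
  D = c("eat", "go", "get", "see"),
  stringsAsFactors = FALSE
)

# The asker's pairwise function; rank() orders strings by collation order.
wert <- function(x, y) cor(rank(x), rank(y), method = "spearman")

# Every unordered pair of column names, one pair per column of the result.
pairs <- combn(names(data), 2)

# Apply wert() to each pair and label each score with its two column names.
scores <- apply(pairs, 2, function(p) wert(data[[p[1]]], data[[p[2]]]))
names(scores) <- apply(pairs, 2, paste, collapse = "")

print(scores)
```

This yields one score per pair (AB, AC, AD, BC, BD, CD) as a named vector rather than a full matrix, which avoids computing each pair twice.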

PS: Your additional challenge is that you actually have factor variables. A rank for an unordered factor is a strange concept, but R just uses collation order here. Since cor rightly expects numeric input, you should do data[] <- lapply(data, as.integer) first.
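A small sketch of the whole-data-frame route, again with the toy data typed in by hand. One assumption differs from the line above: the columns are converted with as.integer(factor(col)) rather than as.integer alone, because as.integer applied directly to character strings returns NA, and going through factor() works whether the columns arrived as character or as factor. The name rho is only illustrative.

```r
# Toy data from the question, typed in by hand instead of read from Excel.
data <- data.frame(
  A = c("go", "see", "get", "eat"),
  B = c("see", "get", "go", "eat"),
  C = c("get", "eat", "go", "see"),
  D = c("eat", "go", "get", "see"),
  stringsAsFactors = FALSE
)

# Map each word to an integer code. factor() sorts its levels, so the
# codes follow collation order, the same order rank() uses on strings.
data[] <- lapply(data, function(col) as.integer(factor(col)))

# cor() on a numeric data frame returns the full correlation matrix.
rho <- cor(data, method = "spearman")

print(round(rho, 2))
```

The result is a symmetric matrix with 1 on the diagonal and the column names as row and column labels, which is the table layout asked for in the question.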

I think you can just make a function (pairedcolumns) that will then apply your function (spearman) to every pair of columns in the data frame you feed it.

# This function works on a data frame (x), applying whichever other
# function (fun) you select to every possible pair of columns.
pairedcolumns <- function(x, fun)
{
  n <- ncol(x)  # find out how many columns are in the data frame

  foo <- matrix(0, n, n)  # n x n matrix to hold the results
  for (i in seq_len(n))
  {
    for (j in seq_len(n))
    {
      foo[i, j] <- fun(x[, i], x[, j])
    }
  }
  colnames(foo) <- rownames(foo) <- colnames(x)
  return(foo)
}

# Pass the function object itself, e.g. wert from the question;
# the bare keyword `function` is not valid here.
results <- pairedcolumns(yourdataframe[, 2:8], wert)
