
How to add the Spearman correlation p-value along with the correlation coefficient to ggpairs?

I am constructing a ggpairs figure in R using the following code.

df is a data frame containing 6 continuous variables and one Group variable (the first column).

library(ggplot2)
library(GGally)

ggpairs(df[, -1], columns = 1:ncol(df[, -1]),
        mapping = ggplot2::aes(colour = df$Group), legends = TRUE, axisLabels = "show",
        upper = list(continuous = wrap("cor", method = "spearman", size = 2.5, hjust = 0.7))) +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
        axis.line = element_line(colour = "black"))

I am trying to add the p-value of the Spearman correlation to the upper panels of the resulting figure, i.e. appended to the Spearman correlation coefficient.

Generally, the p-value is computed using cor.test with method = "spearman".
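
For reference, a minimal sketch of pulling both numbers out of cor.test for a single pair of variables. The built-in mtcars data is used purely as a stand-in for df, and exact = FALSE is an assumption added here so that tied values do not trigger a warning.

```r
# Spearman correlation for one pair of columns; mtcars stands in for df
res <- cor.test(mtcars$mpg, mtcars$wt, method = "spearman", exact = FALSE)

rho <- unname(res$estimate)  # correlation coefficient
p   <- res$p.value           # p-value for the null hypothesis rho = 0

cat("rho =", round(rho, 2), "\n")
cat("p   =", format.pval(p, digits = 2), "\n")
```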

I am also aware of a Stack Overflow post discussing a similar query, but I need this for ggpairs, for which that solution does not work. Also, the previous query has not been solved yet:

How to add p values for Spearman correlation coefficients plotted using pairs in R

I have a feeling this is more than what you expected. You need to define a custom function like ggally_cor. First, we have a function that formats the Spearman correlation and its p-value for two variables:

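printVar <- function(x, y) {
  # keep only the coefficient and the p-value from the test result
  vals <- cor.test(x, y, method = "spearman")[c("estimate", "p.value")]
  names(vals) <- c("rho", "p")
  # one "name value" pair per line, 2 significant digits each
  paste(names(vals), signif(unlist(vals), 2), collapse = "\n")
}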

Then we define a function that takes in the data for each pair and calculates (1) the overall correlation and (2) the correlation by group, then passes them into a ggplot that prints only this text:

library(ggplot2)
library(GGally)  # provides ggpairs() and eval_data_col()

my_fn <- function(data, mapping, ...) {
  # takes in x and y for each panel
  xData <- eval_data_col(data, mapping$x)
  yData <- eval_data_col(data, mapping$y)
  colorData <- eval_data_col(data, mapping$colour)

  # if you have colors, split according to color group and calculate cor
  byGroup <- by(data.frame(xData, yData), colorData,
                function(i) printVar(i[, 1], i[, 2]))
  byGroup <- data.frame(col = names(byGroup), label = as.character(byGroup))
  byGroup$x <- 0.5
  byGroup$y <- seq(0.8 - 0.3, 0.2, length.out = nrow(byGroup))

  # main correlation
  mainCor <- printVar(xData, yData)

  # text-only panel: overall result on top, one coloured label per group below
  p <- ggplot(data = data, mapping = mapping) +
    annotate(x = 0.5, y = 0.8, label = mainCor, geom = "text", size = 3) +
    geom_text(data = byGroup, inherit.aes = FALSE,
              aes(x = x, y = y, col = col, label = label), size = 3) +
    theme_void() + ylim(c(0, 1))
  p
}

Now I use mtcars; the first column is a random Group:

df <- data.frame(
  Group = sample(LETTERS[1:2], nrow(mtcars), replace = TRUE),
  mtcars[, 1:6]
)

And plot:

ggpairs(df[, -1], columns = 1:ncol(df[, -1]),
        mapping = ggplot2::aes(colour = df$Group),
        axisLabels = "show",
        upper = list(continuous = my_fn)) +
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        axis.line = element_line(colour = "black"))


I think for your own plot the spacing of the text might not be optimal, but that is just a matter of tweaking my_fn.

Works well, but rounding with signif is not ideal for the p-value. signif keeps two significant digits, so a very small p-value is printed as is, in scientific notation (e.g. 1.2e-05). The round function is no better: rounded to two decimal places, any p-value below 0.005 is printed as 0, so very different levels of significance become indistinguishable.

So a small modification of the code takes care of it:
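
A short sketch of the rounding behaviour described above. The two p-values are hypothetical numbers picked only for illustration.

```r
p_small <- 1.234e-05  # hypothetical p-value, chosen only for illustration
p_mid   <- 0.004      # hypothetical p-value just below 0.005

print(signif(p_small, 2))  # 2 significant digits, shown in scientific notation
print(round(p_small, 2))   # rounded to 2 decimal places: 0
print(round(p_mid, 2))     # also 0, because 0.004 is below 0.005
```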

printVar <- function(x, y) {
  vals <- cor.test(x, y, method = "spearman")[c("estimate", "p.value")]

  # round rho to 2 decimals; report small p-values as thresholds
  vals[[1]] <- round(vals[[1]], 2)
  vals[[2]] <- ifelse(test = vals[[2]] < 0.001, "<0.001",
                      ifelse(test = vals[[2]] < 0.01, "<0.01", round(vals[[2]], 2)))

  names(vals) <- c("rho", "p")
  paste(names(vals), unlist(vals), collapse = "\n")
}
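
As an alternative sketch (not part of the code above), base R's format.pval can produce the "<0.001" style through its eps argument. The name printVar2 is hypothetical, exact = FALSE is an added assumption to avoid the ties warning, and this version has no separate "<0.01" tier.

```r
# Hypothetical variant of printVar that delegates p-value formatting to format.pval
printVar2 <- function(x, y) {
  res <- cor.test(x, y, method = "spearman", exact = FALSE)
  rho <- round(unname(res$estimate), 2)
  # p-values below eps are reported as "<0.001"
  p <- format.pval(res$p.value, digits = 2, eps = 0.001)
  paste0("rho ", rho, "\np ", p)
}

# mtcars is used only as example data
cat(printVar2(mtcars$mpg, mtcars$wt), "\n")
```

It can be dropped into my_fn in place of printVar, since it takes the same two arguments and returns a single string.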

Secondly, when I run the code as is, it gives an error that LAB is not found.

LAB is a character string required for the label.

You can either supply a character string, or just pass

LAB=c()

Not sure if it's because you have groups or are using a different version of the package (I'm using GGally_2.1.1), but the following code works perfectly for me:

df %>% ggpairs(upper = list(continuous = wrap("cor", method = "spearman")))
