
ggpairs plot with heatmap of correlation values

My question is twofold.

I have a ggpairs plot with the default upper = list(continuous = "cor") and I would like to colour the tiles by correlation value (exactly as ggcorr does).

I have this: [Image: ggpairs plot of daily flows]
I would like the correlation values in the plot above to be coloured like this: [Image: ggcorr heatmap of correlation values]

library(GGally)

sample_df <- data.frame(replicate(7,sample(0:5000,100)))
colnames(sample_df) <- c("KUM", "MHP", "WEB", "OSH", "JAC", "WSW", "gaugings")

ggpairs(sample_df, lower = list(continuous = "smooth"))  
ggcorr(sample_df, label = TRUE, label_round = 2)

I had a brief go at using upper = list(continuous = wrap(ggcorr)) but didn't have any luck and, given that both functions return plot calls, I don't think that's the right path?

I am aware that I could build this in ggplot (e.g. Sandy Muspratt's solution), but given that the GGally package already has the functionality I am looking for, I thought I might be overlooking something.


More broadly, I would like to know how, or whether, we can access the correlation values. A simpler option may be to colour the labels rather than the tiles (i.e. this question, using colour rather than size), but I need a variable to map to colour...

Being able to access the correlation values for use in other plots would be handy, although I suppose I could just recalculate them myself.

Thank you!

A possible solution is to get the list of colors from the ggcorr correlation matrix plot and to set these colors as the background of the upper tiles in the ggpairs matrix of plots.

library(GGally)   
library(mvtnorm)
# Generate data
set.seed(1)
n <- 100
p <- 7
A <- matrix(runif(p^2)*2-1, ncol=p) 
Sigma <- cov2cor(t(A) %*% A)
sample_df <- data.frame(rmvnorm(n, mean=rep(0,p), sigma=Sigma))
colnames(sample_df) <- c("KUM", "MHP", "WEB", "OSH", "JAC", "WSW", "gaugings")

# Matrix of plots
p1 <- ggpairs(sample_df, lower = list(continuous = "smooth"))  
# Correlation matrix plot
p2 <- ggcorr(sample_df, label = TRUE, label_round = 2)

The correlation matrix plot is:

[Image: ggcorr correlation matrix plot]

# Get list of colors from the correlation matrix plot
library(ggplot2)
g2 <- ggplotGrob(p2)
colors <- g2$grobs[[6]]$children[[3]]$gp$fill

# Change the background color of the tiles in the upper triangular matrix of plots
idx <- 1
for (k1 in 1:(p-1)) {
  for (k2 in (k1+1):p) {
    plt <- getPlot(p1, k1, k2) +
      theme(panel.background = element_rect(fill = colors[idx], color = "white"),
            panel.grid.major = element_line(color = colors[idx]))
    p1 <- putPlot(p1, plt, k1, k2)
    idx <- idx + 1
  }
}
print(p1)

[Image: ggpairs plot with the upper tiles coloured by correlation]
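
The grob index used above (g2$grobs[[6]]$children[[3]]) depends on the internal layout of the ggcorr plot, so it may need adjusting for other GGally or ggplot2 versions. A minimal sketch of an alternative, assuming you are happy to choose your own palette rather than reuse ggcorr's exact fills: compute the correlations directly and map them to colours in the same row-by-row order in which the loop visits the upper tiles. The helper name upper_tri_colors and the blue/white/red palette are illustrative assumptions, not part of GGally.

```r
# Hypothetical helper (not part of GGally): one fill colour per
# upper-triangle cell, ordered row by row: (1,2), (1,3), ..., (1,p), (2,3), ...
upper_tri_colors <- function(df, palette = c("blue", "white", "red"), n = 101) {
  cm <- cor(df, use = "pairwise.complete.obs")
  # t(cm)[lower.tri(cm)] walks the upper triangle of cm row by row
  r <- t(cm)[lower.tri(cm)]
  ramp <- colorRampPalette(palette)(n)
  # rescale r from [-1, 1] to an index in 1..n
  idx <- round((r + 1) / 2 * (n - 1)) + 1
  ramp[idx]
}

set.seed(1)
demo_df <- data.frame(a = rnorm(50), b = rnorm(50), c = rnorm(50))
cols <- upper_tri_colors(demo_df)
length(cols)  # one colour per pair of variables
```

With this, colors <- upper_tri_colors(sample_df) could stand in for the grob lookup, and the loop itself stays unchanged.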

You can map a background colour to the cell by writing a quick custom function that can be passed directly to ggpairs. This involves calculating the correlation between each pair of variables and then matching it to a user-specified colour range.

my_fn <- function(data, mapping, method = "p", use = "pairwise", ...) {

  # grab data
  x <- eval_data_col(data, mapping$x)
  y <- eval_data_col(data, mapping$y)

  # calculate correlation
  corr <- cor(x, y, method = method, use = use)

  # calculate colour based on correlation value
  # Here I have set a correlation of minus one to blue,
  # zero to white, and one to red
  # Change this to suit: possibly extend to add as an argument of `my_fn`
  colFn <- colorRampPalette(c("blue", "white", "red"), interpolate = "spline")
  fill <- colFn(100)[findInterval(corr, seq(-1, 1, length = 100))]

  ggally_cor(data = data, mapping = mapping, ...) +
    theme_void() +
    theme(panel.background = element_rect(fill = fill))
}
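
To see the colour-mapping step in isolation, here is a small base-R sketch of the same findInterval lookup used inside my_fn. The helper name corr_to_fill is made up for illustration, and it uses the default linear interpolation of colorRampPalette rather than the spline option, purely for simplicity.

```r
# Hypothetical standalone version of the colour lookup in my_fn
corr_to_fill <- function(corr, n = 100) {
  pal <- colorRampPalette(c("blue", "white", "red"))(n)
  # findInterval returns the position of the last break point that corr
  # has reached: -1 maps to 1 (blue end), +1 maps to n (red end)
  pal[findInterval(corr, seq(-1, 1, length.out = n))]
}

corr_to_fill(c(-1, 1))  # the blue end and the red end of the palette
corr_to_fill(0.8)       # a reddish shade
```

Because the lookup is vectorised, the same helper could also be used to colour labels or tiles in a hand-built ggplot.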

Using the data in Marco's answer:

library(GGally)    # version: ‘1.4.0’

p1 <- ggpairs(sample_df,
              upper = list(continuous = my_fn),
              lower = list(continuous = "smooth"))
p1  # print the plot

Which gives:

[Image: ggpairs plot with the upper tiles coloured by correlation]


A follow-up question, Change axis labels of a modified ggpairs plot (heatmap of correlation), noted that updating the theme after the plot is built removes the panel.background colours. This can be fixed by removing theme_void and removing the grid lines within the theme, i.e. change the relevant bit to the following (note that this fix is not required for ggplot2 v3.3.0):

ggally_cor(data = data, mapping = mapping, ...) + 
           theme(panel.background = element_rect(fill=fill, colour=NA),
                 panel.grid.major = element_blank()) 
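
On the broader question of reusing the correlation values in other plots, recomputing them with cor() is probably the simplest route. The sketch below reshapes the correlation matrix into a long table with one row per pair of variables; the helper name cor_long and its column names are assumptions chosen for illustration.

```r
# Hypothetical helper: long table of pairwise correlations,
# one row per unordered pair of columns (diagonal excluded)
cor_long <- function(df, method = "pearson") {
  cm <- cor(df, method = method, use = "pairwise.complete.obs")
  keep <- upper.tri(cm)  # each pair once, no diagonal
  data.frame(
    var1 = rownames(cm)[row(cm)[keep]],
    var2 = colnames(cm)[col(cm)[keep]],
    r    = cm[keep],
    stringsAsFactors = FALSE
  )
}

set.seed(1)
demo_df <- data.frame(a = rnorm(50), b = rnorm(50), c = rnorm(50))
cor_long(demo_df)
```

The resulting data frame has an r column that can be mapped to colour or fill in an ordinary ggplot call.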
