
How do I interpret the output of corrplot?

The corrplot package provides some neat plots, and its documentation comes with examples.

But I don't understand the output. I can see that if you have a matrix A_ij, you can plot it as an n by n arrangement of square tiles, where the color of tile ij corresponds to the value of A_ij. But some examples appear to have more dimensions:

[image]

Here we can guess that color shows the correlation coefficient, and that the orientation of the ellipse shows whether the correlation is negative or positive. What does the eccentricity show?

The documentation for method says:

the visualization method of correlation matrix to be used. Currently, it supports seven methods, named "circle" (default), "square", "ellipse", "number", "pie", "shade" and "color". See examples for details.

The areas of circles or squares show the absolute value of corresponding correlation coefficients. Method "pie" and "shade" came from Michael Friendly's job (with some adjustment about the shade added on), and "ellipse" came from DJ Murdoch and ED Chow's job, see in section References.

So we know that the area, for circles and squares, should show the coefficient. What about the other dimensions, and other methods?

There is only one dimension shown by the plot.

Michael Friendly, in Corrgrams: Exploratory displays for correlation matrices (the corrplot documentation confusingly refers to this as his "job"), says:

In the shaded row, each cell is shaded blue or red depending on the sign of the correlation, and with the intensity of color scaled 0–100% in proportion to the magnitude of the correlation. (Such scaled colors are easily computed using RGB coding from red, (1, 0, 0), through white (1, 1, 1), to blue (0, 0, 1). For simplicity, we ignore the non-linearities of color reproduction and perception, but note that these are easily accommodated in the color mapping function.) White diagonal lines are added so that the direction of the correlation may still be discerned in black and white. This bipolar scale of color was chosen to leave correlations near 0 empty (white), and to make positive and negative values of equal magnitude approximately equally intensely shaded. Gray scale and other color schemes are implemented in our software (Section 6), but not illustrated here.

The bar and circular symbols also use the same scaled colors, but fill an area proportional to the absolute value of the correlation. For the bars, negative values are filled from the bottom, positive values from the top. The circles are filled clockwise for positive values, anti-clockwise for negative values. The ellipses have their eccentricity parametrically scaled to the correlation value (Murdoch and Chow, 1996). Perceptually, they have the property of becoming visually less prominent as the magnitude of the correlation increases, in contrast to the other glyphs.

(emphasis mine)
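The color mapping described in the quote can be sketched in a few lines of base R. This only illustrates the linear red-white-blue interpolation from the quoted passage; it is not the code that corrplot or Friendly's software actually uses. The function name cor_to_rgb is made up for this example, and I am assuming that red is the negative end and blue the positive end.

```r
# Hypothetical helper: map a correlation r in [-1, 1] to an RGB triple,
# going linearly from red (1, 0, 0) through white (1, 1, 1) to blue (0, 0, 1).
# Assumption: negative values are red, positive values are blue.
cor_to_rgb <- function(r) {
  stopifnot(all(r >= -1 & r <= 1))
  a <- abs(r)                        # magnitude sets the color intensity
  red   <- ifelse(r < 0, 1, 1 - a)   # red channel stays full for negative r
  green <- 1 - a                     # green fades on both sides of zero
  blue  <- ifelse(r < 0, 1 - a, 1)   # blue channel stays full for r >= 0
  cbind(red = red, green = green, blue = blue)
}

rgb_vals <- cor_to_rgb(c(-1, -0.5, 0, 0.5, 1))
cols <- rgb(rgb_vals[, "red"], rgb_vals[, "green"], rgb_vals[, "blue"])
print(cols)
```

Correlations of equal magnitude and opposite sign get equally saturated colors, and values near zero come out close to white, which is the "bipolar scale" property the quote mentions.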

[image]

"Murdoch and Chow, 1996" is a publication describing the equation for drawing the ellipses (A Graphical Display of Large Correlation Matrices). The ellipses are apparently meant to be caricatures of bivariate normal distributions:

[image]

So in conclusion, the only dimension shown is always the correlation coefficient (or the value of A_ij, to use the question's terminology) itself. The multiple apparent dimensions are redundant.
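One way to convince yourself of the redundancy is to draw the same matrix with several of the methods listed in the documentation. A minimal sketch, assuming the corrplot package is installed; the built-in mtcars data is just a convenient choice for illustration.

```r
# Correlation matrix of a built-in dataset (chosen only for illustration).
M <- cor(mtcars)

# Draw the same matrix with three of the documented methods, one plot each.
# Skipped silently if corrplot is not installed.
if (requireNamespace("corrplot", quietly = TRUE)) {
  for (m in c("circle", "ellipse", "color")) {
    corrplot::corrplot(M, method = m)
  }
}
```

Every plot is drawn from the same input M, so whatever a glyph varies (area, shape, color), it can only be encoding the one number stored in each cell.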

I think the plot is quite self-explanatory. On the right-hand side you have the scale, which is colored from red (negative correlation) to blue (positive correlation). The color follows a gradient according to the strength of the correlation.

If the ellipse leans to the right, the correlation is positive; if it leans to the left, the correlation is negative.

The diffusion around a line (which denotes perfect correlation, for example mpg ~ mpg) creates an ellipse. The weaker the correlation, the more diffuse the ellipse. This is typically how a weakly correlated relationship looks in a scatterplot. I think these are caricatures, however.
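To see the scatterplot analogy, you can simulate data with a known correlation and plot it. This is a sketch in base R, not anything corrplot does internally; the helper name sim_bvn, the sample size and the seed are arbitrary choices of mine.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Hypothetical helper: draw n points from a standard bivariate normal
# distribution with population correlation rho.
sim_bvn <- function(n, rho) {
  x <- rnorm(n)
  z <- rnorm(n)                        # noise independent of x
  y <- rho * x + sqrt(1 - rho^2) * z   # y has unit variance, cor(x, y) = rho
  cbind(x = x, y = y)
}

rhos <- c(0.95, 0.5, 0)
sims <- lapply(rhos, function(r) sim_bvn(5000, r))
sample_cor <- sapply(sims, function(d) cor(d[, "x"], d[, "y"]))
print(round(sample_cor, 2))

# One scatterplot per rho: the cloud spreads out as rho drops.
old_par <- par(mfrow = c(1, 3))
for (i in seq_along(rhos)) {
  plot(sims[[i]], asp = 1, pch = ".", main = paste("rho =", rhos[i]))
}
par(old_par)
```

The point clouds go from a thin, tilted band at rho = 0.95 to a round blob at rho = 0, which is the same progression the ellipse glyphs caricature.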

Here is some code from the corrplot function responsible for drawing ellipses. I am not going to attempt to explain this (because it is a part of a larger system). I wanted to show that the logic is all there if you'd like to deep dive into it:

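if (method == "ellipse" & plotCI == "n") {
    # Outline of one ellipse for correlation rho: `length` points plus a
    # trailing NA column that separates it from the next polygon.
    ell.dat <- function(rho, length = 99) {
        k <- seq(0, 2 * pi, length = length)
        x <- cos(k + acos(rho)/2)/2
        y <- cos(k - acos(rho)/2)/2
        return(cbind(rbind(x, y), c(NA, NA)))
    }
    # One outline per correlation value in DAT.
    ELL.dat <- lapply(DAT, ell.dat)
    # Stack all outlines into a two-column (x, y) matrix, shrunk to 85%.
    ELL.dat2 <- 0.85 * matrix(unlist(ELL.dat), ncol = 2, byrow = TRUE)
    # Shift each outline (100 rows apiece) by its row of Pos.
    ELL.dat2 <- ELL.dat2 + Pos[rep(1:length(DAT), each = 100), ]
    polygon(ELL.dat2, border = col.border, col = col.fill)
}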
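If you want to play with the shape outside of corrplot, the parametrization in ell.dat above can be lifted into a standalone function. The sketch below reuses only the two cosine formulas from that code; the names ellipse_xy and diag_extent are mine. Working through the trigonometry, the half-length of the ellipse along the y = x diagonal should be sqrt(1 + rho)/2 and along the y = -x diagonal sqrt(1 - rho)/2, and the code measures both numerically.

```r
# Same parametrization as ell.dat above, without the NA separator column.
ellipse_xy <- function(rho, n = 99) {
  k <- seq(0, 2 * pi, length.out = n)
  cbind(x = cos(k + acos(rho) / 2) / 2,
        y = cos(k - acos(rho) / 2) / 2)
}

# Hypothetical helper: half-lengths of the ellipse along the two diagonals,
# measured numerically from the outline points.
diag_extent <- function(rho) {
  p <- ellipse_xy(rho, n = 2001)
  c(along_y_eq_x       = max(abs(p[, "x"] + p[, "y"])) / sqrt(2),
    along_y_eq_minus_x = max(abs(p[, "x"] - p[, "y"])) / sqrt(2))
}

for (rho in c(-0.9, 0, 0.5, 0.9)) {
  print(round(c(rho = rho, diag_extent(rho)), 3))
}

# Draw a few outlines on top of each other to see the shape change.
plot(NA, xlim = c(-0.5, 0.5), ylim = c(-0.5, 0.5), asp = 1,
     xlab = "x", ylab = "y")
for (rho in c(-0.9, 0, 0.9)) {
  polygon(ellipse_xy(rho))
}
```

At rho = 0 both half-lengths are equal and the outline is a circle; as rho approaches 1 or -1 one of them shrinks to zero and the ellipse collapses onto a diagonal. So both the tilt and the eccentricity are determined by rho alone.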
