
How to read a correlation matrix and form a scatterplot matrix in R

I have a correlation matrix in Excel, as follows:

dfA <- read.table(text=
      "beta1   beta2   beta3   beta4   beta5   beta6       X      X2      X3
beta1  1.0000 -0.2515 -0.2157  0.7209 -0.7205  0.4679  0.1025 -0.3606 -0.0356
beta2 -0.2515  1.0000  0.9831  0.1629 -0.1654 -0.5595 -0.0316  0.0946  0.0829
beta3 -0.2157  0.9831  1.0000  0.1529 -0.1559 -0.4976 -0.0266  0.0383  0.0738
beta4  0.7209  0.1629  0.1529  1.0000 -1.0000 -0.2753  0.0837 -0.1445  0.0080
beta5  0.4679 -0.5595 -0.4976 -0.2753  1.0000  0.2757  0.0354 -0.3149 -0.0596
beta6 -0.7205 -0.1654 -0.1559 -1.0000  0.2757  1.0000 -0.0837  0.1451 -0.0081
X      0.1025 -0.0316 -0.0266  0.0837 -0.0837  0.0354  1.0000  0.0278 -0.0875
X2    -0.3606  0.0946  0.0383 -0.1445  0.1451 -0.3149  0.0278  1.0000  0.2047
X3    -0.0356  0.0829  0.0738  0.0080 -0.0081 -0.0596 -0.0875  0.2047  1.0000", 
      header=TRUE) 

I have only the correlation matrix, not the original data from which it was computed, so I tried to read it into a matrix in R with this code:

 B <- as.matrix(dfA)

But when I try to form a scatterplot matrix with the following code:

library(corrplot)
corrplot(B, method="circle")

I receive this error:

Error in corrplot(B, method = "circle") : The matrix is not in [-1, 1]!

Kindly help me with this problem.

corrplot() Solution

This updates my first post, which used ggplot, based on user20650's comments above. user20650 shows that the likely source of the error was rounding mistakes that pushed some values outside the permissible [-1, 1] range, and that rounding solves the issue. I was able to produce a plot using corrplot() as well.
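To illustrate the rounding point, here is a minimal base R sketch. The 2x2 matrix M is hypothetical (it is not the data from the question) and simply assumes one value has drifted slightly past 1, as might happen in a spreadsheet export. Rounding to 4 decimals mirrors the precision shown in the question; clamping with pmin()/pmax() is an alternative I am adding, not something the comments proposed.

```r
# Hypothetical 2x2 example: a correlation that drifted just past 1
M <- matrix(c(1, 1.0000000002,
              1.0000000002, 1),
            nrow = 2, dimnames = list(c("a", "b"), c("a", "b")))

# Check whether any entry lies outside [-1, 1]
out_of_range <- any(abs(M) > 1)          # TRUE for this matrix

# Option 1: round to the 4 decimals shown in the question
M_round <- round(M, 4)
still_out <- any(abs(M_round) > 1)       # FALSE after rounding

# Option 2 (assumption, not from the thread): clamp values into [-1, 1]
M_clamp <- pmin(pmax(M, -1), 1)
```

Either M_round or M_clamp should then be acceptable input for corrplot(), since every entry is within the permitted range.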

At this point, running corrplot() yields the following plot:

corMat<-as.matrix(dfA)

library('corrplot')
corrplot(corMat, method='circle')


ggplot() Solution

You can also do this in ggplot2 with a few additional steps. I personally think it looks much better.

1) I get rid of the redundant information in the lower triangle of the matrix by setting it to NA.

corMat[lower.tri(corMat)]<-NA

> print(corMat)
      beta1   beta2   beta3  beta4   beta5   beta6       X      X2      X3
beta1     1 -0.2515 -0.2157 0.7209  0.4679 -0.7205  0.1025 -0.3606 -0.0356
beta2    NA  1.0000  0.9831 0.1629 -0.5595 -0.1654 -0.0316  0.0946  0.0829
beta3    NA      NA  1.0000 0.1529 -0.4976 -0.1559 -0.0266  0.0383  0.0738
beta4    NA      NA      NA 1.0000 -0.2753 -1.0000  0.0837 -0.1445  0.0080
beta5    NA      NA      NA     NA  1.0000  0.2757 -0.0837  0.1451 -0.0081
beta6    NA      NA      NA     NA      NA  1.0000  0.0354 -0.3149 -0.0596
X        NA      NA      NA     NA      NA      NA  1.0000  0.0278 -0.0875
X2       NA      NA      NA     NA      NA      NA      NA  1.0000  0.2047
X3       NA      NA      NA     NA      NA      NA      NA      NA  1.0000
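Dropping the lower triangle only loses nothing if the matrix is symmetric, so it may be worth checking that first. The sketch below uses a small hypothetical matrix S (not the data above) and base R's isSymmetric() and lower.tri().

```r
# Hypothetical symmetric 3x3 correlation matrix
S <- matrix(c( 1.0, 0.3, -0.2,
               0.3, 1.0,  0.5,
              -0.2, 0.5,  1.0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

# Confirm the lower triangle really is redundant before discarding it
symmetric <- isSymmetric(S)

# Blank out the entries strictly below the diagonal
U <- S
U[lower.tri(U)] <- NA
n_dropped <- sum(is.na(U))   # 3 entries for a 3x3 matrix
```

If isSymmetric() returns FALSE for your own matrix, the upper and lower triangles disagree and the labels or values are worth rechecking before plotting only one half.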

2) Then I use reshape2::melt() to transform the matrix into long form and create a formatted version of the values that shows two decimal places. This will be useful for the plot labels.

library(reshape2)
m<-melt(corMat)
m<-data.frame(m[!is.na(m[,3]),]) # get rid of the NA matrix entries
m$value_lab<-sprintf('%.2f',m$value)

Here's what the data looks like:

> head(m)
    Var1  Var2   value value_lab
1  beta1 beta1  1.0000      1.00
10 beta1 beta2 -0.2515     -0.25
11 beta2 beta2  1.0000      1.00
19 beta1 beta3 -0.2157     -0.22
20 beta2 beta3  0.9831      0.98
21 beta3 beta3  1.0000      1.00
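If reshape2 is not available, a similar long format can be sketched in base R with as.table() and as.data.frame(). This is an alternative I am suggesting, not part of the original answer, and the matrix U below is a hypothetical stand-in for corMat.

```r
# Hypothetical upper-triangular matrix standing in for corMat
U <- matrix(c(1,  0.3, -0.2,
              NA, 1,    0.5,
              NA, NA,   1),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

# as.table() + as.data.frame() gives one row per cell: Var1, Var2, value
long <- as.data.frame(as.table(U), responseName = "value")

# Drop the NA cells from the blanked-out lower triangle
long <- long[!is.na(long$value), ]

# Two-decimal label for plotting, as in the melt() version
long$value_lab <- sprintf('%.2f', long$value)
```

The column names Var1 and Var2 match what melt() produces for a matrix, so the ggplot call in the next step should work on this data frame as well.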

3) Finally, I feed this data into ggplot2, relying primarily on geom_tile() to draw the matrix and geom_text() to print the labels over each tile. You can dress this up more if you want.

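library(ggplot2)
# The color='blue' argument to ggplot() had no effect, so it is dropped here
ggplot(m, aes(Var2, Var1, fill = value, label = value_lab)) +
  geom_tile() +      # one tile per matrix cell, filled by correlation
  geom_text() +      # two-decimal label on top of each tile
  xlab('') +
  ylab('') +
  theme_minimal()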
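As one possible way to dress the plot up, the sketch below adds a diverging fill scale centred at 0 and square tiles. The data frame m_demo and the colour choices are hypothetical; scale_fill_gradient2() and coord_fixed() are standard ggplot2 functions, but this styling is my suggestion rather than the original answer's.

```r
library(ggplot2)

# Hypothetical long-format data in the same shape as `m` above
m_demo <- data.frame(
  Var1  = factor(c("a", "a", "b"), levels = c("a", "b")),
  Var2  = factor(c("a", "b", "b"), levels = c("a", "b")),
  value = c(1, -0.25, 1)
)
m_demo$value_lab <- sprintf('%.2f', m_demo$value)

p <- ggplot(m_demo, aes(Var2, Var1, fill = value, label = value_lab)) +
  geom_tile(color = 'white') +                       # thin white grid lines
  geom_text() +
  scale_fill_gradient2(low = 'steelblue', mid = 'white', high = 'firebrick',
                       midpoint = 0, limits = c(-1, 1)) +  # diverging scale
  coord_fixed() +                                    # square tiles
  labs(x = NULL, y = NULL, fill = 'r') +
  theme_minimal()
p
```

Fixing the limits at c(-1, 1) keeps the colour mapping comparable across different correlation matrices.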

