Plotting a correlation matrix in R

I don't have much knowledge of R. I have a .txt file containing a correlation matrix that was previously computed from long records.

The text in the file looks something like this:

"15075060" "15085030" "15085040"
"15075060" 1 0.441716695007761 0.433807683928689
"15085030" 0.441716695007761 1 0.477591938543259
"15085040" 0.433807683928689 0.477591938543259 1

This is a representative example because the real matrix is much bigger. The numbers in quotation marks are the sources that were correlated. I read the data using read.table to create a data frame (called matto) and then convert it into a matrix (called mattox) with:

mattox =matrix(as.numeric(unlist(matto)),nrow=nrow(matto))

and I obtain a matrix like this:

>mattox
          [,1]      [,2]      [,3]
[1,] 1.0000000 0.4417167 0.4338077
[2,] 0.4417167 1.0000000 0.4775919
[3,] 0.4338077 0.4775919 1.0000000

As option 2, if I convert it into a matrix using:

as.matrix(sapply(matto, as.numeric))

then I obtain a matrix like this:

> matto
         X.15075060 X.15085030 X.15085040
15075060  1.0000000  0.4417167  0.4338077
15085030  0.4417167  1.0000000  0.4775919
15085040  0.4338077  0.4775919  1.0000000

although I don't know why I get those X prefixes before the numbers in the column headers.

When I try to plot these correlations using the function corrplot, I obtain something like this for the matrix mattox:

corrplot(mattox, type="upper")

[image]

But the problem is that I don't see the column and row names here (the numbers in quotation marks from the .txt file). And for the other matrix (matto) I get an error when I try to use corrplot; the error says:

Error in matrix(if (is.null(value)) logical() else value, nrow = nr, dimnames = list(rn,  : 
  length of 'dimnames' [2] not equal to array extent

I would like to obtain a graphic just like the one I obtained, but with the names of the columns and rows instead of the numbers 1, 2, 3... something like the next graph, which I found online for another case:

[image]

How can I fix this?

You can skip those steps and just coerce the data to a matrix when you read it; the values should already be numeric. read.table prepends an X to the names because, with the default check.names = TRUE, it makes column names syntactically valid, and a valid R name cannot start with a digit. You can specify colnames yourself, though.

df <- as.matrix(read.table("location/of/text.txt", row.names = 1))
colnames(df) <- c("15075060", "15085030", "15085040")

str(df) # check the structure, it's numeric so we're good
num [1:3, 1:3] 1 0.442 0.434 0.442 1 ...
- attr(*, "dimnames")=List of 2
 ..$ : chr [1:3] "15075060" "15085030" "15085040"
 ..$ : chr [1:3] "15075060" "15085030" "15085040"

corrplot(df, type = "upper")
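
As an alternative to resetting colnames by hand, here is a minimal sketch that asks read.table to leave the header names untouched via check.names = FALSE. The sample data is pasted inline through the text argument so the snippet runs on its own; the variable name cor_mat is made up for illustration, and the plot is only drawn if the corrplot package happens to be installed.

```r
# Inline copy of the sample file from the question, so the sketch is self-contained.
# With a real file, pass its path as the first argument instead of text = txt.
txt <- '"15075060" "15085030" "15085040"
"15075060" 1 0.441716695007761 0.433807683928689
"15085030" 0.441716695007761 1 0.477591938543259
"15085040" 0.433807683928689 0.477591938543259 1'

# The header line has one field fewer than the data rows, so read.table
# treats the first line as a header and the first column as row names.
# check.names = FALSE keeps the column names exactly as they appear in the file.
cor_mat <- as.matrix(read.table(text = txt, check.names = FALSE))

# Row and column names should now both carry the quoted labels from the file.
print(dimnames(cor_mat))

# Plot only if the corrplot package is available.
if (requireNamespace("corrplot", quietly = TRUE)) {
  corrplot::corrplot(cor_mat, type = "upper")
}
```

If you already have the unnamed matrix mattox from the question, assigning dimnames(mattox) <- list(rownames(matto), rownames(matto)) should give the same labels, assuming the row names of matto hold the source labels as in the printed output above.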
