Why does melt (reshape2) substitute column names by column order numbers?

I have a 74x74 pairwise distance matrix of SNP differences in which the first column and the first row hold the isolate numbers, like this:

        26482RR 25638   26230   25689RR 25954
26482RR 0       8       0       6       0
25638   8       0       8       14      8
26230   0       8       0       6       0
25689RR 6       14      6       0       6
25954   0       8       0       6       0

M = structure(c(0L, 8L, 0L, 6L, 0L, 8L, 0L, 8L, 14L, 8L, 0L, 8L, 
0L, 6L, 0L, 6L, 14L, 6L, 0L, 6L, 0L, 8L, 0L, 6L, 0L), .Dim = c(5L, 
5L), .Dimnames = list(c("26482RR", "25638", "26230", "25689RR", 
"25954"), c("26482RR", "25638", "26230", "25689RR", "25954")))

I would like to convert this matrix into a table of SNP differences for each pair of isolates, like so:

Col      Row    SNP differences
26482RR  25638   8
26482RR  26230   0
26482RR  25689RR 6
26482RR  25954   0
25638    26230   8
25638    25689RR 14
25638    25954   8
...

in order to plot this data and correlate it with other matrices. I am a beginner in R, so after a bit of searching I decided to apply the following code:

st1076 <- read.csv("st1076.csv", header=TRUE, sep=";")
m1 <- as.matrix(st1076)
m1 <- m1[upper.tri(m1)] <- NA
m1_melted <- reshape2:::melt.matrix(m1, na.rm = TRUE)
colnames(m1_melted) <- c("Col","Row","SNP differences")

However, with this code "Col" contains each isolate's position in order of occurrence (1, 2, 3, 4...) and not its respective isolate number:

Col     Row      SNP differences
2       X26482RR  8
3       X26482RR  0
4       X26482RR  6

From what I saw in other related questions, using melt.matrix should solve this problem, but it didn't work for me.

Can anyone help me understand why this happened? Do you have any suggestions on how to overcome it?

I think your code was correct except for the step that reads the CSV. Because read.csv returns a data frame, some processing is required to get a matrix that keeps the isolate numbers as row and column names:

DF = read.csv("st1076.csv", sep=";", row.names=1, check.names=FALSE)
M = as.matrix(DF)
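
To see what each argument changes, here is a minimal sketch on a three-isolate excerpt of the question's matrix. The inline csv_text string is an assumption standing in for st1076.csv (empty top-left header cell, ";" as separator); the real file may be laid out differently.

```r
# Hypothetical stand-in for st1076.csv (assumed layout: empty top-left cell, ";" separator)
csv_text <- ";26482RR;25638;26230
26482RR;0;8;0
25638;8;0;8
26230;0;8;0"

# Defaults: the ID column is read as ordinary data and headers go through make.names()
df_default <- read.csv(text = csv_text, sep = ";")
m_default <- as.matrix(df_default)
colnames(m_default)   # "X" "X26482RR" "X25638" "X26230"
rownames(m_default)   # NULL, so only positions 1, 2, 3 are available for the rows

# row.names = 1 takes the first column as row names;
# check.names = FALSE keeps the headers exactly as written
df_fixed <- read.csv(text = csv_text, sep = ";", row.names = 1, check.names = FALSE)
m_fixed <- as.matrix(df_fixed)
dimnames(m_fixed)     # both dimensions: "26482RR" "25638" "26230"
```

With the defaults the matrix has no row names, which would explain the 1, 2, 3... in "Col", and the "X" prefix in "Row" comes from the header names being made syntactically valid.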

res <- reshape2::melt(replace(M, upper.tri(M), NA), 
  varnames = c("Col", "Row"), 
  value.name = "SNP differences", 
  na.rm = TRUE
)

head(res)
      Col     Row SNP differences
1 26482RR 26482RR               0
2   25638 26482RR               8
3   26230 26482RR               0
4 25689RR 26482RR               6
5   25954 26482RR               0
6   25692 26482RR               2
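
Note that this output keeps the diagonal (each isolate paired with itself, distance 0), because upper.tri() leaves the diagonal out of the masked region by default. If self-pairs are unwanted, one option is to index the strict lower triangle directly. The following is a base-R sketch, not the reshape2 approach used above; it runs on the five-isolate M from the question, and the name pairs is made up for illustration.

```r
# Five-isolate example matrix from the question
M <- structure(c(0L, 8L, 0L, 6L, 0L, 8L, 0L, 8L, 14L, 8L, 0L, 8L,
                 0L, 6L, 0L, 6L, 14L, 6L, 0L, 6L, 0L, 8L, 0L, 6L, 0L),
               .Dim = c(5L, 5L),
               .Dimnames = list(c("26482RR", "25638", "26230", "25689RR", "25954"),
                                c("26482RR", "25638", "26230", "25689RR", "25954")))

# Row/column positions of the cells strictly below the diagonal
idx <- which(lower.tri(M), arr.ind = TRUE)

# Look up the isolate names for each position and pull out the matching distances
pairs <- data.frame(
  Col = colnames(M)[idx[, "col"]],
  Row = rownames(M)[idx[, "row"]],
  `SNP differences` = M[idx],
  check.names = FALSE,        # keep the space in "SNP differences"
  stringsAsFactors = FALSE
)

head(pairs, 7)
```

For n isolates this yields n * (n - 1) / 2 rows, so 10 rows here, in the same order as the desired table in the question.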

For reference, I started with this thread https://stat.ethz.ch/pipermail/r-help/2010-May/237835.html and then consulted the help file ?read.csv
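
One line in the question's code may also be worth a second look: m1 <- m1[upper.tri(m1)] <- NA chains two assignments. R evaluates these from right to left, so m1 ends up holding the assigned value NA rather than the masked matrix. That is one reason to prefer replace(), as in the melt call above, or a two-step assignment. A small sketch on a toy 3x3 matrix (the names m, chained and two_step are made up for illustration):

```r
m <- matrix(1:9, nrow = 3)

# Chained form: the inner assignment masks the matrix, but its value (NA)
# is then assigned over the whole object
chained <- m
chained <- chained[upper.tri(chained)] <- NA
chained               # a single NA, no longer a matrix

# Two-step form: mask in place and keep the matrix
two_step <- m
two_step[upper.tri(two_step)] <- NA

# replace() returns the masked copy in one expression
masked <- replace(m, upper.tri(m), NA)
identical(two_step, masked)
```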
