
Mapping content of one matrix onto structure of another matrix

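I have two matrices that come from the same dataset, but each holds a different amount of data. I want to create a dataset that is a copy of x in terms of column names and row names, but that contains the data values from y. Where no data is available, NA should be used as the value for that coordinate.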

Not all row names in x appear in y, and vice versa. The same is true of the column names.

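For the example input data given below, the part of each row name in x that corresponds to a row name in y runs from the start of the name up to the | (I want to keep everything after the | for other mappings).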

What is the most efficient way to do this?

Desired output

z = structure(c(NA, 1, NA, NA, NA, NA, NA, NA, 0, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, NA, NA, NA, NA, NA, 
NA, 0, NA, NA, NA, 0, NA, NA, NA, NA, NA, NA, 0, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), .Dim = c(11L, 5L), .Dimnames = list(
c("AACSL|729522", "AACS|65985", "AADACL2|344752", "AADACL3|126767", 
"AADACL4|343066", "AADAC|13", "AADAT|51166", "AAGAB|79719", 
"AAK1|22848", "AAK12|14", "AANAT|15"), c("S18", "S20", "S45", 
"S95", "S100")))

Example input

x = structure(c(0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 
1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 
0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0), .Dim = c(11L, 
5L), .Dimnames = list(c("AACSL|729522", "AACS|65985", "AADACL2|344752", 
"AADACL3|126767", "AADACL4|343066", "AADAC|13", "AADAT|51166", 
"AAGAB|79719", "AAK1|22848", "AAK12|14", "AANAT|15"), c("S18", 
"S20", "S45", "S95", "S100")))

y = structure(c(0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 
0, 0, 0, 0, 0, 0), .Dim = c(11L, 4L), .Dimnames = list(c("A1BG", 
"A1CF", "A2ML1", "A4GALT", "AACS", "AAK1", "AARD", "AARS2", "AASDHPPT", 
"AASS", "BAACS"), c("S18", "S10", "S45", "S95")))

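I think there may be a small problem with the example you provided: I cannot see how z is derived from the x and y above. See the following code: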

# Extract the letter code (the part before the "|") from each row name of x,
# then check which of those codes also occur as row names of y
intersect(sapply(rownames(x),
                 function(i) strsplit(x = i, split = "|", fixed = TRUE)[[1]][[1]]),
          rownames(y))

#[1] "AACS" "AAK1"
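The same key extraction can be written without `sapply`, since `sub` is vectorised over its input. The following is only a minimal sketch: the vector `rn` and the names `keys` and `ids` are made up for illustration, using row names in the same `CODE|ID` style as x.

```r
# Hypothetical row names in the "CODE|ID" style used by x
rn <- c("AACS|65985", "AAK1|22848", "AAK12|14")

# Part before the "|": delete the "|" and everything after it
keys <- sub("\\|.*$", "", rn)

# Part after the "|": delete everything up to and including the first "|"
ids <- sub("^[^|]*\\|", "", rn)

# Codes shared with another (equally hypothetical) set of row names
shared <- intersect(keys, c("A1BG", "AACS", "AAK1"))
```

The "|" has to be escaped (or placed inside a character class) because it is the alternation operator in a regular expression.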

Strange, right? I mean, only 2 of the codes from x are present in y. Still, I think the code below does what you are planning (apart from this inconsistency):

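library(reshape2)  # melt(), dcast()
library(dplyr)     # %>%, mutate(), select()

x %>% as.data.frame %>% mutate(rownames = rownames(x)) %>%
        # nms: the part of each row name before the "|", used as the join key
        mutate(nms = sapply(rownames(x),
                            function(i) strsplit(x = i, split = "|", fixed = TRUE)[[1]][[1]])) %>%
        # long format: one row per (row name, column name) pair
        melt(id.vars = c("nms", "rownames")) %>%
        # left join against y in long format; cells without a match get NA
        merge(., y %>% as.data.frame %>% mutate(nms = rownames(y)) %>% melt(id.vars = "nms"),
              by = c("variable", "nms"), all.x = TRUE) %>%
        # drop the key and the values of x, then go back to wide format
        select(-nms, -value.x) %>%
        dcast(formula = rownames ~ variable, value.var = "value.y") -> xy
# now put the row names back where they belong
rownames(xy) <- xy$rownames
# the only thing left is to restore the row and column order of x
# (xy is a data frame; wrap it in as.matrix() if a matrix is needed)
xy[rownames(x), colnames(x)] -> xy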
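For comparison, the same mapping can be sketched in base R by matching indices directly instead of reshaping. This is not the method used above, only an alternative under stated assumptions: the helper name `map_onto` and the small matrices `x_small` and `y_small` are hypothetical, the data are assumed to be numeric, and the key is assumed to be the part of each row name before the first "|".

```r
# Hypothetical helper: build a matrix with the row and column names of x,
# filled with the values of y wherever a matching row and column exist.
map_onto <- function(x, y) {
  keys <- sub("\\|.*$", "", rownames(x))   # "AACS|65985" -> "AACS"
  ri <- match(keys, rownames(y))           # row of y for each row of x, else NA
  ci <- match(colnames(x), colnames(y))    # column of y for each column of x, else NA

  # Start from an all-NA matrix shaped and named like x (numeric data assumed)
  z <- matrix(NA_real_, nrow = nrow(x), ncol = ncol(x), dimnames = dimnames(x))

  # Copy only the cells whose row and column were both found in y
  rok <- !is.na(ri)
  cok <- !is.na(ci)
  z[rok, cok] <- y[ri[rok], ci[cok]]
  z
}

# Small made-up example in the same style as the question's data
x_small <- matrix(0, nrow = 3, ncol = 3,
                  dimnames = list(c("AACS|65985", "AADAC|13", "AAK1|22848"),
                                  c("S18", "S20", "S45")))
y_small <- matrix(1:4, nrow = 2, ncol = 2,
                  dimnames = list(c("AAK1", "AACS"), c("S45", "S18")))
z_small <- map_onto(x_small, y_small)
```

Called as `map_onto(x, y)` on the matrices from the question, this should leave NA everywhere except the AACS and AAK1 rows in columns S18, S45 and S95, since those are the only row codes and column names that x and y share.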

Or have I misunderstood some part of your question?

