
R: combining common rows in two matrices

I have two matrices. I would like to combine them such that any element which is in Matrix1 but not in Matrix2 (scenario1) is added to the end of Matrix2. However, if an element is in both Matrix1 and Matrix2 (scenario2), then I would like to overwrite certain columns in that row of Matrix2 with the columns for the corresponding row of Matrix1.

I have taken a look at ddply and merge, which seem to satisfy scenario1, but I can't solve the problem regarding scenario2.

An example:
The original matrices:

Matrix2

 Col1 Col2 Col3 Col4
 ABC   100  200  900
 DEF   300  400 1000

Matrix1

 Col1 Col2 Col3
 HIJ   500  600
 ABC   700  800
 KLM  1100 1200

The new Matrix2:

 Col1 Col2 Col3 Col4
 ABC   700  800  900
 DEF   300  400 1000
 HIJ   500  600    0
 KLM  1100 1200    0

Here the Col2 and Col3 values in the first row of the original Matrix2 (ABC) have been replaced by the values from the corresponding row of Matrix1, and the rows of Matrix1 that do not appear in Matrix2 (HIJ and KLM) have been appended at the end. The extra column in Matrix2 (Col4) is left unchanged when combining the matrices. Also, the matrices have different dimensions.

Any help would be great!

Thanks

Mike

Perhaps a solution with rbind and duplicated might be of use:

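## Assumes both inputs are data frames with the same columns
m1m2 <- rbind(Matrix1, Matrix2)   ## Matrix1 first, so its rows win for a repeated Col1
m1m2[!duplicated(m1m2$Col1), ]    ## Keep only the first occurrence of each Col1 value
#   Col1 Col2 Col3
# 1  HIJ  500  600
# 2  ABC  700  800
# 4  DEF  300  400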

Is the resulting order of "Col1" important?
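
For anyone who wants to run the rbind and duplicated idea end to end, here is a minimal sketch. It assumes both inputs are data frames restricted to the three shared columns (Col1 to Col3). The question does not show how the objects were created, so the data.frame() calls and the names stacked and combined are illustrative only. Matrix1 is stacked first so that its row is the one kept when a Col1 value occurs in both inputs, and the final sort is optional.

```r
# Assumed inputs: data frames limited to the shared columns Col1..Col3
Matrix2 <- data.frame(Col1 = c("ABC", "DEF"),
                      Col2 = c(100, 300),
                      Col3 = c(200, 400),
                      stringsAsFactors = FALSE)
Matrix1 <- data.frame(Col1 = c("HIJ", "ABC", "KLM"),
                      Col2 = c(500, 700, 1100),
                      Col3 = c(600, 800, 1200),
                      stringsAsFactors = FALSE)

# Stack Matrix1 first: duplicated() flags later repeats, so the
# Matrix1 version of a shared Col1 value is the one that survives
stacked  <- rbind(Matrix1, Matrix2)
combined <- stacked[!duplicated(stacked$Col1), ]

# Optional: sort by Col1 and reset the row names
combined <- combined[order(combined$Col1), ]
rownames(combined) <- NULL
combined
```

This only works when both inputs have the same columns, because rbind on data frames requires matching column names; it does not handle an extra column such as Col4.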


Update

Based on your update, perhaps you can look for an option that combines melt and dcast from the "reshape2" package with merge and duplicated from base R:

library(reshape2)
M1 <- melt(Matrix1, id.vars="Col1") ## Convert your data into a "long" format
M2 <- melt(Matrix2, id.vars="Col1")
M1M2 <- merge(M1, M2, all = TRUE)   ## Merge this long data
dcast(M1M2[!duplicated(M1M2[1:2], fromLast=TRUE), ], Col1 ~ variable, fill=0)
#   Col1 Col2 Col3 Col4
# 1  ABC  700  800  900
# 2  DEF  300  400 1000
# 3  HIJ  500  600    0
# 4  KLM 1100 1200    0

My guess is that you might need to add another variable in here to identify the source data.frame to ensure you are taking the correct data in the last step.
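
One way to act on that caveat is sketched below, as an illustration rather than a tested recipe. The priority column and the names long, keep and result are hypothetical and not part of the original answer. The sketch stacks the two long tables with rbind instead of merge, tags each row with its source, and sorts by that tag. The row kept for each Col1/variable pair is then decided by the source and not by how the values happen to sort. It assumes the inputs are data frames built as shown.

```r
library(reshape2)

# Assumed inputs, built from the example in the question
Matrix2 <- data.frame(Col1 = c("ABC", "DEF"),
                      Col2 = c(100, 300),
                      Col3 = c(200, 400),
                      Col4 = c(900, 1000),
                      stringsAsFactors = FALSE)
Matrix1 <- data.frame(Col1 = c("HIJ", "ABC", "KLM"),
                      Col2 = c(500, 700, 1100),
                      Col3 = c(600, 800, 1200),
                      stringsAsFactors = FALSE)

# Long format, one row per (Col1, variable) pair
M1 <- melt(Matrix1, id.vars = "Col1")
M2 <- melt(Matrix2, id.vars = "Col1")

# Hypothetical source tag: a higher priority wins when both sources have a value
M1$priority <- 2L
M2$priority <- 1L

# Stack, sort so the preferred source comes last, then keep the last duplicate
long <- rbind(M2, M1)
long <- long[order(long$priority), ]
keep <- long[!duplicated(long[c("Col1", "variable")], fromLast = TRUE), ]

# Back to wide format; pairs with no value (e.g. HIJ/Col4) are filled with 0
result <- dcast(keep, Col1 ~ variable, value.var = "value", fill = 0)
result
```

Swapping the two priority values would make Matrix2 the preferred source instead.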
