R: combining common rows in two matrices
I have two matrices. I would like to combine them such that any element which is in Matrix1 but not in Matrix2 (scenario 1) is added to the end of Matrix2. However, if an element is in both Matrix1 and Matrix2 (scenario 2), then I would like to overwrite certain columns in that row of Matrix2 with the columns from the corresponding row of Matrix1.
I have taken a look at ddply and merge, which seem to satisfy scenario 1, but I can't solve the problem regarding scenario 2.
An example:
The original matrices:
Matrix2
Col1 Col2 Col3 Col4
ABC  100  200  900
DEF  300  400  1000
Matrix1
Col1 Col2 Col3
HIJ  500  600
ABC  700  800
KLM  1100 1200
The new Matrix2:
Col1 Col2 Col3 Col4
ABC  700  800  900
DEF  300  400  1000
HIJ  500  600  0
KLM  1100 1200 0
Here the first row of the original Matrix2 (ABC) has been replaced by the corresponding row from Matrix1, and the rows of Matrix1 that are not in Matrix2 (HIJ and KLM) have been appended at the end. The extra column in Matrix2 (Col4) is left unadjusted when combining the matrices. Also, the matrices have different dimensions.
Any help would be great!
Thanks
Mike
Perhaps a solution with rbind and duplicated might be of use:
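m1m2 <- rbind(Matrix1, Matrix2)  ## Matrix1 goes first so that its rows win
m1m2[!duplicated(m1m2$Col1), ]   ## keep only the first occurrence of each Col1 value
# (output for a version of the example where both objects have only Col1 to Col3)
# Col1 Col2 Col3
# 1 HIJ 500 600
# 2 ABC 700 800
# 4 DEF 300 400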
Is the resulting order of "Col1" important?
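If the order does matter, and Matrix2 has a column that Matrix1 lacks (Col4 in the example), a base R sketch along the following lines might work. It assumes both objects are data frames keyed on Col1, with the data typed in from the question; the helper names shared, idx, hit, new_rows and extra are made up for illustration, and edge cases such as "no new rows at all" are not handled.

```r
Matrix1 <- data.frame(Col1 = c("HIJ", "ABC", "KLM"),
                      Col2 = c(500, 700, 1100),
                      Col3 = c(600, 800, 1200),
                      stringsAsFactors = FALSE)
Matrix2 <- data.frame(Col1 = c("ABC", "DEF"),
                      Col2 = c(100, 300),
                      Col3 = c(200, 400),
                      Col4 = c(900, 1000),
                      stringsAsFactors = FALSE)

shared <- intersect(names(Matrix1), names(Matrix2))  # columns present in both
idx <- match(Matrix1$Col1, Matrix2$Col1)             # row of Matrix2 per Matrix1 key, NA if new
hit <- !is.na(idx)

result <- Matrix2
# Scenario 2: overwrite the shared columns of the matching Matrix2 rows
result[idx[hit], shared] <- Matrix1[hit, shared]

# Scenario 1: append the Matrix1 rows whose key is not in Matrix2,
# filling the columns that Matrix1 lacks with 0
new_rows <- Matrix1[!hit, shared]
extra <- setdiff(names(Matrix2), shared)
new_rows[extra] <- 0
result <- rbind(result, new_rows[names(result)])
rownames(result) <- NULL

result
```

This keeps the row order of Matrix2 (ABC, DEF) and then adds the new keys in the order they appear in Matrix1 (HIJ, KLM), leaving Col4 of the existing rows untouched.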
Based on your update, perhaps you can look at an option that combines melt and dcast from the "reshape2" package with merge and duplicated:
library(reshape2)
M1 <- melt(Matrix1, id.vars = "Col1") ## Convert your data into a "long" format
M2 <- melt(Matrix2, id.vars = "Col1")
M1M2 <- merge(M1, M2, all = TRUE) ## Merge this long data
## Keep the last row for each Col1/variable pair, then go back to a "wide" format
dcast(M1M2[!duplicated(M1M2[1:2], fromLast = TRUE), ], Col1 ~ variable, fill = 0)
# Col1 Col2 Col3 Col4
# 1 ABC 700 800 900
# 2 DEF 300 400 1000
# 3 HIJ 500 600 0
# 4 KLM 1100 1200 0
My guess is that you might need to add another variable in here to identify the source data.frame, to ensure you are taking the correct data in the last step.
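A rough sketch of what that could look like is below. As far as I can tell, merge sorts on all the common columns by default (including value), so which row ends up "last" for a given Col1/variable pair can depend on the values themselves. Tagging each long data set with its source and ordering on that tag makes the choice explicit. The src column and the long, keep and wide names are made up for illustration, and the data is typed in from the question as data frames.

```r
library(reshape2)

Matrix1 <- data.frame(Col1 = c("HIJ", "ABC", "KLM"),
                      Col2 = c(500, 700, 1100),
                      Col3 = c(600, 800, 1200),
                      stringsAsFactors = FALSE)
Matrix2 <- data.frame(Col1 = c("ABC", "DEF"),
                      Col2 = c(100, 300),
                      Col3 = c(200, 400),
                      Col4 = c(900, 1000),
                      stringsAsFactors = FALSE)

M1 <- melt(Matrix1, id.vars = "Col1")
M2 <- melt(Matrix2, id.vars = "Col1")
M1$src <- "Matrix1"   # hypothetical tag column naming the source
M2$src <- "Matrix2"

long <- rbind(M1, M2)
# Put every Matrix1 row ahead of every Matrix2 row (FALSE sorts before TRUE)
long <- long[order(long$src != "Matrix1"), ]
# duplicated() keeps the first occurrence, so Matrix1 wins for shared cells
keep <- !duplicated(long[c("Col1", "variable")])
wide <- dcast(long[keep, ], Col1 ~ variable, value.var = "value", fill = 0)

wide
```

With the example data this should give the same four rows as above, but the Matrix1 values would still be preferred if they happened to be smaller than the ones in Matrix2.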