Merging data.tables by 2 columns representing variably-ordered pairs
Suppose I have the following data.tables:
X1 X2 val1
A B 1
B C 2
C A 3

X1 X2 val2
A B 100
C B 200
A C 300
where every combination of (X1, X2) appears once in each data.table, but the order of the two values within a pair may differ between tables (B C in one table can appear as C B in the other). I'm aiming for this output:
X1 X2 val1 val2
A B 1 100
B C 2 200
C A 3 300
What's the most efficient way to do this? Especially if there's a third data.table containing a third value column, with the same situation regarding X1 and X2.
How about something like this?
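library(data.table)

# For each row of x, return the matching row number in y, trying the key
# pair in both orders (NA where neither order matches).
special_join <- function(x, y, xcols, ycols=xcols) {
  ix1 = y[x, on=structure(xcols, names=ycols), which=TRUE]       # same order
  ix2 = y[x, on=structure(rev(xcols), names=ycols), which=TRUE]  # swapped order
  pmax(ix1, ix2, na.rm=TRUE)  # keep whichever lookup succeeded
}

ix = special_join(dt1, dt2, names(dt1)[1:2])
dt1[, val2 := dt2$val2[ix]]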
where the input tables are:
dt1 = fread('X1 X2 val1
A B 1
B C 2
C A 3')
dt2 = fread('X1 X2 val2
A B 100
C B 200
A C 300')
I'll leave adapting this for your third data.table as an exercise.
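For anyone who wants a starting point on that exercise, here is one possible sketch rather than a definitive answer. It repeats the special_join lookup for each additional table in a loop. The table dt3 and its val3 column are made up for illustration, and the sketch assumes every table names its key columns X1 and X2.

```r
library(data.table)

# Same helper as in the answer: for each row of x, the matching row of y
# under either ordering of the key pair (NA if neither ordering matches).
special_join <- function(x, y, xcols, ycols = xcols) {
  ix1 <- y[x, on = structure(xcols, names = ycols), which = TRUE]
  ix2 <- y[x, on = structure(rev(xcols), names = ycols), which = TRUE]
  pmax(ix1, ix2, na.rm = TRUE)
}

dt1 <- fread("X1 X2 val1\nA B 1\nB C 2\nC A 3")
dt2 <- fread("X1 X2 val2\nA B 100\nC B 200\nA C 300")
# dt3 is a hypothetical third table, invented for this illustration.
dt3 <- fread("X1 X2 val3\nB A 10\nB C 20\nA C 30")

# Assumption: all tables share these key column names.
keys <- c("X1", "X2")

for (other in list(dt2, dt3)) {
  # Row of `other` that corresponds to each row of dt1.
  ix <- special_join(dt1, other, keys)
  # Every non-key column of `other` is treated as a value column.
  value_cols <- setdiff(names(other), keys)
  # Add those columns to dt1 by reference, in dt1's row order.
  dt1[, (value_cols) := other[ix, value_cols, with = FALSE]]
}

print(dt1)
```

With these inputs dt1 keeps its own pair ordering (A B, B C, C A) and gains val2 = 100, 200, 300 and val3 = 10, 20, 30. Because the loop updates dt1 by reference, no intermediate merged copies are created, which should help if the tables are large.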