When doing a non-equi inner join, should the order of X[Y] and Y[X] matters? I am under the impression that it should not.
library(data.table) #data.table_1.12.2
dt1 <- data.table(ID=LETTERS[1:4], TIME=2L:5L)
cols1 <- names(dt1)
dt2 <- data.table(ID=c("A", "B"), START=c(1L, 20L), END=c(3L, 30L))
cols2 <- names(dt2)
> dt1
ID TIME
1: A 2
2: B 3
3: C 4
4: D 5
> dt2
ID START END
1: A 1 3
2: B 20 30
I am trying to filter for rows in dt1
such that 1) ID matches and 2) dt1$TIME lies between dt2$START and dt2$END. Desired output:
ID TIME
1: A 2
Since I wanted rows from dt1
, I started with using dt1
as i
in data.table[
but I am getting either columns from dt2
or encountered errors:
#no error but using x. values
dt2[dt1, on=.(ID, START<TIME, END>TIME), nomatch=0L]
#error for the rest
dt2[dt1, on=.(ID, START<TIME, END>TIME), nomatch=0L, mget(paste0("i.", cols1))]
dt2[dt1, on=.(ID, START<TIME, END>TIME), nomatch=0L, .SD]
dt2[dt1, on=.(ID, START<TIME, END>TIME), nomatch=0L, .(START)]
Error message:
Error in
[.data.table
(dt2, dt1, on = .(ID, START < TIME, END > TIME), : column(s) not found: START
So I had to use dt2
as the i
as a workaround:
#need to type out all the columns:
dt1[dt2, on=.(ID, TIME>START, TIME<END), nomatch=0L, .(ID, TIME=x.TIME)]
#using setNames
dt1[dt2, on=.(ID, TIME>START, TIME<END), nomatch=0L,
setNames(mget(paste0("x.", cols1)), cols1)]
Or is this a simple case of my misunderstanding?
References:
I am trying to filter for rows in
dt1
such that 1) ID matches and 2) dt1$TIME lies between dt2$START and dt2$END.
That sounds like a semi join: Perform a semi-join with data.table
dt1[
dt1[dt2, on=.(ID, TIME >= START, TIME <= END), nomatch=0, which=TRUE]
]
# ID TIME
# 1: A 2
If it's possible that multiple rows of dt2 will match rows of dt1, then the "which" output can be wrapped in unique()
as in the linked answer.
There are a couple linked feature requests for a more convenient way to do this: https://github.com/Rdatatable/data.table/issues/2158
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.