Join DT1
(as i
in data.table
) to DT2
given key(s) column(s), within each group of DT2
specified by the Date
column.
I cannot run DT2[DT1, on = 'key']
as that would be incorrect since key
column is repeated across the Date
column, but unique within a single date.
DT3
is my expected output. Is there any way to achieve this without the split
manoeuvre, which does not feel very data.table
-y?
library(data.table)
set.seed(1)
DT1 <- data.table(
Segment = sample(paste0('S', 1:10), 100, TRUE),
Activity = sample(paste0('A', 1:5), 100, TRUE),
Value = runif(100)
)
dates <- seq(as.Date('2018-01-01'), as.Date('2018-11-30'), by = '1 day')
DT2 <- data.table(
Date = rep(dates, each = 5),
Segment = sample(paste0('S', 1:10), 3340, TRUE),
Total = runif(3340, 1, 2)
)
rm(dates)
# To ensure that each Date Segment combination is unique
DT2 <- unique(DT2, by = c('Date', 'Segment'))
iDT2 <- split(DT2, by = 'Date')
iDT2 <- lapply(
iDT2,
function(x) {
x[DT1, on = 'Segment', nomatch = 0]
}
)
DT3 <- rbindlist(iDT2, use.names = TRUE)
You can achieve the same result with a cartesian merge
:
DT4 <- merge(DT2,DT1,by='Segment',allow.cartesian = TRUE)
Here is the proof:
> all(DT3[order(Segment,Date,Total,Activity,Value),
c('Segment','Date','Total','Activity','Value')] ==
DT4[order(Segment,Date,Total,Activity,Value),
c('Segment','Date','Total','Activity','Value')])
[1] TRUE
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.