I have a large data table that contains start and end dates of events per ID:
library(data.table)
dt = data.table(
ID = c(1,1,2,2),
STARTDATE = as.Date(c("2011-10-10","2011-10-13","2011-10-10","2011-10-13"),format = "%Y-%m-%d"),
ENDDATE = as.Date(c("2011-10-12","2011-10-15","2011-10-12","2011-10-15"),format = "%Y-%m-%d")
)
dt
#    ID  STARTDATE    ENDDATE
# 1:  1 2011-10-10 2011-10-12
# 2:  1 2011-10-13 2011-10-15
# 3:  2 2011-10-10 2011-10-12
# 4:  2 2011-10-13 2011-10-15
I would like to expand this data table so that it has one row per ID and day within each time window. The expected result is as follows:
STARTDATE ENDDATE ID DAILY
1: 2011-10-10 2011-10-12 1 2011-10-10
2: 2011-10-10 2011-10-12 1 2011-10-11
3: 2011-10-10 2011-10-12 1 2011-10-12
4: 2011-10-13 2011-10-15 1 2011-10-13
5: 2011-10-13 2011-10-15 1 2011-10-14
6: 2011-10-13 2011-10-15 1 2011-10-15
7: 2011-10-10 2011-10-12 2 2011-10-10
8: 2011-10-10 2011-10-12 2 2011-10-11
9: 2011-10-10 2011-10-12 2 2011-10-12
10: 2011-10-13 2011-10-15 2 2011-10-13
11: 2011-10-13 2011-10-15 2 2011-10-14
12: 2011-10-13 2011-10-15 2 2011-10-15
My code looks as follows:
dt[, cbind(.SD, seq(STARTDATE, ENDDATE, 1)), by = list(STARTDATE, ENDDATE)]
but it does not generate the wanted result:
STARTDATE ENDDATE ID V2
1: 2011-10-10 2011-10-12 1 2011-10-10
2: 2011-10-10 2011-10-12 2 2011-10-11
3: 2011-10-10 2011-10-12 1 2011-10-12
4: 2011-10-13 2011-10-15 1 2011-10-13
5: 2011-10-13 2011-10-15 2 2011-10-14
6: 2011-10-13 2011-10-15 1 2011-10-15
Warning messages:
1: In data.table::data.table(...) :
Item 1 is of size 2 but maximum size is 3 (recycled leaving remainder of 1 items)
2: In data.table::data.table(...) :
Item 1 is of size 2 but maximum size is 3 (recycled leaving remainder of 1 items)
It needs the ID somewhere, but I cannot add it to the by part of the data table call; that gives another error. Any ideas?
Here is an option. Note that we can use by = 1:nrow(dt) to group by row, which adds a new column called nrow. We can then use [, nrow := NULL] to remove that column.
library(data.table)
dt2 <- dt[, .(STARTDATE, ENDDATE, ID,
DAILY = seq(STARTDATE, ENDDATE, by = 1)),
by = 1:nrow(dt)][, nrow := NULL]
print(dt2[])
# STARTDATE ENDDATE ID DAILY
# 1: 2011-10-10 2011-10-12 1 2011-10-10
# 2: 2011-10-10 2011-10-12 1 2011-10-11
# 3: 2011-10-10 2011-10-12 1 2011-10-12
# 4: 2011-10-13 2011-10-15 1 2011-10-13
# 5: 2011-10-13 2011-10-15 1 2011-10-14
# 6: 2011-10-13 2011-10-15 1 2011-10-15
# 7: 2011-10-10 2011-10-12 2 2011-10-10
# 8: 2011-10-10 2011-10-12 2 2011-10-11
# 9: 2011-10-10 2011-10-12 2 2011-10-12
# 10: 2011-10-13 2011-10-15 2 2011-10-13
# 11: 2011-10-13 2011-10-15 2 2011-10-14
# 12: 2011-10-13 2011-10-15 2 2011-10-15
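If grouping by row turns out to be slow on a large table, a possible alternative is to repeat each row once per day it covers and then build the daily dates with vectorised arithmetic. This is only a sketch, not part of the original answer: the names n_days and expanded are made up for illustration, and it assumes ENDDATE is never earlier than STARTDATE and that neither column contains NA.

```r
library(data.table)

# Same sample data as in the question.
dt <- data.table(
  ID = c(1, 1, 2, 2),
  STARTDATE = as.Date(c("2011-10-10", "2011-10-13", "2011-10-10", "2011-10-13")),
  ENDDATE = as.Date(c("2011-10-12", "2011-10-15", "2011-10-12", "2011-10-15"))
)

# Number of days covered by each row, counting both endpoints.
# Assumption: ENDDATE >= STARTDATE for every row, otherwise rep() fails.
n_days <- as.integer(dt$ENDDATE - dt$STARTDATE) + 1L

# Repeat each source row once per day it covers.
expanded <- dt[rep(seq_len(nrow(dt)), times = n_days)]

# sequence(n_days) counts 1..n within each repeated block, so subtracting 1
# gives the day offset to add to the start date.
expanded[, DAILY := STARTDATE + (sequence(n_days) - 1L)]

print(expanded)
```

The column order differs from the output above (ID comes first), but the rows and the DAILY values should match. Because no grouping is involved, no helper column such as nrow needs to be removed afterwards.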