I have a data frame which looks as follows:
id OrderDate_1 OrderDate_2 OrderDate_3 NewEnrollDate
1 05/01/2018 01/02/2019 NA 02/15/2019
2 03/02/2019 NA NA 05/05/2019
3 12/15/2017 12/12/2018 05/01/2019 06/01/2019
I want logic that goes through each record of the data frame and flags the records that satisfy the following condition:
NewEnrollDate >= OrderDate_X, and OrderDate_X is the nearest such date to NewEnrollDate
It should also return the OrderDate_X that satisfied the condition above, giving the following table:
id OrderDate_1 OrderDate_2 OrderDate_3 NewEnrollDate MatchDT
1 05/01/2018 01/02/2019 NA 02/15/2019 01/02/2019
2 03/02/2019 NA NA 05/05/2019 03/02/2019
3 12/15/2017 12/12/2018 05/01/2019 06/01/2019 05/01/2019
Also, it would help to have an additional column that flags the records satisfying NewEnrollDate >= OrderDate_X.
I have tried taking the differences between the dates and then their minimum, but that does not work well with NA values, and it does not return the MatchDT variable. Please help.
I managed to do this by using {data.table}.
I have read your concern about having more than 3 order-date columns. To handle that, I match on the name pattern "OrderDate" to capture all of those columns.
For each of those columns, I create a new column that holds the order date if it is less than or equal to NewEnrollDate, and NA otherwise.
From these new columns, I then take the row-wise maximum with pmax, passing na.rm = TRUE to handle missing values.
library(data.table)
# Sample data; each set of dates must be wrapped in c() to form one vector
DT <-
  data.table(id = 1:3,
             OrderDate_1 = as.POSIXct(c("2018-05-01", "2019-03-02", "2017-12-15")),
             OrderDate_2 = as.POSIXct(c("2019-01-02", NA, "2018-12-12")),
             OrderDate_3 = as.POSIXct(c(NA, NA, "2019-05-01")),
             NewEnrollDate = as.POSIXct(c("2019-02-15", "2019-05-05", "2019-06-01")))
# Find every order-date column by name and build names for the filtered copies
OldNames <- grep("OrderDate", names(DT), value = TRUE)
NewNames <- paste0(OldNames, "New")
# For each order-date column, keep the date only where it is on or before NewEnrollDate
for (i in seq_along(OldNames)) {
  setnames(DT, OldNames[i], "PlaceHolder1")
  DT[NewEnrollDate >= PlaceHolder1, PlaceHolder2 := PlaceHolder1]
  setnames(DT, "PlaceHolder1", OldNames[i])
  setnames(DT, "PlaceHolder2", NewNames[i])
}
# Row-wise maximum over all filtered columns, ignoring NA
DT[, MatchDT := do.call(pmax, c(.SD, na.rm = TRUE)), .SDcols = NewNames]
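The question also asks for a flag column and gives its dates as mm/dd/yyyy strings. The following is a minimal sketch of one way to cover both without the rename loop. It assumes the dates arrive as character columns and parses them with as.Date rather than as.POSIXct; the names order_cols, date_cols and Matched are made up for illustration and do not come from the original post.

```r
library(data.table)

# Sample data in the question's mm/dd/yyyy format (NA where no order exists)
DT <- data.table(
  id            = 1:3,
  OrderDate_1   = c("05/01/2018", "03/02/2019", "12/15/2017"),
  OrderDate_2   = c("01/02/2019", NA, "12/12/2018"),
  OrderDate_3   = c(NA, NA, "05/01/2019"),
  NewEnrollDate = c("02/15/2019", "05/05/2019", "06/01/2019")
)

# Hypothetical helper names: every order-date column, plus the enrollment date
order_cols <- grep("^OrderDate_", names(DT), value = TRUE)
date_cols  <- c(order_cols, "NewEnrollDate")

# Parse all date columns in place
DT[, (date_cols) := lapply(.SD, as.Date, format = "%m/%d/%Y"), .SDcols = date_cols]

# Blank out order dates later than NewEnrollDate, then take the row-wise maximum;
# the latest remaining date is the one nearest to NewEnrollDate
DT[, MatchDT := {
  eligible <- lapply(.SD, function(d) {
    d[which(d > NewEnrollDate)] <- NA  # which() skips rows where d is already NA
    d
  })
  do.call(pmax, c(eligible, na.rm = TRUE))
}, .SDcols = order_cols]

# Flag rows where at least one order date satisfied NewEnrollDate >= OrderDate_X
DT[, Matched := !is.na(MatchDT)]
```

With this approach MatchDT stays NA and Matched is FALSE for any row where every order date is missing or falls after NewEnrollDate, so the flag can be used to filter those rows out.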