I want to combine two date columns by taking the latest date per row (when they differ), while keeping the ID column. My data frame has an ID column and two date columns, timestamp_c and timestamp. Rows where both dates are NA should be kept. In some rows timestamp_c is filled and timestamp is not; in others it is the opposite. In those cases I want to keep whichever value is present rather than the NA. I tried to follow a related post but could not find a solution.
library(data.table)
df <- data.table(
  ID = LETTERS[1:7],
  timestamp_c = lubridate::ymd("2021-03-08", NA, NA, "2021-04-06", NA, "2021-04-06", "2021-04-07"),
  timestamp = lubridate::ymd(NA, NA, NA, "2021-04-06", "2021-05-05", "2021-04-07", "2021-04-06")
)
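# pmax() compares the two columns row by row; with na.rm = TRUE the result
# is NA only when both dates are missing, so no grouping by ID is needed
df[, new_timestamp := pmax(timestamp_c, timestamp, na.rm = TRUE)]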
#    ID timestamp_c  timestamp new_timestamp
# 1:  A  2021-03-08       <NA>    2021-03-08
# 2:  B        <NA>       <NA>          <NA>
# 3:  C        <NA>       <NA>          <NA>
# 4:  D  2021-04-06 2021-04-06    2021-04-06
# 5:  E        <NA> 2021-05-05    2021-05-05
# 6:  F  2021-04-06 2021-04-07    2021-04-07
# 7:  G  2021-04-07 2021-04-06    2021-04-07
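One detail worth checking is the rows where both dates are missing (B and C). As far as I know, max() with na.rm = TRUE on an all-NA Date input warns and returns -Inf, which may print like NA without being a true missing value, whereas pmax() with na.rm = TRUE keeps a real NA. The sketch below uses base R only, so it can be run without data.table or lubridate; the vector names mirror the columns in the example, and all_na_max is just an illustrative name.

```r
# Base R only: these vectors mirror the timestamp_c / timestamp columns for IDs A-G.
timestamp_c <- as.Date(c("2021-03-08", NA, NA, "2021-04-06", NA, "2021-04-06", "2021-04-07"))
timestamp   <- as.Date(c(NA, NA, NA, "2021-04-06", "2021-05-05", "2021-04-07", "2021-04-06"))

# max() over an all-NA pair with na.rm = TRUE: the warning is suppressed here
# so the value itself can be inspected.
all_na_max <- suppressWarnings(max(as.Date(NA), as.Date(NA), na.rm = TRUE))
print(is.na(all_na_max))        # is the result a true NA?
print(is.infinite(all_na_max))  # or an infinite value carrying the Date class?

# pmax() is vectorised: it takes the later date in each position and, with
# na.rm = TRUE, falls back to whichever value is present.
new_timestamp <- pmax(timestamp_c, timestamp, na.rm = TRUE)
print(new_timestamp)
print(is.na(new_timestamp))     # only the positions where both inputs were NA
```

If is.na() needs to pick out the all-missing rows later (for filtering or counting), the pmax() form is the safer choice, and it can be used directly on the right-hand side of := in a data.table.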