
Combine date columns into one

I want to combine two date columns into one, taking the latest date per row when the two differ, while keeping the ID column. My data frame looks like the image. Rows where both dates are NA should be kept. In some rows timestamp_c is filled and timestamp is not; in others it is the opposite. In those cases I want to keep the value that is present rather than the NA. I tried to follow this but could not find a solution.

[image: enter image description here]

library(data.table)

df <- data.table(
  ID = LETTERS[1:7],
  timestamp_c = lubridate::ymd("2021-03-08", NA, NA, "2021-04-06", NA, "2021-04-06", "2021-04-07"),
  timestamp = lubridate::ymd(NA, NA, NA, "2021-04-06", "2021-05-05", "2021-04-07", "2021-04-06")
)

df[, new_timestamp := max(timestamp_c, timestamp, na.rm = TRUE), by = ID]

#    ID timestamp_c  timestamp new_timestamp
# 1:  A  2021-03-08       <NA>    2021-03-08
# 2:  B        <NA>       <NA>          <NA>
# 3:  C        <NA>       <NA>          <NA>
# 4:  D  2021-04-06 2021-04-06    2021-04-06
# 5:  E        <NA> 2021-05-05    2021-05-05
# 6:  F  2021-04-06 2021-04-07    2021-04-07
# 7:  G  2021-04-07 2021-04-06    2021-04-07
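
An alternative sketch, not part of the original answer: base R's pmax() computes the element-wise (row-wise) maximum, so no grouping by ID is needed. With na.rm = TRUE it returns the non-missing value when only one column is filled, and NA when both are missing. As far as I can tell, max() with na.rm = TRUE on an all-NA group instead warns and returns -Inf, which only prints as NA for a Date. The example uses a plain data.frame and as.Date() rather than data.table and lubridate so that it runs with base R alone; that substitution is an assumption made for portability.

```r
# Same data as in the question, built with base R only
# (assumption: as.Date() stands in for lubridate::ymd()).
df <- data.frame(
  ID = LETTERS[1:7],
  timestamp_c = as.Date(c("2021-03-08", NA, NA, "2021-04-06", NA, "2021-04-06", "2021-04-07")),
  timestamp = as.Date(c(NA, NA, NA, "2021-04-06", "2021-05-05", "2021-04-07", "2021-04-06"))
)

# Row-wise latest date; rows with one missing value keep the other,
# rows where both are missing stay NA.
df$new_timestamp <- pmax(df$timestamp_c, df$timestamp, na.rm = TRUE)

print(df)
```

With the data.table from the answer, the same expression should work inside `:=` without `by`: df[, new_timestamp := pmax(timestamp_c, timestamp, na.rm = TRUE)].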
