
Date Data in R using dplyr

I have data that contains two date columns for each subject, say date1 and date2, where date1 < date2. How do I create a variable indicating whether the next value of date1 for a given subject comes before the current value of date2? For the following data, for example:

subject date1      date2      
1       2018-01-01 2019-01-01
1       2018-02-01 2019-01-01
1       2020-01-01 2021-01-01

the indicator variable should be 1 for the first row, 0 for the second, and NA for the third.

We can use lead() to compare the next 'date1' with the current 'date2' after grouping by 'subject'. The last row of each subject has no next 'date1', so lead() returns NA there and the indicator is NA.

library(dplyr)
df1 %>%
   group_by(subject) %>%   # so lead() never looks across subjects
   mutate(new = as.integer(lead(date1) < date2))   # TRUE/FALSE/NA -> 1/0/NA
# A tibble: 3 x 4
# Groups:   subject [1]
#  subject date1      date2       new
#    <int> <date>     <date>     <int>
#1       1 2018-01-01 2019-01-01     1
#2       1 2018-02-01 2019-01-01     0
#3       1 2020-01-01 2021-01-01    NA
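
Note that lead() works on row order, so the comparison above is only meaningful if the rows are already sorted by 'date1' within each subject. If that is not guaranteed, a minimal sketch would be to sort first with arrange(). The data frame df2 below is hypothetical (it is not from the question); it adds a second subject and shuffles the rows to show that the result does not leak across subjects.

```r
library(dplyr)

# Hypothetical data: two subjects, rows deliberately out of order
df2 <- data.frame(
  subject = c(2L, 1L, 1L, 2L, 1L),
  date1 = as.Date(c("2019-06-01", "2020-01-01", "2018-01-01",
                    "2019-01-01", "2018-02-01")),
  date2 = as.Date(c("2019-12-01", "2021-01-01", "2019-01-01",
                    "2019-03-01", "2019-01-01"))
)

res <- df2 %>%
  arrange(subject, date1) %>%   # make "next row" mean "next date1"
  group_by(subject) %>%         # lead() restarts for each subject
  mutate(new = as.integer(lead(date1) < date2)) %>%
  ungroup()

res$new
# 1 0 NA 0 NA  (subject 1: 1, 0, NA; subject 2: 0, NA)
```

The ungroup() at the end is optional; it just avoids carrying the grouping into later steps.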

data

df1 <- structure(list(subject = c(1L, 1L, 1L), date1 = structure(c(17532, 
 17563, 18262), class = "Date"), date2 = structure(c(17897, 17897, 
  18628), class = "Date")), .Names = c("subject", "date1", "date2"
 ), row.names = c(NA, -3L), class = "data.frame")
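
The structure() call above is dput() output, where the dates are stored as day counts since 1970-01-01. For readability, the same data could probably be written with as.Date() instead. The name df1_alt is only used here to avoid overwriting df1.

```r
# Same three rows as df1, with the dates written out as strings
df1_alt <- data.frame(
  subject = c(1L, 1L, 1L),
  date1 = as.Date(c("2018-01-01", "2018-02-01", "2020-01-01")),
  date2 = as.Date(c("2019-01-01", "2019-01-01", "2021-01-01"))
)

# The underlying day counts match the dput() values
as.numeric(df1_alt$date1)
# 17532 17563 18262
```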

