
Date data in R using dplyr

I have data containing two date columns for each subject, date1 and date2, where date1 < date2. How do I create a variable indicating whether the next value of date1 for a given subject comes before the current value of date2? For example, given the following data:

subject date1      date2      
1       2018-01-01 2019-01-01
1       2018-02-01 2019-01-01
1       2020-01-01 2021-01-01

The indicator variable should be 1 for the first row, 0 for the second, and NA for the third.

We can use lead() to compare the next 'date1' with the current 'date2' after grouping by 'subject'. The last row of each group has no next value, so lead() returns NA there, which yields the NA indicator:

library(dplyr)
df1 %>%
   group_by(subject) %>% 
   mutate(new = as.integer(lead(date1) < date2))
# A tibble: 3 x 4
# Groups:   subject [1]
#  subject date1      date2       new
#    <int> <date>     <date>     <int>
#1       1 2018-01-01 2019-01-01     1
#2       1 2018-02-01 2019-01-01     0
#3       1 2020-01-01 2021-01-01    NA
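
The code above relies on the rows already being in date1 order within each subject. As a minimal sketch for data where that is not guaranteed, the rows can be sorted first so that "next row" really means "next date1". The data frame df2 and its values are made up for illustration and are not part of the original question:

```r
library(dplyr)

# Hypothetical data: two subjects, rows deliberately out of date order
df2 <- data.frame(
  subject = c(2L, 1L, 1L, 2L, 1L),
  date1 = as.Date(c("2019-06-01", "2020-01-01", "2018-01-01",
                    "2019-01-01", "2018-02-01")),
  date2 = as.Date(c("2019-12-01", "2021-01-01", "2019-01-01",
                    "2019-03-01", "2019-01-01"))
)

res <- df2 %>%
  arrange(subject, date1) %>%   # sort so lead() sees the next date1 per subject
  group_by(subject) %>%
  mutate(new = as.integer(lead(date1) < date2)) %>%
  ungroup()                     # drop grouping so later steps are not grouped

res
```

Each subject's last row gets NA because lead() has nothing to look ahead to inside that group.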

data

df1 <- structure(list(subject = c(1L, 1L, 1L), date1 = structure(c(17532, 
 17563, 18262), class = "Date"), date2 = structure(c(17897, 17897, 
  18628), class = "Date")), .Names = c("subject", "date1", "date2"
 ), row.names = c(NA, -3L), class = "data.frame")
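
The structure() call above stores the dates as day counts since 1970-01-01, which is hard to read. A possibly more readable way to build the same example data is sketched below; the name df1_alt is chosen here for illustration only:

```r
# Same three rows as df1, with the dates written out as strings
df1_alt <- data.frame(
  subject = c(1L, 1L, 1L),
  date1 = as.Date(c("2018-01-01", "2018-02-01", "2020-01-01")),
  date2 = as.Date(c("2019-01-01", "2019-01-01", "2021-01-01"))
)

str(df1_alt)
```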
