How to determine difference in days between two dates across two columns and two rows by group?
Suppose I have the following dataset:
id strt_dt end_dt
1 2013-05-07 2013-05-13
1 2013-05-14 2013-05-20
1 2013-05-21 2013-05-27
2 2013-05-14 2013-05-15
2 2013-05-16 2013-05-22
2 2013-05-29 2013-05-29
For each id, I want to calculate the difference in days between the end date and the start date:
id strt_dt end_dt diff
1 2013-05-07 2013-05-13 NA
1 2013-05-14 2013-05-20 1
1 2013-05-21 2013-05-27 1
2 2013-05-14 2013-05-15 NA
2 2013-05-16 2013-05-22 1
2 2013-05-29 2013-05-29 7
The goal is to get, within each id, the difference between strt_dt and the previous row's end_dt for every observation except the first one of each id.
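This can be done with lag from the dplyr library, combined with group_by so that the calculation is done separately for each id. The code below also uses:
mutate: creates a new column
difftime: computes the difference between two dates (in the specified units)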
library(dplyr)
new_df <- df %>%
  group_by(id) %>%
  # lag(end_dt) is the previous row's end_dt within the same id
  mutate(diff = difftime(strt_dt, lag(end_dt), units = "days"))
This should give you the following:
id strt_dt end_dt diff
1 1 2013-05-07 2013-05-13 NA days
2 1 2013-05-14 2013-05-20 1 days
3 1 2013-05-21 2013-05-27 1 days
4 2 2013-05-14 2013-05-15 NA days
5 2 2013-05-16 2013-05-22 1 days
6 2 2013-05-29 2013-05-29 7 days
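To reproduce the output above from scratch, here is a minimal self-contained sketch. It assumes the two date columns are stored as Date values (built here with as.Date), which the original post does not state. The data frame construction is added for illustration, and the last strt_dt is taken as 2013-05-29 to match the printed result.

```r
library(dplyr)

# Sample data; storing the columns as Date is an assumption, not stated in the post.
# The last strt_dt follows the printed output (2013-05-29).
df <- data.frame(
  id      = c(1, 1, 1, 2, 2, 2),
  strt_dt = as.Date(c("2013-05-07", "2013-05-14", "2013-05-21",
                      "2013-05-14", "2013-05-16", "2013-05-29")),
  end_dt  = as.Date(c("2013-05-13", "2013-05-20", "2013-05-27",
                      "2013-05-15", "2013-05-22", "2013-05-29"))
)

new_df <- df %>%
  group_by(id) %>%
  # Within each id, lag(end_dt) shifts end_dt down one row,
  # so the first row of every group gets NA.
  mutate(diff = difftime(strt_dt, lag(end_dt), units = "days"))

print(new_df)
```

Because the lag is applied inside group_by, the first row of id 2 is compared with nothing rather than with the last row of id 1.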
If you want to drop the word days, you can convert the difference to a number, as follows:
new_df <- df %>%
  group_by(id) %>%
  mutate(diff = as.numeric(difftime(strt_dt, lag(end_dt), units = "days")))
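To see what the conversion does in isolation, here is a small base R sketch on a single pair of dates. The variable names d and n are only illustrative.

```r
# One difference between two dates, first as a difftime, then as a plain number.
d <- difftime(as.Date("2013-05-29"), as.Date("2013-05-22"), units = "days")
print(d)            # a difftime object, printed together with its unit

n <- as.numeric(d)  # drops the difftime class and unit, keeping only the value
print(n)
```

The numeric form is usually easier to use in later arithmetic or when writing the result to a file.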
Hope this helps.