How to delete consecutive occurrences of an event in an R data frame?

Question

I have an R data frame containing date information about generic events: id;start_date;end_date.

Sometimes the same event occurs again on the same day (1) or one day later (2), for example:

(1)
1001;2016-05-07;2016-05-11
1001;2016-05-11;2016-05-14

(2)
1001;2016-05-07;2016-05-11
1001;2016-05-12;2016-05-14

In the first case the event "1001" ends and restarts on the same day, while in the second case the event ends on 2016-05-11 and starts again the day after. I'd like to delete the second occurrence of the event in both cases. If the second occurrence starts two or more days after the first one ends, it should be kept. How can I do this in R?

Thank you in advance.

Answer 1

A partial solution, based on my guess of what the data look like:

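library(data.table)

# Guessed sample data: both example cases from the question, stacked
dat <- data.table(id = c(1001,1001,1001,1001),
                  start_date = as.Date(c("2016-05-07", "2016-05-11", "2016-05-07", "2016-05-12")),
                  end_date = as.Date(c("2016-05-11", "2016-05-14", "2016-05-11", "2016-05-14")))

# Shift end_date down by one row so that each row pairs its own start_date
# with the previous row's end_date (the extra last row has NA id and start_date)
dat2 <- data.table(id = c(dat$id, NA),
                   start_date = c(dat$start_date, NA),
                   end_date = c(as.Date(NA), dat$end_date))

# dif = previous row's end_date minus this row's start_date, in days
dat2[, dif := end_date - start_date]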

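Here dif is the previous row's end_date minus the current row's start_date. The rows to remove are therefore those with dif equal to 0 (restart on the same day) or -1 (restart the next day). A dif of -2 or lower means a gap of two or more days, so those rows should be kept.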

I've used the data.table package, but in base R you can just do dat2$dif <- dat2$end_date - dat2$start_date.
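
To round out the partial solution, here is a minimal base R sketch of the whole filtering step. It is not the answer author's code. The data frame `events` and its three ids are made up for illustration, one id per case from the question. It assumes that each row should be compared only with the immediately preceding row of the same id (after sorting by start_date), and that an event starting before the previous one ended also counts as a repeat.

```r
# Hypothetical example data: one id per case
#   1001: restarts on the same day      -> second row should be dropped
#   1002: restarts the next day         -> second row should be dropped
#   1003: restarts after a two-day gap  -> second row should be kept
events <- data.frame(
  id = c(1001, 1001, 1002, 1002, 1003, 1003),
  start_date = as.Date(c("2016-05-07", "2016-05-11",
                         "2016-05-07", "2016-05-12",
                         "2016-05-07", "2016-05-13")),
  end_date = as.Date(c("2016-05-11", "2016-05-14",
                       "2016-05-11", "2016-05-14",
                       "2016-05-11", "2016-05-14"))
)

# Sort so that each event directly follows the previous event of the same id
events <- events[order(events$id, events$start_date), ]

n <- nrow(events)

# id and end_date of the previous row (NA for the first row)
prev_id  <- c(NA, events$id[-n])
prev_end <- c(as.Date(NA), events$end_date[-n])

# Days between the previous event's end and this event's start
gap <- as.numeric(events$start_date - prev_end)

# A row is a consecutive repeat if it has the same id as the previous row
# and starts at most one day after that row ended.
# Assumption: overlapping events (negative gap) are treated as repeats too.
is_repeat <- !is.na(prev_id) & prev_id == events$id & gap <= 1

result <- events[!is_repeat, ]
print(result)
```

Checking `prev_id == events$id` keeps the last event of one id from being compared with the first event of the next. If a chain of three or more back-to-back events should collapse differently (for example, comparing against the last kept row rather than the previous row), the comparison would need to be done in a loop or per group instead.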

Solution 1
1 accepted 2016-05-10 11:35:23
