
R function to determine the overlap between two date intervals — DescTools Interval Function

I'm looking for a way to determine the overlap (in days) between two date intervals. I have columns startdate1, enddate1, startdate2, enddate2. I want an additional column with the number of days of overlap between the intervals (startdate1, enddate1) and (startdate2, enddate2).
For example, I want to end up with something like this:

startdate1      enddate1      startdate2      enddate2      overlap
1/1/2020        1/10/2020     1/6/2020        1/16/2020     5
1/15/2020       1/29/2020     1/6/2020        1/20/2020     6
1/15/2020       1/29/2020     1/17/2020       1/20/2020     4

I've been trying to achieve this with the Interval function from the DescTools package:

df1$overlap<- Interval(as.Date(c(df1$startdate1, df1$enddate1)), as.Date(c(df1$startdate2, df1$enddate2)))

But I get this error:

Error: as.Date.numeric(c(df1$startdate1, df1$enddate1)): 'origin' must be supplied

I have also looked into the lubridate package. I used its interval function (different from the DescTools Interval above) to create columns interval1 and interval2, but I'm not aware of a function that can calculate the days of overlap between them.

Any help is appreciated. Thanks in advance!

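If I understand correctly, the overlap runs from the later of the two start dates to the earlier of the two end dates. When interval 2 starts inside interval 1 and ends after it, as in your first row, that means subtracting startdate2 from enddate1. You can do this with base R functions like as.Date():

as.Date(enddate1, "%m/%d/%Y") - as.Date(startdate2, "%m/%d/%Y") + 1

The string %m/%d/%Y specifies the format of your dates, in your case month/day/year. I add the +1 because the subtraction gives the difference between the dates (like 10 - 6 = 4), which does not include the start date, while the overlap should also count the start day itself. For rows like your second and third, where a different pair of dates bounds the overlap, use pmin() on the two end dates and pmax() on the two start dates instead, and treat a negative result as no overlap.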
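For all rows at once, the general rule (earlier end date minus later start date, plus one, floored at zero) can be sketched in base R as follows. This is only an illustration: the data frame is rebuilt from the example table in the question, the columns are assumed to hold month/day/year strings, and overlap_days() is a hypothetical helper name, not a function from any package. Converting each column separately with an explicit format also avoids passing numeric values to as.Date(), which is what the 'origin' error suggests was happening.

```r
# Example data copied from the question; columns are assumed to be
# character strings in month/day/year format.
df1 <- data.frame(
  startdate1 = c("1/1/2020", "1/15/2020", "1/15/2020"),
  enddate1   = c("1/10/2020", "1/29/2020", "1/29/2020"),
  startdate2 = c("1/6/2020", "1/6/2020", "1/17/2020"),
  enddate2   = c("1/16/2020", "1/20/2020", "1/20/2020"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: number of days shared by two closed date intervals.
overlap_days <- function(start1, end1, start2, end2, fmt = "%m/%d/%Y") {
  s1 <- as.Date(start1, format = fmt)
  e1 <- as.Date(end1, format = fmt)
  s2 <- as.Date(start2, format = fmt)
  e2 <- as.Date(end2, format = fmt)

  latest_start <- pmax(s1, s2)  # element-wise later start date
  earliest_end <- pmin(e1, e2)  # element-wise earlier end date

  # Date - Date is a difference in days; +1 counts both end points.
  days <- as.numeric(earliest_end - latest_start) + 1

  # Intervals that do not overlap give a value <= 0; report 0 instead.
  pmax(days, 0)
}

df1$overlap <- with(df1, overlap_days(startdate1, enddate1, startdate2, enddate2))
df1$overlap
# 5 6 4
```

The result matches the overlap column in the question. If the date columns are already of class Date, the as.Date() calls can be dropped and pmin() and pmax() applied to the columns directly.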

