
Calculate the number of days between two dates in R

I need to calculate the number of days elapsed between multiple dates in two ways and then output those results to new columns: i) the number of days that have elapsed since the first date (e.g., RESULTS$FIRST) and ii) the number of days between sequential dates (e.g., RESULTS$BETWEEN). Here is an example with the desired results. Thanks in advance.

library(lubridate)

DATA = data.frame(DATE = mdy(c("7/8/2013",  "8/1/2013", "8/30/2013", "10/23/2013", 
                                   "12/16/2013", "12/16/2015")))

RESULTS  = data.frame(DATE = mdy(c("7/8/2013",  "8/1/2013", "8/30/2013", "10/23/2013", 
                                       "12/16/2013", "12/16/2015")), 
                  FIRST = c(0, 24, 53, 107, 161, 891), BETWEEN = c(0, 24, 29, 54, 54, 730))
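
# Using the dplyr package
library(dplyr)
df1 %>%  # your data frame (DATA in the question)
  mutate(BETWEEN0 = as.numeric(difftime(DATE, lag(DATE, 1), units = "days")),  # NA in the first row
         BETWEEN = ifelse(is.na(BETWEEN0), 0, BETWEEN0),                       # replace that NA with 0
         FIRST = cumsum(BETWEEN)) %>%                                          # running total = days since first date
  select(-BETWEEN0)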
            DATE BETWEEN FIRST
    1 2013-07-08       0     0
    2 2013-08-01      24    24
    3 2013-08-30      29    53
    4 2013-10-23      54   107
    5 2013-12-16      54   161
    6 2015-12-16     730   891
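
The same two columns can also be computed without the temporary BETWEEN0 column. The following is only a minimal sketch, not the answer's own method: it assumes DATE is already a Date column, and it relies on dplyr's first() and on the default argument of lag() so that the first row is compared with itself.

```r
library(dplyr)

# Question data, rebuilt here with base as.Date() so the example is self-contained
DATA <- data.frame(DATE = as.Date(c("2013-07-08", "2013-08-01", "2013-08-30",
                                    "2013-10-23", "2013-12-16", "2015-12-16")))

OUT <- DATA %>%
  mutate(
    # days elapsed since the first date
    FIRST = as.numeric(DATE - first(DATE)),
    # days since the previous date; row 1 is compared with itself, giving 0
    BETWEEN = as.numeric(DATE - lag(DATE, default = first(DATE)))
  )

print(OUT)
```

Subtracting one Date from another always yields a difference in days, so no explicit units argument is needed here.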

This will get you what you want:

d <- as.Date(DATA$DATE, format="%m/%d/%Y")

first <- c()
for (i in seq_along(d))
    first[i] <- d[i] - d[1]

between <- c(0, diff(d))

This uses the as.Date() function in the base package to cast the vector of string dates to date values using the given format. Since you have dates as month/day/year, you specify format="%m/%d/%Y" to make sure it's interpreted correctly.

diff() is the lagged difference. Since it's lagged, it doesn't include the difference between element 1 and itself, so you can concatenate a 0.

Differences between Date objects are given in days by default.

Then constructing the output dataframe is simple:

RESULTS <- data.frame(DATE=DATA$DATE, FIRST=first, BETWEEN=between)
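
Because subtraction on Date vectors is vectorized, the loop above is optional. As a hedged sketch (the variable names below are illustrative, and the dates are re-entered as strings so the snippet runs on its own), both columns can be computed in one line each:

```r
# Parse the month/day/year strings from the question into Date objects
d <- as.Date(c("7/8/2013", "8/1/2013", "8/30/2013", "10/23/2013",
               "12/16/2013", "12/16/2015"), format = "%m/%d/%Y")

# Days since the first date: d[1] is recycled against every element of d
first_days <- as.numeric(d - d[1])

# Days between consecutive dates: diff() returns n - 1 values, so prepend a 0
between_days <- c(0, as.numeric(diff(d)))

RESULTS2 <- data.frame(DATE = d, FIRST = first_days, BETWEEN = between_days)
print(RESULTS2)
```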

For the first part:

DATA = data.frame((c("7/8/2013",  "8/1/2013", "8/30/2013", "10/23/2013","12/16/2013", "12/16/2015")))
names(DATA)[1] = "V1"
date = as.Date(DATA$V1, format="%m/%d/%Y")
print(date-date[1])

Result:

Time differences in days
[1]   0  24  53 107 161 891

For the second part, simply use a for loop that subtracts each date from the one after it.
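
The answer does not show that loop, so the following is only one possible sketch of it. The names dates and between are illustrative, and the first element is left at 0 to match the desired BETWEEN column in the question.

```r
# Parse the month/day/year strings from the question into Date objects
dates <- as.Date(c("7/8/2013", "8/1/2013", "8/30/2013", "10/23/2013",
                   "12/16/2013", "12/16/2015"), format = "%m/%d/%Y")

# Pre-allocate the result; element 1 stays 0 because it has no predecessor
between <- numeric(length(dates))

# Start at the second element and subtract the previous date from each one
for (i in seq_along(dates)[-1]) {
  between[i] <- as.numeric(dates[i] - dates[i - 1])
}

print(between)
# [1]   0  24  29  54  54 730
```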

You can just add each column with the simple difftime and lagged diff calculations.

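# Days since the first date; the leading 0 is the first date compared with itself
DATA$FIRST <- c(0, 
                with(DATA, 
                     difftime(DATE[2:length(DATE)], DATE[1], units = "days")
                     )
                )
# Days between consecutive dates; diff() over the full column gives n - 1 values
DATA$BETWEEN <- c(0, 
                  with(DATA, 
                       diff(DATE)
                       )
                  )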

identical(DATA, RESULTS)
[1] TRUE
