
Mutate time variables with conditions

I have been unsuccessfully trying to form a new variable using two dates and conditions. Specifically, I have a dataframe like this:

  ID    MedicationDate   NumberOfPackages
 X001   2011-01-12             3    
 X001   2011-01-12             3    
 X001   2011-01-12             3    
 X001   2013-02-23             1    
 X001   2013-03-02             1    

Here MedicationDate is the date the medicine was bought and NumberOfPackages is the number of packages bought on that date. What I need is a new variable giving the date until which the medicine will last, assuming that one package lasts one month. When packages are bought after the previous supply has run out, the case is easy and the new variable has the expected values. My code gives the result below, but the last row is not what I expect:

  ID    MedicationDate   NumberOfPackages     LastDate
 X001   2011-01-12             3             2011-04-12
 X001   2011-01-12             3             2011-04-12
 X001   2011-01-12             3             2011-04-12
 X001   2013-02-23             1             2013-03-23
 X001   2013-03-02             1             2013-04-02

Since the last purchase is made before the previous supply has run out (21 days remain on 2013-03-02), the last date should be 2013-04-23. I can get the right answer by running this code:

as.Date("2013-03-02") %m+% months(1) %m+% days(as.numeric(difftime(as.Date("2013-03-23"),as.Date("2013-03-02"), units = "days")))

But trying to use it with conditions for the whole dataframe doesn't seem to work.

library(dplyr)
library(lubridate)

test <- test %>%
  group_by(ID) %>%
  arrange(MedicationDate) %>%
  mutate(LastDate =
           case_when(
             lag(MedicationDate) == MedicationDate | is.na(lag(LastDate)) | lag(LastDate) <= MedicationDate ~ as.Date(MedicationDate) %m+% months(NumberOfPackages),
             TRUE ~ as.Date(MedicationDate) %m+% months(NumberOfPackages) %m+% days(as.numeric(difftime(as.Date(lag(LastDate)),as.Date(MedicationDate), units = "days"))) 
           ) 
  ) 

It seems that the LastDate value is always calculated by the first formula. I would appreciate any help on how to compute the needed value.

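You can add the packages as months with lubridate's months(), like so. Note that this covers the simple case only: the last row still comes out as 2013-04-02, not the 2013-04-23 asked for.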

library(lubridate)
library(dplyr)
df %>% 
  mutate(LastDate = MedicationDate + months(NumberOfPackages))

#     ID MedicationDate NumberOfPackages   LastDate
# 1 X001     2011-01-12                3 2011-04-12
# 2 X001     2011-01-12                3 2011-04-12
# 3 X001     2011-01-12                3 2011-04-12
# 4 X001     2013-02-23                1 2013-03-23
# 5 X001     2013-03-02                1 2013-04-02
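
To carry the unused days over, each row's end date has to depend on the end date computed for the previous row, which a single vectorised mutate() with lag() cannot express, because LastDate is not yet filled in while it is being computed. The following is a minimal sketch, not a definitive solution: supply_end() is a hypothetical helper written for this example, and it assumes that repeated rows with the same MedicationDate describe one purchase rather than several.

```r
library(dplyr)
library(lubridate)

# Sample data from the question
test <- data.frame(
  ID = "X001",
  MedicationDate = as.Date(c("2011-01-12", "2011-01-12", "2011-01-12",
                             "2013-02-23", "2013-03-02")),
  NumberOfPackages = c(3, 3, 3, 1, 1)
)

# Hypothetical helper: walks the purchases of one ID in date order and
# carries the days left on the previous supply into the next one.
supply_end <- function(dates, packages) {
  end <- as.Date(rep(NA, length(dates)))
  for (i in seq_along(dates)) {
    # End date if nothing is left over: one month per package
    base <- dates[i] %m+% months(packages[i])
    if (i == 1) {
      end[i] <- base
    } else if (dates[i] == dates[i - 1]) {
      # Assumption: a repeated date is the same purchase, so reuse its end date
      end[i] <- end[i - 1]
    } else if (end[i - 1] <= dates[i]) {
      # Previous supply already ran out
      end[i] <- base
    } else {
      # Previous supply still running: add its remaining days
      end[i] <- base + as.numeric(end[i - 1] - dates[i])
    }
  }
  end
}

result <- test %>%
  arrange(ID, MedicationDate) %>%
  group_by(ID) %>%
  mutate(LastDate = supply_end(MedicationDate, NumberOfPackages)) %>%
  ungroup()

# LastDate: 2011-04-12, 2011-04-12, 2011-04-12, 2013-03-23, 2013-04-23
```

The explicit loop is used because it keeps the three cases from the question readable. The same recurrence could also be written with Reduce() or purrr::accumulate().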

