简体   繁体   中英

How do I calculate a rolling sum with variable levels of depletion in R?

I'm trying to use R to measure how many days supply of a prescription an individual already has on-hand when they make a refill, taking into account all previous prescriptions. For example, if I had this table...

  member rx_id  fill_date    to_date days_supply
1      A     1 2018-10-01 2018-10-02           2
2      B     1 2016-11-07 2016-11-10           4
3      B     2 2016-11-07 2016-12-04          28
4      B     3 2016-11-08 2016-11-09           2
5      B     4 2016-11-10 2016-12-03          24

I'd expect the following output

  member rx_id  fill_date    to_date days_supply_on_hand
1      A     1 2018-10-01 2018-10-02                   0
2      B     1 2016-11-07 2016-11-10                   0
3      B     2 2016-11-07 2016-12-04                   4
4      B     3 2016-11-08 2016-11-09                  30
5      B     4 2016-11-10 2016-12-03                  26

For member B, when the second script is filled on the same day as the first script, the individual already has 4 days worth of RX on hand. When the third script is filled, the individual has 3 days left from the first script and 27 left from the second (30 total). When the fourth script is filled, the third script is depleted, but there is 1 day left from the first script and 25 from the third script (26 total).

I know how to do rolling totals in both dplyr and data.table, but I can't figure out how to take into account variable levels of depletion based on previous records on an individual by individual basis. Below is code to remake the original table, thanks in advance for any suggestions!

structure(list(member = structure(c(1L, 2L, 2L, 2L, 2L), .Label = 
c("A", 
"B"), class = "factor"), rx_id = c(1, 1, 2, 3, 4), fill_date = 
structure(c(17805, 
17112, 17112, 17113, 17115), class = "Date"), to_date = 
structure(c(17806, 
17115, 17139, 17114, 17138), class = "Date"), days_supply = c(2, 
4, 28, 2, 24)), .Names = c("member", "rx_id", "fill_date", 
"to_date", 
"days_supply"), row.names = c(NA, -5L), class = "data.frame")
library(data.table)
dt = as.data.table(your_df) # or setDT to convert in place

# merge on relevant days, then compute sum of supply - days elapsed
dt[dt, on = .(member, fill_date <= fill_date, to_date >= fill_date, rx_id < rx_id), by = .EACHI,
   sum(days_supply, na.rm = T) - sum(i.fill_date - x.fill_date, na.rm = T)]
#   member  fill_date    to_date rx_id      V1
#1:      A 2018-10-01 2018-10-01     1  0 days
#2:      B 2016-11-07 2016-11-07     1  0 days
#3:      B 2016-11-07 2016-11-07     2  4 days
#4:      B 2016-11-08 2016-11-08     3 30 days
#5:      B 2016-11-10 2016-11-10     4 26 days

Using a simple loop

dt$days_supply_on_hand <- 0
for (a in unique(dt$member)) {
  I <- which(.subset2(dt,1) == a)
  flDate <- as.integer(.subset2(dt,3)[I])
  toDate <- as.integer(.subset2(dt,4)[I])
  V <- vapply(seq_along(I), function (k) sum(toDate[1:(k-1)] - flDate[k] + 1), numeric(1))
  dt$days_supply_on_hand[I] <- c(0,V[-1])
}
dt
  member rx_id  fill_date    to_date days_supply days_supply_on_hand
1      A     1 2018-10-01 2018-10-02           2                   0
2      B     1 2016-11-07 2016-11-10           4                   0
3      B     2 2016-11-07 2016-12-04          28                   4
4      B     3 2016-11-08 2016-11-09           2                  30
5      B     4 2016-11-10 2016-12-03          24                  26

where dt is data frame provided above. (Note that the use of .subset2 or as.integer is for efficiency purposes - they can be changed for more readability).

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM