Cumsum with a reset and a delay

Question

I'm looking for a way to calculate delinquency buckets. I have figured out how to reset the cumsum, but I'm stuck on how to "delay" the cumsum based on a trigger. In the example below, correct_bucket is my desired result:

df <- data.frame(id = c(1,1,1,1,2,2,3,3,3,3,4,4,4,4,5,5,5,5,5,5,5,5,5,5,6,6,6,6,7,7,7,7,7,8,8,8,8),
             min_due = c(25,50,50,75,25,50,25,50,25,25,25,50,75,100,25,50,75,100,100,25,50,25,14.99,0,25,60,60,0,25,50,75,100,75,25,50,25,50),
             payment = c(0,0,25,0,0,0,0,0,50,25,0,0,0,0,0,0,0,0,25,100,0,150,25,14.99,0,25,60,60,0,0,0,0,50,0,0,25,0),
             past_due_amt = c(0,25,25,50,0,25,0,25,0,0,0,25,50,75,0,25,50,75,75,0,25,0,0,0,0,0,0,0,0,25,50,75,50,0,25,0,25),
             correct_bucket = c(0,1,1,2,0,1,0,1,0,0,0,1,2,3,0,1,2,3,3,0,1,0,0,0,0,0,0,0,0,1,2,3,2,0,1,0,1))

Explanation of correct_bucket: It indicates, by ID, whether the min_due was satisfied by a payment greater than or equal to the previous (lag-1) min_due. For example, ID #1 has a min_due of 25 on row 1 and a payment of 0 on row 2, so correct_bucket = 1 on row 2. As the examples show, correct_bucket needs to move both up and down depending on whether a payment was made and how much.

Thoughts? Please ask any clarifying questions you need, I'm suuuuuper close and any additional help is appreciated!

Thanks!

Answer 1

df$original_order = 1:nrow(df) # Optional: keep the original row order in case you need it later

#Obtain the incremental min_due for each id
df$b2 = unlist(lapply(split(df, df$id), function(a) c(0, diff(a$min_due)))) 

#Function to get your values from incremental min_due
ff = function(x){
    x$b3 = 0
    for (i in 2:NROW(x)){
        if (x$b2[i] > 0){
            x$b3[i] = x$b3[i-1] + 1
        }
        if (x$b2[i] == 0){
            x$b3[i] = x$b3[i-1]
        }
        if (x$b2[i] < 0){
            x$b3[i] = 0
        }
    }
    return(x)
}

#Split df by id and use the above function on each sub group
#'b3' is the value you want
do.call(rbind, lapply(split(df, df$id), function(a) ff(a)))
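One thing to watch in the b2 step: unlist(lapply(split(...))) returns values grouped by id, so it only lines up with df when df is already sorted by id (as it is in the question). A minimal sketch of an order-preserving alternative, assuming base R's ave() is acceptable here; the toy data frame is made up purely for illustration:

```r
# Hypothetical toy data, deliberately NOT sorted by id
toy <- data.frame(id = c(2, 1, 2, 1),
                  min_due = c(25, 25, 50, 75))

# ave() applies FUN within each id and writes the results back
# into the original row positions
toy$b2 <- ave(toy$min_due, toy$id, FUN = function(v) c(0, diff(v)))

# For comparison: the split/unlist version returns values in id order,
# which does not match the row order of toy
grouped_order <- unname(unlist(lapply(split(toy, toy$id),
                                      function(a) c(0, diff(a$min_due)))))

print(toy)
print(grouped_order)
```

With sorted input both versions should give the same b2, so this only matters if the rows can arrive in a different order.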

NEW ff (returns early for ids with a single row, where 2:NROW(x) would otherwise evaluate to c(2, 1))

ff = function(x){
    x$b3 = 0

    if(NROW(x) < 2){
        return(x)
    }

    for (i in 2:NROW(x)){
        if (x$b2[i] > 0){
            x$b3[i] = x$b3[i-1] + 1
        }
        if (x$b2[i] == 0){
            x$b3[i] = x$b3[i-1]
        }
        if (x$b2[i] < 0){
            x$b3[i] = 0
        }
    }
    return(x)
}
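To see how closely b3 tracks the desired column, it may help to compare the two directly. The sketch below rebuilds the question's sample data (only the columns needed), applies the same logic as the NEW ff above, and lists the ids where b3 and correct_bucket disagree. The name mismatch_ids is just for illustration.

```r
# Sample data from the question (only the columns needed for the check)
df <- data.frame(
  id = c(1,1,1,1,2,2,3,3,3,3,4,4,4,4,5,5,5,5,5,5,5,5,5,5,6,6,6,6,7,7,7,7,7,8,8,8,8),
  min_due = c(25,50,50,75,25,50,25,50,25,25,25,50,75,100,25,50,75,100,100,25,50,25,14.99,0,25,60,60,0,25,50,75,100,75,25,50,25,50),
  correct_bucket = c(0,1,1,2,0,1,0,1,0,0,0,1,2,3,0,1,2,3,3,0,1,0,0,0,0,0,0,0,0,1,2,3,2,0,1,0,1)
)

# Incremental min_due within each id (rows are already sorted by id here)
df$b2 <- unlist(lapply(split(df, df$id), function(a) c(0, diff(a$min_due))))

# Same logic as the NEW ff: +1 on an increase, carry on no change,
# reset to 0 on a decrease
ff <- function(x) {
  x$b3 <- 0
  if (NROW(x) < 2) {
    return(x)
  }
  for (i in 2:NROW(x)) {
    if (x$b2[i] > 0) {
      x$b3[i] <- x$b3[i - 1] + 1
    } else if (x$b2[i] == 0) {
      x$b3[i] <- x$b3[i - 1]
    } else {
      x$b3[i] <- 0
    }
  }
  x
}

res <- do.call(rbind, lapply(split(df, df$id), ff))

# ids where the computed bucket differs from the desired one
mismatch_ids <- unique(res$id[res$b3 != res$correct_bucket])
print(mismatch_ids)
```

On the sample data this should flag ids 6 and 7. For id 7, for instance, min_due drops from 100 to 75 on the last row, so b3 resets to 0 while correct_bucket is 2. Cases like a partial payment or a changing installment size may therefore need extra handling beyond the sign of the min_due change.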
