
How to create a new column that consecutively sums from another column in R?

I have a data table with 3 columns: customer_id, time_period and bought_cookies (0 if no, 1 if yes). I want to create a new column (total_number_cookie_buyers) that sums bought_cookies over the current and all previous rows, to see how many people have bought cookies up to that point in time (e.g., if the first three rows of bought_cookies were 0, 1, 0, then the value in the third row of total_number_cookie_buyers would be 1). I've tried googling but can't find anything on how to do this!

What you are looking for is called a cumulative sum, which base R provides as cumsum(). Here is an example on made-up data:
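As a quick check against the 0, 1, 0 example from the question, here is a minimal sketch using the question's own column names as plain vectors (no data frame yet):

```r
# The three example rows from the question
bought_cookies <- c(0, 1, 0)

# cumsum() returns the running total up to and including each position
total_number_cookie_buyers <- cumsum(bought_cookies)

print(total_number_cookie_buyers)
# [1] 0 1 1
```

The third value is 1, which matches what the question expects.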

# Build 21 example customer IDs: ID_001 ... ID_021
cust_id <- sprintf("ID_%03d", 1:21)
# One row per day, in chronological order
date <- seq(as.Date("2020-01-01"), as.Date("2020-01-21"), by = "day")
# Random 0/1 purchase indicator, so the values differ on every run
sales <- rbinom(21, 1, 0.5)
df <- data.frame(cust_id = cust_id, date = date, sales = sales)
# Running total of sales up to and including each row
df$salesdate <- cumsum(df$sales)

  cust_id       date sales salesdate
1  ID_001 2020-01-01     0         0
2  ID_002 2020-01-02     0         0
3  ID_003 2020-01-03     0         0
4  ID_004 2020-01-04     1         1
5  ID_005 2020-01-05     1         2
6  ID_006 2020-01-06     0         2
7  ID_007 2020-01-07     1         3
...
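Applied to the columns named in the question, the same idea might look like the sketch below. The data values are invented for illustration, and it assumes each customer appears in only one row; if a customer can appear in several time periods, cumsum() counts purchases rather than distinct buyers. Because cumsum() follows row order, the rows are sorted by time_period first.

```r
# Hypothetical data using the question's column names (values are made up)
cookies <- data.frame(
  customer_id    = c("ID_003", "ID_001", "ID_002", "ID_004"),
  time_period    = c(3, 1, 2, 4),
  bought_cookies = c(0, 0, 1, 1)
)

# Sort chronologically so the running total reflects time order
cookies <- cookies[order(cookies$time_period), ]

# Running count of cookie purchases up to and including each row
cookies$total_number_cookie_buyers <- cumsum(cookies$bought_cookies)

print(cookies)
```

After sorting, bought_cookies reads 0, 1, 0, 1, so the new column is 0, 1, 1, 2.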

