How to create a new column that consecutively sums from another column in R?
I have a data table with 3 columns: customer_id, time_period and bought_cookies (0 if no, 1 if yes). I want to create a new column (total_number_cookie_buyers) that sums the previous rows of bought_cookies to see how many people have bought cookies up to that point in time (e.g. if the first three rows of bought_cookies were 0, 1, 0, then the value in the third row of total_number_cookie_buyers would be 1). I've tried googling but can't find anything on how to do this!
The approach you are looking for is called a cumulative sum, which base R provides as cumsum(). I think it is the solution.
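As a minimal sketch of what a cumulative sum does, here is cumsum() applied to the 0, 1, 0 example from the question. The variable names are borrowed from the question and the three values are only that example, not real data.

```r
# The first three rows from the question's example
bought_cookies <- c(0, 1, 0)

# cumsum() returns the running total up to and including each element
total_number_cookie_buyers <- cumsum(bought_cookies)
print(total_number_cookie_buyers)
# [1] 0 1 1
```

The third value is 1, which matches what the question expects for the third row.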
# Build 21 customer IDs: ID_001 ... ID_021
cust_id <- sprintf("ID_%03d", 1:21)
# 21 distinct days in January 2020, sorted chronologically
date <- sort(sample(seq(as.Date("2020-01-01"), as.Date("2020-01-21"), by = "day"), 21))
# Random 0/1 purchase indicator (values differ between runs)
sales <- rbinom(21, 1, 0.5)
df <- data.frame(cust_id = cust_id, date = date, sales = sales)
# Running total of sales up to and including each row
df$salesdate <- cumsum(df$sales)
  cust_id       date sales salesdate
1  ID_001 2020-01-01     0         0
2  ID_002 2020-01-02     0         0
3  ID_003 2020-01-03     0         0
4  ID_004 2020-01-04     1         1
5  ID_005 2020-01-05     1         2
6  ID_006 2020-01-06     0         2
7  ID_007 2020-01-07     1         3
...
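Applied to the column names from the question, the same idea might look like the sketch below. The data frame name cookies and its five rows are hypothetical and made up for illustration. It also assumes each customer appears only once; if a customer can buy in several periods, cumsum() counts purchases rather than distinct buyers.

```r
# Hypothetical data using the column names from the question
cookies <- data.frame(
  customer_id    = c("A", "B", "C", "D", "E"),
  time_period    = c(3, 1, 2, 5, 4),
  bought_cookies = c(0, 0, 1, 1, 1)
)

# cumsum() follows row order, so sort by time_period first
cookies <- cookies[order(cookies$time_period), ]

# Running count of buyers up to and including each time period
cookies$total_number_cookie_buyers <- cumsum(cookies$bought_cookies)
print(cookies)
```

Sorting before calling cumsum() matters because the running total is taken in whatever order the rows happen to be in.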