Successive sum of a column only if another column has the right value in R
I currently have a data frame that looks like this (Time is in seconds and Zone1 is a boolean flag):
Time Zone1
1 0
3 0
4 1
5 1
6 1
7 0
9 1
10 1
I'd like to sum the values over each run of consecutive rows that meet the criterion, so that I get something like this:
Time Zone1 TimeInZone
1 0 NA
3 0 NA
4 1 2
5 1 2
6 1 2
7 0 NA
9 1 1
10 1 1
I can't work out how to do this. How can I approach it? Thanks.
EDITED: more accurate data frame.
I'm not entirely sure where the last two rows came from, but here's my take on it:
library(data.table)
df <- data.table(Value=c(3,4,1,1,2), Criteria=c(1,1,2,1,3))
# First, generate a logical vector that indicates whether the criterion changed from the previous row:
df[, changed:=c(TRUE, Criteria[-1] != Criteria[-length(Criteria)])]
# Then, calculate the cumulative sum to get an index:
df[, index:=cumsum(changed)]
# Calculate the sum for each level of index:
df[, Sum:=sum(Value), by=index]
# print everything:
print(df)
Result:
   Value Criteria changed index Sum
1:     3        1    TRUE     1   7
2:     4        1   FALSE     1   7
3:     1        2    TRUE     2   1
4:     1        1    TRUE     3   1
5:     2        3    TRUE     4   2
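As a more compact variant of the steps above, data.table's `rleid()` builds the same run index in one call. This is only a sketch of an alternative, not a replacement for the explicit version; the table name `dt_alt` is a hypothetical name chosen here so that `df` is left untouched.

```r
library(data.table)

# Same example data as above, under a separate (hypothetical) name
dt_alt <- data.table(Value = c(3, 4, 1, 1, 2), Criteria = c(1, 1, 2, 1, 3))

# rleid() gives each run of identical consecutive Criteria values its own id,
# which plays the role of cumsum(changed) in the step-by-step version
dt_alt[, Sum := sum(Value), by = rleid(Criteria)]

print(dt_alt)
```

The `Sum` column should match the one computed step by step, without the intermediate `changed` and `index` columns.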
To attach the sum of the previous block to each row, use a keyed data.table join:
setkey(df, index)
nextblocksums <- df[index!=max(index), .(index=index+1,nextBlockSum=Sum)]
df[ nextblocksums , LastBlocksSum:=i.nextBlockSum]
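The same run-index idea might be applied to the edited Time/Zone1 data frame from the question. The sketch below assumes that TimeInZone means the last Time minus the first Time within each run of consecutive Zone1 == 1 rows, which is inferred from the expected output (6 - 4 = 2 and 10 - 9 = 1) and not stated in the question. The names `zones` and `run` are hypothetical.

```r
library(data.table)

# Data from the question (Time in seconds, Zone1 as a 0/1 flag)
zones <- data.table(Time  = c(1, 3, 4, 5, 6, 7, 9, 10),
                    Zone1 = c(0, 0, 1, 1, 1, 0, 1, 1))

# Give each run of identical consecutive Zone1 values its own id
zones[, run := rleid(Zone1)]

# Assumption: time in zone = last Time minus first Time of the run;
# runs with Zone1 == 0 get NA
zones[, TimeInZone := if (Zone1[1] == 1) max(Time) - min(Time) else NA_real_,
      by = run]

print(zones)
```

If TimeInZone is meant to be something else, such as a row count or a sum of time differences, only the expression on the right-hand side of `:=` needs to change.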