Conditional Cumulative Sum over the same row in R
I have a dataset like this:
dat <- data.frame(Col0 =rep(c("grp1","grp2","grp3", "grp4"), each = 4),
Col1 = rep(c("B","S","S","B"), 4),
Col2 = rep(c(1,2,3,4), 4),
Col3 = rep(c(0.1,0.2,0.3,0.4), 4))
I'm trying to create a fourth column, as shown below:
dat1 <- data.frame(Col0 =rep(c("grp1","grp2","grp3", "grp4"), each = 4),
Col1 = rep(c("B","S","S","B"), 4),
Col2 = rep(c(1,2,3,4), 4),
Col3 = rep(c(0.1,0.2,0.3,0.4), 4),
Col4 = rep(c(1, 0.8, 1.26, 4), 4))
What I have tried so far:
library(dplyr)
d1 <- dat %>%
group_by(Col0) %>%
mutate(Col4 = if_else(Col1 == 'B', Col2,
if_else(Col1 == 'S' & lag(Col1 == "B"), lag(Col2)- Col3*lag(Col2), 0)))
d1
The answer I'm getting does not match the desired Col4. The conditions for computing Col4 are:
if Col1 is B, take the value of Col2 as it is;
if Col1 is S and the previous value of Col1 is B, then 1-(0.2*1), which equals 0.8;
if Col1 is S and the previous value of Col1 is also S, then (1+0.8)-((1+0.8)*0.3), which is 1.26.
Basically, it's like first taking the difference and then taking a cumulative sum that includes that difference, and so on.
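As a quick sanity check, the three rules can be evaluated by hand for a single group. This is only a sketch of the arithmetic stated above; the names col2, col3, row1 to row4, and col4 are illustrative and not part of the original data.

```r
# Values for one group, copied from the example data
col2 <- c(1, 2, 3, 4)
col3 <- c(0.1, 0.2, 0.3, 0.4)

row1 <- col2[1]                                   # B: take Col2 as it is
row2 <- row1 - col3[2] * row1                     # S after B: 1 - (0.2 * 1)
row3 <- (row1 + row2) - (row1 + row2) * col3[3]   # S after S: (1 + 0.8) - ((1 + 0.8) * 0.3)
row4 <- col2[4]                                   # B again: take Col2 as it is

col4 <- c(row1, row2, row3, row4)
print(col4)
```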
For now, I have taken a simple example to illustrate what I'm trying to achieve. The actual dataset has more than 1 million observations and several thousand groups, and what's worse is that the combination of 'B' and 'S' varies. In some groups it's
B,B,S,S
and so on.
Any help on this will be appreciated. I have tried several things other than if_else() and looked at many conditional cumulative sum questions as well, but to no avail.
I think the same could be done easily in Excel with the SUMIF() function, but I need to do this in R.
It feels like you didn't complete the if_else:
library(dplyr)

dat <- data.frame(Col0 = rep(c("grp1","grp2","grp3", "grp4"), each = 4),
                  Col1 = rep(c("B","S","S","B"), 4),
                  Col2 = rep(c(1,2,3,4), 4),
                  Col3 = rep(c(0.1,0.2,0.3,0.4), 4))
# Nested if_else with one branch per condition.
# Note: the S branches use the literal values from the example (0.8 and 1.26).
d1 <- dat %>%
  group_by(Col0) %>%
  mutate(Col4 = if_else(Col1 == 'B', Col2,
                if_else(Col1 == 'S' & lag(Col1) == "B", 1-(0.2*1),
                if_else(Col1 == 'S' & lag(Col1) == 'S', 1.26, 0))))
d1
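The literals 1-(0.2*1) and 1.26 in the snippet above reproduce this particular example only. Read as a recurrence, the stated rules seem to say: a B row takes Col2, and an S row takes the running sum of the Col4 values computed so far in the group, multiplied by (1 - Col3). The sketch below implements that reading with a plain loop per group. It rests on two assumptions not confirmed by the question: that the running sum covers all earlier rows of the group (it does not reset at a later B), and that a group starting with S should start from a sum of 0. The helper name running_col4 is hypothetical.

```r
library(dplyr)

# Hypothetical helper: computes Col4 for one group, row by row.
# flag = Col1 ("B"/"S"), value = Col2, rate = Col3.
running_col4 <- function(flag, value, rate) {
  out <- numeric(length(flag))
  total <- 0  # assumption: sum of all Col4 values produced so far in this group
  for (i in seq_along(flag)) {
    if (flag[i] == "B") {
      out[i] <- value[i]               # B: take Col2 as it is
    } else {
      out[i] <- total * (1 - rate[i])  # S: running sum minus rate * running sum
    }
    total <- total + out[i]
  }
  out
}

dat <- data.frame(Col0 = rep(c("grp1", "grp2", "grp3", "grp4"), each = 4),
                  Col1 = rep(c("B", "S", "S", "B"), 4),
                  Col2 = rep(c(1, 2, 3, 4), 4),
                  Col3 = rep(c(0.1, 0.2, 0.3, 0.4), 4))

res <- dat %>%
  group_by(Col0) %>%
  mutate(Col4 = running_col4(Col1, Col2, Col3)) %>%
  ungroup()

print(res)
```

A loop is used because each S row depends on previously computed Col4 values, which lag() on the input columns cannot see. If the intended rule for patterns such as B,B,S,S differs from the assumption above, only the update of total inside the loop would need to change.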