Conditional Cumulative Sum over the same row in R

I have a dataset like this:

dat <- data.frame(Col0 =rep(c("grp1","grp2","grp3", "grp4"), each = 4),
              Col1 = rep(c("B","S","S","B"), 4),
              Col2 = rep(c(1,2,3,4), 4),
              Col3 = rep(c(0.1,0.2,0.3,0.4), 4))

I'm trying to create a fourth column, Col4, as shown below:

dat1 <- data.frame(Col0 =rep(c("grp1","grp2","grp3", "grp4"), each = 4),
               Col1 = rep(c("B","S","S","B"), 4),
               Col2 = rep(c(1,2,3,4), 4),
               Col3 = rep(c(0.1,0.2,0.3,0.4), 4),
               Col4 = rep(c(1, 0.8, 1.26, 4), 4))

What I have tried so far:

library(dplyr)

d1 <- dat %>% 
  group_by(Col0) %>% 
  mutate(Col4 = if_else(Col1 == 'B', Col2,
                        if_else(Col1 == 'S' & lag(Col1 == "B"), lag(Col2)- Col3*lag(Col2), 0)))
d1

The answer I'm getting does not match the desired Col4. The conditions for computing Col4 are:

 If Col1 is B, take the value of Col2 as it is.
 If Col1 is S and the previous value of Col1 is B, then 1 - (0.2*1), which equals 0.8.
 If Col1 is S and the previous value of Col1 is S as well, then (1+0.8) - ((1+0.8)*0.3), which is 1.26.

Basically, it's like first computing the difference and then taking a cumulative sum that includes that difference, and so on.
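One way to read the three conditions is as a recurrence on a running total of Col4 within each group: a B row contributes Col2, and an S row contributes the running total so far multiplied by (1 - Col3). The base R sketch below follows that reading. The helper name `col4_for_group` is made up for illustration, and the assumption that the running total covers all earlier rows of the group (rather than resetting at each B) is mine; the sample data cannot distinguish the two.

```r
# Hypothetical helper (not from the original post): computes Col4 for one group.
# ASSUMPTION: the running total sums every earlier Col4 value in the group.
col4_for_group <- function(type, value, rate) {
  out <- numeric(length(type))
  running <- 0                      # sum of the Col4 values produced so far
  for (i in seq_along(type)) {
    if (type[i] == "B") {
      out[i] <- value[i]            # B: take Col2 as it is
    } else {
      out[i] <- running * (1 - rate[i])  # S: total minus total * Col3
    }
    running <- running + out[i]
  }
  out
}

dat <- data.frame(Col0 = rep(c("grp1", "grp2", "grp3", "grp4"), each = 4),
                  Col1 = rep(c("B", "S", "S", "B"), 4),
                  Col2 = rep(c(1, 2, 3, 4), 4),
                  Col3 = rep(c(0.1, 0.2, 0.3, 0.4), 4))

# Apply the helper to each group and put the pieces back in row order.
dat$Col4 <- unsplit(
  lapply(split(dat, dat$Col0),
         function(g) col4_for_group(g$Col1, g$Col2, g$Col3)),
  dat$Col0
)
dat
```

Because the helper only looks at the row type and the running total, it does not depend on a fixed B, S, S, B pattern; whether it gives the intended result for a group such as B, B, S, S depends on the reset assumption stated above.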

For now, I have taken a simple example to illustrate what I'm trying to achieve. The actual dataset has more than 1 million observations and several thousand groups, and what's worse is that the combination of 'B' and 'S' varies: in some groups it's B, B, S, S, and so on.

Any help on this will be appreciated. I have tried several things other than if_else() and looked at many conditional cumulative sum questions as well, but to no avail.

I think the same could be done easily in Excel with the SUMIF() function, but I need to do this in R.

It feels like you didn't complete the if_else:

library(dplyr)

dat <- data.frame(Col0 =rep(c("grp1","grp2","grp3", "grp4"), each = 4),
          Col1 = rep(c("B","S","S","B"), 4),
          Col2 = rep(c(1,2,3,4), 4),
          Col3 = rep(c(0.1,0.2,0.3,0.4), 4))
d1 <- dat %>% 
   group_by(Col0) %>% 
   mutate(Col4 = if_else(Col1 == 'B', Col2,
                    if_else(Col1 == 'S' & lag(Col1) == "B", 1-(0.2*1),
                            if_else(Col1 == 'S' & lag(Col1) == 'S',1.26,0))))
d1
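Note that the two S branches above return the constants 0.8 and 1.26, so they reproduce the sample but would not adapt to other values of Col2 and Col3. A possible generalisation, sketched here with dplyr, folds over the rows with base R's `Reduce()` to build the running total of Col4 and then takes successive differences. The helper name `col4_fold` is hypothetical, and it assumes the running total never resets within a group, which the question does not confirm.

```r
library(dplyr)

# Hypothetical helper (not from the original answer).
# ASSUMPTION: the running total covers all earlier rows of the group.
col4_fold <- function(type, value, rate) {
  # total_i = total_{i-1} + Col2                  for a B row
  # total_i = total_{i-1} + total_{i-1}*(1-Col3)  for an S row
  step <- function(total, i) {
    if (type[i] == "B") total + value[i] else total * (2 - rate[i])
  }
  totals <- Reduce(step, seq_along(type), 0, accumulate = TRUE)
  diff(totals)  # Col4 is the increment of the running total at each row
}

dat <- data.frame(Col0 = rep(c("grp1", "grp2", "grp3", "grp4"), each = 4),
                  Col1 = rep(c("B", "S", "S", "B"), 4),
                  Col2 = rep(c(1, 2, 3, 4), 4),
                  Col3 = rep(c(0.1, 0.2, 0.3, 0.4), 4))

d2 <- dat %>%
  group_by(Col0) %>%
  mutate(Col4 = col4_fold(Col1, Col2, Col3)) %>%
  ungroup()
d2
```

On the sample data this should agree with the desired Col4 up to floating-point rounding. For the real dataset with over a million rows, the per-group fold is still a plain R loop under the hood, so it is worth timing on a subset first.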
