
Sum Blocks of Positive Values in R

I have a large data set: 150k rows, ~11 MB in size. Each row contains an hourly measure of profit, which can be positive, negative, or zero. I am trying to calculate a new variable equal to the total profit of each positive "block", i.e. each run of consecutive rows with positive profit. Rows outside a positive block should get 0. The data set below shows the result I am after.

"profit" is the input variable. I can get the next two columns ("indic_pos" and "cum_profit") but can't solve for "profit_block". Any help would be much appreciated!

dat <- data.frame(profit = c(20, 10, 5, 10, -20, -100, -40, 500, 27, -20),
                  indic_pos = c( 1, 1, 1, 1, 0, 0, 0, 1, 1, 0),
                  cum_profit = c(20, 30, 35, 45, 0, 0, 0, 500, 527, 0),
                  profit_block = c(45, 45, 45, 45, 0, 0, 0, 527, 527, 0))

   profit indic_pos cum_profit profit_block
1      20         1         20           45
2      10         1         30           45
3       5         1         35           45
4      10         1         45           45
5     -20         0          0            0
6    -100         0          0            0
7     -40         0          0            0
8     500         1        500          527
9      27         1        527          527
10    -20         0          0            0

I've found the post below very helpful, but I can't quite adapt it to my need here. Thanks again.

Related URL: Assigning a value to each range of consecutive numbers with same sign in R

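We can use rleid (from the data.table package) to create a grouping variable based on the sign of the column, i.e. adjacent elements with the same sign form a single group. Within each group, the max of 'cum_profit' is the block total: the running sum peaks at the last row of a positive block, and 'cum_profit' is 0 everywhere else.

library(dplyr)
library(data.table)  # provides rleid()
dat %>% 
    # one group per run of adjacent rows sharing the same sign
    group_by(grp = rleid(sign(profit))) %>% 
    # the last cumulative value in a run is the block total
    mutate(profit_block2 = max(cum_profit)) %>%
    ungroup() %>%
    select(-grp)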

Output:

# A tibble: 10 x 5
#   profit indic_pos cum_profit profit_block profit_block2
#    <dbl>     <dbl>      <dbl>        <dbl>         <dbl>
# 1     20         1         20           45            45
# 2     10         1         30           45            45
# 3      5         1         35           45            45
# 4     10         1         45           45            45
# 5    -20         0          0            0             0
# 6   -100         0          0            0             0
# 7    -40         0          0            0             0
# 8    500         1        500          527           527
# 9     27         1        527          527           527
#10    -20         0          0            0             0
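
The approach above relies on 'cum_profit' already being in the data. As a minimal sketch of an alternative, the block totals could also be derived from 'profit' alone in base R, using rle() to label the runs and ave() to sum within each run. The names is_pos and run_id are illustrative only, and the sketch assumes 'profit' contains no NA values. Note that it groups on profit > 0 rather than on sign(profit), so zero and negative rows fall into the same run; both end up as 0 either way.

```r
# Sample data from the question (profit column only).
profit <- c(20, 10, 5, 10, -20, -100, -40, 500, 27, -20)

# TRUE for rows that belong to a positive block (assumes no NA in profit).
is_pos <- profit > 0

# Give every run of consecutive equal is_pos values its own id: 1, 2, 3, ...
runs   <- rle(is_pos)
run_id <- rep(seq_along(runs$lengths), runs$lengths)

# Sum profit within each run, then keep the sum only for positive runs.
profit_block <- ifelse(is_pos, ave(profit, run_id, FUN = sum), 0)

profit_block
# [1]  45  45  45  45   0   0   0 527 527   0
```

With the question's data frame, the same expression could be assigned as a column, e.g. dat$profit_block2 <- ifelse(...), without attaching any package.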
