简体   繁体   中英

How can I cumulatively add rows of a data frame until a certain number, then start over at next row?

I am trying to identify records in a data frame that are within 30 minutes of the first record in a 30-minute period so that they can be removed (ie a 30-minute quiet period), and then a new 30-minute period starts with the next record. Each record has a time and date column. I have successfully found the difference in time for each record (they are in order of time already), so I need to add up the differences in time cumulatively until they reach 30 minutes, and then begin adding again in the next row up to 30 minutes, and so on. How can I cumulatively add the rows of a data frame (time differences) in a new column, starting over each time it reached 30 minutes?

This is what the data looks like. ('difftime' is time since last record in seconds, so 1800 s = 30 min)

Species   Time                   difftime
DEER      2019-06-12 00:20:00     NA
DEER      2019-06-12 03:41:00     12060
DEER      2019-06-12 04:30:00     2940
DEER      2019-06-12 04:58:00     1680
DEER      2019-06-12 05:08:00     600
DEER      2019-06-12 05:10:00     120
DEER      2019-06-12 05:21:00     660
DEER      2019-06-12 05:23:00     120
DEER      2019-06-12 05:44:00     1260
DEER      2019-06-12 05:45:00     60

I need to add a new column that cumulatively adds up 'difftime' until it is greater than 1800, at which points it starts back over at 0. So it would look like this ('qp' = quiet period)

Species   Time                   difftime    qp
DEER      2019-06-12 00:20:00     NA         0
DEER      2019-06-12 03:41:00     12060      0
DEER      2019-06-12 04:30:00     2940       0
DEER      2019-06-12 04:58:00     1680       1680
DEER      2019-06-12 05:08:00     600        0
DEER      2019-06-12 05:10:00     120        120
DEER      2019-06-12 05:21:00     660        720
DEER      2019-06-12 05:23:00     120        840
DEER      2019-06-12 05:44:00     1260       0
DEER      2019-06-12 05:45:00     60         60

Each record where qp=0 signals the beginning of a 30-minute quiet period, meaning all records in the next 30 minutes should be disregarded. At this point, I would keep all records where qp = 0, and remove those where qp > 0 because they fall within the 30-minute quiet period of an initial record.

So, how can I create the 'qp' column?

Reduce(function(x, y) ifelse(x + y > 1800, 0, x + y),
       replace(d$difftime, is.na(d$difftime), 0),
       accumulate = TRUE)
# [1]    0    0    0 1680    0  120  780  900    0   60

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM