
Adding Rows based on TimeSeries Data using R

Consider the following dataset:

scd <- read.table(text = "
2019-04-01 10:00:00 | 2019-04-01 12:00:00 | 10
2019-04-02 10:00:00 | 2019-04-02 12:00:00 | 5
2019-04-03 13:00:00 | 2019-04-03 15:00:00 | 7
2019-04-04 16:00:00 | 2019-04-04 19:00:00 | 5
2019-04-05 10:00:00 | 2019-04-05 12:00:00 | 6
2019-04-06 10:00:00 | 2019-04-06 12:00:00 | 5", sep = "|", strip.white = TRUE)

colnames(scd) <- c('start_date_ts', 'end_date_ts', 'people_count')

The data above holds a start and end timestamp for each event window, on the assumption that for every hour in a window I can expect the count to increase by the value in the people_count column.

For example, row 1 says that from 10 AM to 12 PM I can expect the count to increase by 10:

2019-04-01 10:00:00 = 10 + Actual Data

2019-04-01 11:00:00 = 10 + Actual Data

2019-04-01 12:00:00 = 10 + Actual Data
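
The hourly expansion described above can be sketched with seq(). This is only an illustration for row 1; the UTC time zone and the names hours and expected_increase are assumptions, not part of the original data.

```r
# Expand the first window into the hourly timestamps it covers.
# tz = "UTC" is an assumption; use the zone your data was recorded in.
start <- as.POSIXct("2019-04-01 10:00:00", tz = "UTC")
end   <- as.POSIXct("2019-04-01 12:00:00", tz = "UTC")

# One timestamp per hour, both endpoints included: 10:00, 11:00, 12:00
hours <- seq(from = start, to = end, by = "hour")

# Each hour in the window carries the same expected increase of 10
expected_increase <- data.frame(pred_t = hours, people_count = 10)
print(expected_increase)
```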

Actual data:

fc_data <- read.table(text = "
2019-04-01 10:00:00 | 10
2019-04-01 12:00:00 | 5
2019-04-04 19:00:00 | 5
2019-04-05 12:00:00 | 6
2019-04-06 08:00:00 | 3", sep = "|", strip.white = TRUE)

colnames(fc_data) <- c('pred_t', 'fpc')

I am expecting the following outcome for each row of fc_data (fpc plus the matching people_count):

Row 1 - 10 + 10 = 20

Row 2 - 5 + 10 = 15

Row 3 - 5 + 5 = 10

Row 4 - 6 + 6 = 12

Row 5 - 3 + 0 = 3

I want the code to go through each row of fc_data, check whether pred_t falls within a start/end window in scd, and produce the output above.

My approach:

fc_data$events_pc <- with(fc_data, ifelse(fc_data$pred_t == scd$start_date_ts | fc_data$pred_t == scd$end_date_ts &
                                        fc_data$pred_t == scd$end_date_ts,
                                      fc_data$fpc + scd$people_count, fc_data$fpc + 0))

Although some of the rows are added up correctly, the others don't match. I have searched Stack Overflow but could not find anything relevant. Any input would be very helpful.
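
One likely reason only some rows match: == compares two vectors position by position, so row i of fc_data is only ever checked against row i of scd, not against every window. A small sketch of the difference, using the first three values of each column (the variable names here are illustrative):

```r
# First three prediction times and first three window end times
pred_t <- c("2019-04-01 10:00:00", "2019-04-01 12:00:00", "2019-04-04 19:00:00")
end_ts <- c("2019-04-01 12:00:00", "2019-04-02 12:00:00", "2019-04-03 15:00:00")

# Position-wise comparison: element i against element i only
positional <- pred_t == end_ts
print(positional)   # FALSE FALSE FALSE

# Membership test: element i against all end times
anywhere <- pred_t %in% end_ts
print(anywhere)     # FALSE  TRUE FALSE
```

Even %in% only finds exact matches on a boundary; a timestamp that falls strictly inside a window still needs a range comparison against the start and end columns.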

We can use mapply to test each pred_t against the start_date_ts and end_date_ts windows in scd, take the people_count of the matching window, and add it to fpc.

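# x is one pred_t value, y is the fpc value from the same row
mapply(function(x, y) {
   # TRUE for every window in scd that contains x (boundaries included)
   inds <- x >= scd$start_date_ts & x <= scd$end_date_ts
   if (any(inds))  
      y + scd$people_count[inds]
   else
      y   # no window matches, so nothing is added
}, fc_data$pred_t, fc_data$fpc)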

#[1] 20 15 10 12  3
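
The call above returns one number per row as long as each pred_t falls in at most one window. A minimal sketch of a variant, assuming windows may overlap and that all matching counts should then be summed (the question does not say so); the two overlapping windows below are made-up data for illustration only:

```r
# Hypothetical data: two windows that overlap between 11:00 and 12:00 (UTC assumed)
scd <- data.frame(
  start_date_ts = as.POSIXct(c("2019-04-01 10:00:00", "2019-04-01 11:00:00"), tz = "UTC"),
  end_date_ts   = as.POSIXct(c("2019-04-01 12:00:00", "2019-04-01 13:00:00"), tz = "UTC"),
  people_count  = c(10, 4)
)
fc_data <- data.frame(
  pred_t = as.POSIXct(c("2019-04-01 10:00:00", "2019-04-01 11:30:00",
                        "2019-04-01 14:00:00"), tz = "UTC"),
  fpc    = c(10, 5, 3)
)

# For each row, sum the counts of every window containing pred_t.
# sum() over an empty selection is 0, so no if/else is needed.
fc_data$events_pc <- vapply(seq_len(nrow(fc_data)), function(i) {
  inside <- fc_data$pred_t[i] >= scd$start_date_ts &
            fc_data$pred_t[i] <= scd$end_date_ts
  fc_data$fpc[i] + sum(scd$people_count[inside])
}, numeric(1))

print(fc_data$events_pc)   # 20 19 3
```

vapply with numeric(1) fails loudly if any row produces something other than a single number, which is easier to debug than mapply silently returning a list.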

Make sure the date-time variables are of class POSIXct; if they are not, convert them before running the code above.

fc_data$pred_t <- as.POSIXct(fc_data$pred_t)
scd[1:2] <- lapply(scd[1:2], as.POSIXct)
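
It can help to check the conversion before comparing. A small sketch, assuming the timestamps follow the "%Y-%m-%d %H:%M:%S" layout shown in the question and should be read as UTC (both are assumptions); the raw vector is illustrative:

```r
# Illustrative input: one clean value, one padded with spaces, one invalid
raw <- c("2019-04-01 10:00:00", " 2019-04-01 12:00:00 ", "not a timestamp")

# trimws() drops surrounding whitespace; with an explicit format,
# values that cannot be parsed become NA instead of raising an error
parsed <- as.POSIXct(trimws(raw), format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

print(inherits(parsed, "POSIXct"))   # TRUE
print(is.na(parsed))                 # FALSE FALSE  TRUE
```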

