
How to extract data from a time series based on start and end dates from a different dataframe?

I am working with water quality data and I have a list of storm events I extracted from the streamflow time series.

> head(Storms)
  PeakNumber            PeakTime PeakHeight       PeakStartTime         PeakEndTime DurationHours
1          1 2019-07-21 22:15:00   81.04667 2019-07-21 21:30:00 2019-07-22 04:45:00          7.25
2          2 2019-07-22 13:45:00   66.74048 2019-07-22 13:00:00 2019-07-22 23:45:00         10.75
3          3 2019-07-11 11:30:00   49.08663 2019-07-11 10:45:00 2019-07-11 19:00:00          8.25
4          4 2019-05-29 18:45:00   37.27926 2019-05-29 18:30:00 2019-05-29 20:45:00          2.25
5          5 2019-06-27 16:30:00   33.12268 2019-06-27 16:00:00 2019-06-27 17:15:00          1.25
6          6 2019-07-11 08:15:00   31.59931 2019-07-11 07:45:00 2019-07-11 09:00:00          1.25

I would like to use the PeakStartTime and PeakEndTime values to subset my other data, a 15-minute time series stored in xts or data.table format (I constantly switch between the two for various functions and plots).

> head(Nitrogen)
                       [,1]
2019-03-20 10:00:00 2.12306
2019-03-20 10:15:00 2.13538
2019-03-20 10:30:00 2.14180
2019-03-20 10:45:00 2.14704
2019-03-20 11:00:00 2.14464
2019-03-20 11:15:00 2.15548

I would like to create a new data frame for each storm that contains only the Nitrogen data between that storm's PeakStartTime and PeakEndTime, ideally in a loop that handles every peak in the Storms data frame.

One option is to paste each storm's PeakStartTime and PeakEndTime into a "start/end" range string, use that string to subset the xts object, and row-bind the resulting pieces:

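library(xts)
# For each storm, build a "start/end" range string and subset Nitrogen with it;
# do.call(rbind, ...) then combines the per-storm pieces into one xts object.
do.call(rbind, Map(function(x, y) Nitrogen[paste(x, y, sep = "/")],
                   Storms$PeakStartTime, Storms$PeakEndTime))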
#                       [,1]
#2019-05-29 18:30:00 -0.07102752
#2019-05-29 18:45:00 -0.19454811
#2019-05-29 19:00:00 -1.69684540
#2019-05-29 19:15:00  1.09384970
#2019-05-29 19:30:00  0.20019572
#2019-05-29 19:45:00 -0.76086259
# ...
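
The call above stacks every storm into a single xts object. If a separate object per storm is preferred, as the question asks, the same range subsetting can return a list instead. The following is only a minimal sketch on made-up toy data: `toy_nitrogen`, `toy_storms`, and the `storm_` names are illustrative assumptions, not the asker's objects. The timestamps are formatted explicitly so that a boundary falling exactly on midnight keeps its time part.

```r
library(xts)

# Hypothetical toy data: one day of 15-minute readings (values are just 1..96).
idx <- seq(as.POSIXct("2019-07-11 00:00:00", tz = "UTC"),
           by = "15 min", length.out = 96)
toy_nitrogen <- xts(seq_along(idx), order.by = idx)

# Hypothetical storm table with the same column names as in the question.
toy_storms <- data.frame(
  PeakNumber    = c(3, 6),
  PeakStartTime = as.POSIXct(c("2019-07-11 10:45:00", "2019-07-11 07:45:00"), tz = "UTC"),
  PeakEndTime   = as.POSIXct(c("2019-07-11 19:00:00", "2019-07-11 09:00:00"), tz = "UTC")
)

# Always write the full date and time into the range string.
fmt <- "%Y-%m-%d %H:%M:%S"

# One xts object per storm, kept in a list rather than row-bound together.
storm_list <- Map(
  function(start, end) toy_nitrogen[paste(format(start, fmt), format(end, fmt), sep = "/")],
  toy_storms$PeakStartTime, toy_storms$PeakEndTime
)
names(storm_list) <- paste0("storm_", toy_storms$PeakNumber)
```

Each storm is then available by name, for example `storm_list$storm_3`, and `lapply()` can run the same plot or summary over all of them.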

data

set.seed(24)
Nitrogen <- xts(rnorm(20000), order.by = seq(as.POSIXct('2019-03-20 10:00:00'),
       length.out = 20000, by = '15 min'))   


Storms <- structure(list(PeakNumber = 1:6, PeakTime = structure(c(1563761700, 
1563817500, 1562859000, 1559169900, 1561667400, 1562847300), class = c("POSIXct", 
"POSIXt"), tzone = ""), PeakHeight = c(81.04667, 66.74048, 49.08663, 
37.27926, 33.12268, 31.59931), PeakStartTime = structure(c(1563759000, 
1563814800, 1562856300, 1559169000, 1561665600, 1562845500), class = c("POSIXct", 
"POSIXt"), tzone = ""), PeakEndTime = structure(c(1563785100, 
1563853500, 1562886000, 1559177100, 1561670100, 1562850000), class = c("POSIXct", 
"POSIXt"), tzone = ""), DurationHours = c(7.25, 10.75, 8.25, 
2.25, 1.25, 1.25)), row.names = c("1", "2", "3", "4", "5", "6"
), class = "data.frame")
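
Because the question also mentions data.table, a non-equi join is another way to pull the rows that fall inside each storm window. This is a hedged sketch on toy data, not part of the original answer: `nitrogen_dt`, `storms_dt`, and the `time` and `value` column names are assumptions made for illustration.

```r
library(data.table)

# Hypothetical toy data: one day of 15-minute readings in long format.
idx <- seq(as.POSIXct("2019-07-11 00:00:00", tz = "UTC"),
           by = "15 min", length.out = 96)
nitrogen_dt <- data.table(time = idx, value = seq_along(idx))

# Hypothetical storm table with the same column names as in the question.
storms_dt <- data.table(
  PeakNumber    = c(3, 6),
  PeakStartTime = as.POSIXct(c("2019-07-11 10:45:00", "2019-07-11 07:45:00"), tz = "UTC"),
  PeakEndTime   = as.POSIXct(c("2019-07-11 19:00:00", "2019-07-11 09:00:00"), tz = "UTC")
)

# Non-equi join: for every storm, keep the readings whose time lies between
# PeakStartTime and PeakEndTime (both ends included). x.time is the original
# reading time from nitrogen_dt.
res <- nitrogen_dt[storms_dt,
                   on = .(time >= PeakStartTime, time <= PeakEndTime),
                   .(PeakNumber, time = x.time, value)]

# Split into one data.table per storm, named by PeakNumber.
per_storm <- split(res, by = "PeakNumber")
```

Here `per_storm[["3"]]` holds the readings for storm 3. Keeping `PeakNumber` as a column in `res` also allows grouped summaries with `by = PeakNumber` without splitting at all.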

