
Perform Operations on the same day using XTS in R

I have what is hopefully a straightforward question. I have an xts object somewhat similar to the following:

                        | MarketPrice  |
----------------------------------------
2007-05-04 10:15:33.546 |   5.32       |
----------------------------------------
2007-05-04 10:16:42.100 |   5.31       |
----------------------------------------
2007-05-04 10:17:27.546 |   NA         |
----------------------------------------
2007-05-04 10:20:50.871 |   5.35       |
----------------------------------------
2007-05-04 10:21:38.652 |   5.37       |

Basically, I would like to find the index of the MarketPrice immediately before a given time, while also omitting NA values. Say, for instance, we start at the time 2007-05-04 10:20:50.871, which has an index of 4 in the object. The row directly before it (index 3) is NA, so the MarketPrice immediately before this time is 5.31, which has an index of 2 in the object. To perform this task I have written a function similar to the following:

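MPFunction <- function(t, df) {
  ind <- t                      # remember the starting row
  while (t > 1) {
    t <- t - 1                  # step back one row at a time
    # accept the row if its timestamp differs from the start and its price is not NA
    if ((index(df[t]) != index(df[ind])) && !is.na(df[t, "MarketPrice"])) {
      return(t)
    }
  }
  # falls through (returns NULL) if no earlier valid row exists
}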

This performs the task: the first condition in the if statement checks that the timestamps in the index of the xts object are different, and the second condition checks that there is no NA value in the MarketPrice column.

However, I now run into an issue when I look at several days. Let's say I now have an xts object as follows:

                          | MarketPrice  |
  ----------------------------------------
  2007-05-03 16:59:58.921 |   5.32       |
  ----------------------------------------
  2007-05-04 10:12:27.546 |   NA         |
  ----------------------------------------
  2007-05-04 10:20:50.871 |   5.35       |
  ----------------------------------------

If I start at index 3 (i.e. at the time 2007-05-04 10:20:50.871) and look for the first index before this time that doesn't have an NA value in the MarketPrice column, the function goes to index 1, which is 2007-05-03 16:59:58.921. The problem is that this is on a different day, and I want to make sure that I only extract the index of MarketPrice values on the same day.
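To make the problem concrete, here is a minimal reproduction of the three-row example. The object name px and the UTC time zone are assumptions for illustration only; the function is the one shown earlier.

```r
library(xts)

# Function from the question (behaviour unchanged)
MPFunction <- function(t, df) {
  ind <- t
  while (t > 1) {
    t <- t - 1
    if ((index(df[t]) != index(df[ind])) && !is.na(df[t, "MarketPrice"])) {
      return(t)
    }
  }
}

# Hypothetical reconstruction of the three-row object (UTC assumed)
times <- as.POSIXct(c("2007-05-03 16:59:58.921",
                      "2007-05-04 10:12:27.546",
                      "2007-05-04 10:20:50.871"), tz = "UTC")
px <- xts(matrix(c(5.32, NA, 5.35), ncol = 1,
                 dimnames = list(NULL, "MarketPrice")),
          order.by = times, tzone = "UTC")

# Starting from row 3, the search skips the NA in row 2 and lands on row 1,
# which belongs to the previous day
MPFunction(3, px)   # 1
```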

Basically, I was wondering if there is a quick modification I can make to my MPFunction above in the IF statement which will allow me to avoid finding the index of the MarketPrice from the previous day. Also, I do not wish to split the xts object up by day, since it would complicate things quite a bit if I did.

Now, I already have several ideas on how to solve this (such as using the strptime function to check dates), but these are all time-consuming methods, so I was hoping to find one that is much faster. If anyone has any ideas I'd appreciate it. Thanks in advance.

Sounds like you actually want to use split.xts (why is using split a complication? It shouldn't be, even with large amounts of tick data in each day), and recombine the results:

zz <- xts(order.by = as.POSIXct(c("2007-05-03 09:59:58.921",
                                  "2007-05-03 10:03:58.921",
                                  "2007-05-03 12:03:58.921",
                                  "2007-05-04 10:15:33.546",
                                  "2007-05-04 10:16:42.100",
                                  "2007-05-04 10:17:27.546",
                                  "2007-05-04 10:20:50.871",
                                  "2007-05-04 10:21:38.652")),
          x = c(3, 4, 9, 5.32, 5.31, NA, 5.35, 5.37),
          dimnames = list(NULL, "MarketPrice"))

> zz
#                      MarketPrice
# 2007-05-03 09:59:58        3.00
# 2007-05-03 10:03:58        4.00
# 2007-05-03 12:03:58        9.00
# 2007-05-04 10:15:33        5.32
# 2007-05-04 10:16:42        5.31
# 2007-05-04 10:17:27          NA
# 2007-05-04 10:20:50        5.35
# 2007-05-04 10:21:38        5.37


MPFunction <- function(x, time_window = "T10/T10:16:40") {
  # Row positions (within x) of the observations that fall inside the window
  u <- x[time_window, which.i = TRUE]
  v <- NULL

  if (length(u) > 0) {
    # Keep only positions whose MarketPrice is not NA. Testing the rows x[u, ]
    # directly keeps the positions relative to x, even when the window does
    # not start at the first row of the day.
    u2 <- u[!is.na(coredata(x)[u, "MarketPrice"])]
    if (length(u2) > 0) {
      # One-row xts: timestamp of the last valid row, holding its position in x
      v <- xts(order.by = end(x[last(u2)]), x = last(u2), dimnames = list(NULL, "index.i"))
    }
  }
  v
}

# use T0/ as the start of the time window in each day for getting the index value by default. You can change this though.
chosen_window = "T0/T10:17:29"

by_day <- lapply(split(zz, f = "day"), FUN = MPFunction, time_window = chosen_window)

rr <- do.call(rbind, by_day)

> rr
#                     index.i
# 2007-05-03 10:03:58       2
# 2007-05-04 10:16:42       2

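If a day has no values in the time_window of interest, or only NA values, you will get NULL for that day, and nothing is returned in the output ( rr ) for that day. Note that index.i is the row position within each day's piece of the split, not within the whole zz object.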
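If the row position within the whole object is needed instead, one possible alternative that avoids split is to compare calendar days directly. The following is only a sketch under stated assumptions: the helper prev_same_day, its tz argument, the object px and the UTC time zone are all made up for illustration and are not part of the original posts.

```r
library(xts)

# Hypothetical helper: position in the full object of the last non-NA price
# strictly before row i that falls on the same calendar day as row i.
# tz decides what counts as "the same day" (UTC is an assumption here).
prev_same_day <- function(i, x, col = "MarketPrice", tz = "UTC") {
  idx  <- index(x)
  day  <- format(idx, "%Y-%m-%d", tz = tz)   # calendar day of every row
  vals <- coredata(x)[, col]                 # plain numeric vector of prices
  ok   <- which(idx < idx[i] & day == day[i] & !is.na(vals))
  if (length(ok) == 0L) NA_integer_ else max(ok)
}

# Hypothetical sample data based on the multi-day example (UTC assumed)
times <- as.POSIXct(c("2007-05-03 16:59:58.921",
                      "2007-05-04 10:12:27.546",
                      "2007-05-04 10:20:50.871",
                      "2007-05-04 10:21:38.652"), tz = "UTC")
px <- xts(matrix(c(5.32, NA, 5.35, 5.37), ncol = 1,
                 dimnames = list(NULL, "MarketPrice")),
          order.by = times, tzone = "UTC")

prev_same_day(3, px)   # NA: no earlier non-NA price on 2007-05-04
prev_same_day(4, px)   # 3
```

This version scans the whole index on every call, so for many lookups on large tick data it may be worth computing the day vector once outside the function.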
