I have what is hopefully a straightforward question. I have an xts object somewhat similar to the following:
                        | MarketPrice |
----------------------------------------
2007-05-04 10:15:33.546 |        5.32 |
2007-05-04 10:16:42.100 |        5.31 |
2007-05-04 10:17:27.546 |          NA |
2007-05-04 10:20:50.871 |        5.35 |
2007-05-04 10:21:38.652 |        5.37 |
Basically, I would like to find the index of the MarketPrice observation immediately before a given time, while omitting NA values. Say, for instance, we start at the time 2007-05-04 10:20:50.871, which has an index of 4 in the object. The MarketPrice immediately before this time is then 5.31, which has an index of 2 in the object. To perform this task I have written a function similar to the following:
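MPFunction <- function(t, df) {
  ind <- t  # remember the starting row
  # Walk backwards one row at a time
  while (t > 1) {
    t <- t - 1
    # Accept the row if its timestamp differs from the start and its price is not NA
    if ((index(df[t]) != index(df[ind])) && !is.na(df[t, "MarketPrice"])) {
      return(t)
    }
  }
}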
And this performs the task, since the first condition in the IF statement checks that the times in the index of the xts object are different, and the second condition checks that there is no NA value in the MarketPrice column.
However, I now run into an issue when I look at several days. Let's say I now have an xts object as follows:
                        | MarketPrice |
----------------------------------------
2007-05-03 16:59:58.921 |        5.32 |
2007-05-04 10:12:27.546 |          NA |
2007-05-04 10:20:50.871 |        5.35 |
If I start at index 3 (i.e. at the time 2007-05-04 10:20:50.871) and look for the first index before this time that doesn't have an NA value in the MarketPrice column, the function goes to index 1, which is 2007-05-03 16:59:58.921. The problem, however, is that this is on a different day, and I want to make sure that I only extract the index of MarketPrice values on the same day.
Basically, I was wondering if there is a quick modification I can make to the IF statement in my MPFunction above that will stop it from finding the index of a MarketPrice from the previous day. Also, I do not wish to split the xts object up by day, since that would complicate things quite a bit.
Now, I already have several ideas on how to solve this (such as using the strptime function to check dates etc.), but these are all time-consuming methods, so I was hoping to find a method which is much faster. If anyone has any ideas I'd appreciate it. Thanks in advance.
Sounds like you actually want to use split.xts (why is using split a complication? It shouldn't be, even with large amounts of tick data in each day), and recombine the results:
zz <- xts(order.by = as.POSIXct(c("2007-05-03 09:59:58.921",
                                  "2007-05-03 10:03:58.921",
                                  "2007-05-03 12:03:58.921",
                                  "2007-05-04 10:15:33.546",
                                  "2007-05-04 10:16:42.100",
                                  "2007-05-04 10:17:27.546",
                                  "2007-05-04 10:20:50.871",
                                  "2007-05-04 10:21:38.652")),
          x = c(3, 4, 9, 5.32, 5.31, NA, 5.35, 5.37), dimnames = list(NULL, "MarketPrice"))
> zz
#                     MarketPrice
# 2007-05-03 09:59:58        3.00
# 2007-05-03 10:03:58        4.00
# 2007-05-03 12:03:58        9.00
# 2007-05-04 10:15:33        5.32
# 2007-05-04 10:16:42        5.31
# 2007-05-04 10:17:27          NA
# 2007-05-04 10:20:50        5.35
# 2007-05-04 10:21:38        5.37
MPFunction <- function(x, time_window = "T10/T10:16:40") {
  # Row positions (within x) that fall inside the time-of-day window
  u <- x[time_window, which.i = TRUE]
  v <- NULL
  if (length(u) > 0) {
    # Keep only the positions whose MarketPrice is not NA. is.na() is evaluated
    # on x[u, ] so the result lines up with u even when the window does not
    # start at the first row of x.
    u2 <- u[!is.na(as.numeric(coredata(x[u, "MarketPrice"])))]
    if (length(u2) > 0) {
      # Last usable row: its timestamp becomes the index, its row position the value
      v <- xts(order.by = end(x[last(u2)]), x = last(u2), dimnames = list(NULL, "index.i"))
    }
  }
  v  # NULL when the window holds no usable row
}
# Use T0/ as the start of the time window in each day by default when getting
# the index value. You can change this though.
chosen_window <- "T0/T10:17:29"
by_day <- lapply(split(zz, f = "day"), FUN = MPFunction, time_window = chosen_window)
rr <- do.call(rbind, by_day)
> rr
# index.i
# 2007-05-03 10:03:58 2
# 2007-05-04 10:16:42 2
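If a day has no values (or only NA values) in the time_window of interest, you will get NULL for that day, and nothing is returned in the output (rr) for that day. Note that index.i is the row position within that day's piece of the split, not within the full object.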
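If splitting really is off the table, the backwards loop from the question can also be given a same-day check directly. What follows is only a sketch, not part of the approach above: prev_same_day is a hypothetical helper name, and the sample data and the UTC time zone are assumptions made for illustration. It formats the calendar day of every row once, up front, so the loop compares plain strings instead of calling strptime on every step, and it stops as soon as it leaves the starting day.

```r
library(xts)

# Hypothetical helper (not from the original post): walk backwards from row i
# and return the first earlier row on the same calendar day with a non-NA price.
prev_same_day <- function(x, i, col = "MarketPrice") {
  idx    <- index(x)
  days   <- format(idx, "%Y-%m-%d")   # calendar day of every row, computed once
  prices <- coredata(x)[, col]
  j <- i - 1
  # Rows are time-ordered, so the search can stop once the day changes
  while (j >= 1 && days[j] == days[i]) {
    if (idx[j] != idx[i] && !is.na(prices[j])) return(j)
    j <- j - 1
  }
  NA_integer_  # no usable earlier row on the same day
}

# Assumed sample data, modelled on the second table in the question
tt <- as.POSIXct(c("2007-05-03 16:59:58.921",
                   "2007-05-04 10:12:27.546",
                   "2007-05-04 10:20:50.871",
                   "2007-05-04 10:21:38.652"), tz = "UTC")
px <- xts(matrix(c(5.32, NA, 5.35, 5.37), ncol = 1,
                 dimnames = list(NULL, "MarketPrice")),
          order.by = tt, tzone = "UTC")

prev_same_day(px, 3)  # NA: row 2 is NA and row 1 belongs to the previous day
prev_same_day(px, 4)  # 3
```

Unlike MPFunction above, this returns a row position in the full object, and NA rather than NULL when nothing qualifies, so the caller has to test for NA.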