
R: using sapply for date objects

I'm working with three date-time vectors (class: "POSIXlt" "POSIXt"). The first two vectors (start and end) define the start and end points of some intervals, and the third vector (inc) contains incidents. I want to find out which incident happened in which interval. I reduced the size of my vectors to provide a working example; the real vectors are much longer.

start <- c("2007-09-16 18:40:27 GMT","2007-09-28 23:53:55 GMT", "2007-10-25 05:23:01 GMT")
end <- c("2007-09-19 18:40:27 GMT", "2007-10-01 23:53:55 GMT","2007-10-28 05:23:01 GMT")
inc <- c("2007-09-17 18:45:00 GMT", "2007-09-17 19:00:00 GMT", "2007-09-17 19:15:00 GMT", "2007-09-17 19:30:00 GMT")

Here is the simple code I use to find the matching intervals:

quel.eve <- sapply( inc, function(s)
              which(start <= s & end >=s) )

When I run which(start <= "2007-09-17 18:45:00 GMT" & end >= "2007-09-17 18:45:00 GMT") directly, it works properly and returns 1. The problem only arises when I use sapply, which gives some strange results:

$sec
integer(0)

$min
integer(0)

$hour
integer(0)

$mday
integer(0)

$mon
integer(0)

$year
integer(0)

$wday
integer(0)

$yday
integer(0)

$isdst
integer(0)

In this question I found out that since a 'POSIXlt' object is a list internally, 'sapply' iterates over its components (sec, min, hour, ...) rather than over the individual date-times. The elements of the vectors shown here were copied from my console, which is why they look like character strings. In my program they are definitely date-time objects (POSIXlt). Is there a way, apart from converting them to POSIXct, to do this? Your help would be appreciated.
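One possible way to keep the POSIXlt class is to loop over positions instead of over the object itself, so that each step sees a single date-time. The following is a minimal sketch, assuming that start[i] and end[i] form the i-th interval and that the question's strings are parsed with tz = "GMT" (both are assumptions, not confirmed by the post):

```r
# Sample data from the question, parsed as POSIXlt to mirror the asker's objects
start <- as.POSIXlt(c("2007-09-16 18:40:27", "2007-09-28 23:53:55",
                      "2007-10-25 05:23:01"), tz = "GMT")
end   <- as.POSIXlt(c("2007-09-19 18:40:27", "2007-10-01 23:53:55",
                      "2007-10-28 05:23:01"), tz = "GMT")
inc   <- as.POSIXlt(c("2007-09-17 18:45:00", "2007-09-17 19:00:00",
                      "2007-09-17 19:15:00", "2007-09-17 19:30:00"), tz = "GMT")

# Iterate over indices: inc[i] is a length-one POSIXlt, not a list component
quel.eve <- sapply(seq_len(length(inc)), function(i) {
  which(start <= inc[i] & end >= inc[i])
})

quel.eve
```

With this sample data every incident falls in the first interval. Note that sapply only simplifies to a vector when each incident matches exactly one interval; otherwise it returns a list.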

The lubridate package can help with this. All three vectors need to be converted into date-time objects; otherwise the comparison treats them as strings (i.e. "b" > "a") rather than as points in time. Below is a solution, although I'm not sure how your start and end vectors are meant to be used. In your example, every inc value is greater than the minimum start and less than the maximum end, so it's not clear whether the values are meant to be pairs. The code below assumes a single window from min(start) to max(end).

Meanwhile, which() returns an empty integer vector (integer(0)), as you're seeing, when no values match. This may also be related to how the start/end vectors interact: if an inc value is later than only the first start, start <= s gives TRUE FALSE FALSE, and if it is earlier than only the last end, end >= s gives FALSE FALSE TRUE. The elementwise AND of TRUE FALSE FALSE and FALSE FALSE TRUE has no TRUE element, so which() returns an empty result.
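The behaviour described above can be reproduced with plain logical vectors. This is only an illustration; the names after_start and before_end are made up for the example:

```r
# Hypothetical comparison results for one incident against three intervals
after_start <- c(TRUE, FALSE, FALSE)   # start <= s
before_end  <- c(FALSE, FALSE, TRUE)   # end >= s

# & works position by position, so no position is TRUE in both vectors
both <- after_start & before_end

# which() on an all-FALSE vector yields integer(0)
hits <- which(both)
hits
```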

library(lubridate)
start <- c("2007-09-16 18:40:27 GMT","2007-09-28 23:53:55 GMT", "2007-10-25 05:23:01 GMT")
end <- c("2007-09-19 18:40:27 GMT", "2007-10-01 23:53:55 GMT","2007-10-28 05:23:01 GMT")
inc <- c("2007-09-17 18:45:00 GMT", "2007-09-17 19:00:00 GMT", "2007-09-17 19:15:00 GMT", "2007-09-17 19:30:00 GMT")

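# Convert the strings to POSIXct date-times
inc <- as_datetime(inc)
# Collapse the intervals into one window: earliest start to latest end
# (as_datetime, not as_date, so that the time of day is kept)
start <- min(as_datetime(start))
end <- max(as_datetime(end))

# Keep the incidents that fall inside the window
inc[which(inc >= start & inc <= end)]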

> inc[which(inc >= start & inc <= end)]
[1] "2007-09-17 18:45:00 UTC" "2007-09-17 19:00:00 UTC" "2007-09-17 19:15:00 UTC" "2007-09-17 19:30:00 UTC"
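If start and end are instead meant to be pairs, and the real vectors are long, a vectorised lookup may be worth considering. The sketch below uses base R's findInterval and assumes the intervals are sorted by start and do not overlap; the last two incidents are hypothetical values added only to show the non-matching and second-interval cases:

```r
start <- as.POSIXct(c("2007-09-16 18:40:27", "2007-09-28 23:53:55",
                      "2007-10-25 05:23:01"), tz = "GMT")
end   <- as.POSIXct(c("2007-09-19 18:40:27", "2007-10-01 23:53:55",
                      "2007-10-28 05:23:01"), tz = "GMT")
inc   <- as.POSIXct(c("2007-09-17 18:45:00", "2007-09-17 19:00:00",
                      "2007-09-17 19:15:00", "2007-09-17 19:30:00",
                      "2007-09-25 12:00:00",   # hypothetical: between intervals
                      "2007-09-29 00:00:00"),  # hypothetical: second interval
                    tz = "GMT")

# Index of the last interval whose start is <= the incident (0 if none)
idx <- findInterval(as.numeric(inc), as.numeric(start))
idx[idx == 0L] <- NA

# Discard candidates where the incident falls after that interval's end
idx[!is.na(idx) & inc > end[idx]] <- NA

idx
```

Each element of idx is the position of the interval containing that incident, or NA when the incident lies outside every interval.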
