
How to merge the time data in 24 hour format in R?

I have a data frame like the following:

v2        v3
10:37:38  adakjl
10:38:02  sdjfisaofj
11:11:57  asdhad
12:42:02  asjla

I'd like to derive another data frame that merges the rows whose time values fall in the same hour and counts the number of entries in each hour, like this:

v2                  v3
10:00:00-11:00:00   2
11:00:00-12:00:00   1
12:00:00-13:00:00   1
....

How can I do this? I have searched the zoo documentation but only found methods for aggregating data by year or by quarter.

Thanks in advance.

You could do

df <- read.table(header = TRUE, text = "v2        v3
10:37:38  adakjl
10:38:02  sdjfisaofj
11:11:57  asdhad
12:42:02  asjla")
tab <- as.data.frame(table(strptime(df$v2, "%H:%M:%S")$hour), stringsAsFactors = FALSE)
tab[, 1] <- sprintf("%02d:00:00-%02d:00:00", as.integer(tab[, 1]), as.integer(tab[, 1])+1)
tab
#                Var1 Freq
# 1 10:00:00-11:00:00    2
# 2 11:00:00-12:00:00    1
# 3 12:00:00-13:00:00    1
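
To see what each step contributes, the intermediate values can be inspected on their own. This is only an illustrative sketch: the names times, hours, counts, hrs and labels are made up here, and tz = "UTC" is an addition that just keeps the parse independent of the local time zone.

```r
times <- c("10:37:38", "10:38:02", "11:11:57", "12:42:02")

# strptime() returns a POSIXlt object; its $hour component is the hour of day
hours <- strptime(times, "%H:%M:%S", tz = "UTC")$hour

# table() counts how many entries fall in each distinct hour
counts <- table(hours)

# the table names are the hours as strings; convert back to integers
hrs <- as.integer(names(counts))

# "%02d" zero-pads each hour to two digits when building the interval label
labels <- sprintf("%02d:00:00-%02d:00:00", hrs, hrs + 1L)

data.frame(v2 = labels, v3 = as.integer(counts))
```

The last line simply puts the labels and counts side by side, using the column names from the question.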

This is quite straightforward using dplyr.

## sample data
dat <- data.frame(time = c("10:37:38", "10:38:02", "11:11:57", "12:42:02"), 
                  value = c("adakjl", "sdjfisaofj", "asdhad", "asjla"))

## count hourly observations
library(dplyr)

dat %>%
  mutate(time = substr(time, 1, 2)) %>%
  count(time) %>%
  mutate(time = as.integer(time), 
         ## sprintf("%02d") keeps the leading zero for hours before 10
         time = sprintf("%02d:00:00-%02d:00:00", time, time + 1L))

And here is the console output.

Source: local data frame [3 x 2]

               time     n
              (chr) (int)
1 10:00:00-11:00:00     2
2 11:00:00-12:00:00     1
3 12:00:00-13:00:00     1
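
One detail worth checking when building the labels is zero padding for hours before 10, which the sample data does not exercise. A minimal base R sketch (the variable names are only illustrative) compares paste0() and sprintf() on an integer hour:

```r
hour <- 9L

# paste0() converts the integer as-is, so the leading zero is lost
unpadded <- paste0(hour, ":00:00-", hour + 1L, ":00:00")

# sprintf() with "%02d" pads each hour to two digits
padded <- sprintf("%02d:00:00-%02d:00:00", hour, hour + 1L)

unpadded  # "9:00:00-10:00:00"
padded    # "09:00:00-10:00:00"
```

If the labels are later sorted as strings, the padded form keeps them in chronological order.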

This solution uses the zoo package.

1) Create a function, toInterval, which, given a time, produces the corresponding hourly interval as a string. Then have read.zoo read the data, using that function to convert v2 and aggregate = length to count the entries in each interval. Omit the fortify.zoo statement if you prefer to keep the result as a zoo object.

library(zoo)

toInterval <- function(x) {
  hr <- as.POSIXct(x, format = "%H:%M:%S")
  h00 <- "%H:00:00"
  paste(format(hr, h00), format(hr + 3600, h00), sep = "-")
}
z <- read.zoo(DF, header = TRUE, FUN = toInterval, aggregate = length)
fortify.zoo(z)

giving:

              Index z
1 10:00:00-11:00:00 2
2 11:00:00-12:00:00 1
3 12:00:00-13:00:00 1

2) Here is a variation that might be preferable if you want to manipulate the result later. It uses the "times" class from chron and labels each hour by its ending time (omit the + 1/24 to label it by its starting time instead):

library(chron)
toHour <- function(x) trunc(times(x), "hour") + 1/24
z2 <- read.zoo(DF, header = TRUE, FUN = toHour, aggregate = length)
fortify.zoo(z2)

giving:

     Index z2
1 11:00:00  2
2 12:00:00  1
3 13:00:00  1

Note: We used this data.frame as input:

Lines <- "v2        v3
10:37:38  adakjl
10:38:02  sdjfisaofj
11:11:57  asdhad
12:42:02  asjla"
DF <- read.table(text = Lines, header = TRUE)
