I have a data frame as follows:
v2 v3
10:37:38 adakjl
10:38:02 sdjfisaofj
11:11:57 asdhad
12:42:02 asjla
I'd like to derive another data frame that merges the rows whose time values fall in the same hour and counts the number of entries, like this:
v2 v3
10:00:00-11:00:00 2
11:00:00-12:00:00 1
12:00:00-13:00:00 1
....
How can I do this? I have searched the zoo documentation but only found methods for merging data by year or by quarter.
Thanks in advance.
You could do
df <- read.table(header = TRUE, text = "v2 v3
10:37:38 adakjl
10:38:02 sdjfisaofj
11:11:57 asdhad
12:42:02 asjla")
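# Count the entries per hour of day
tab <- as.data.frame(table(strptime(df$v2, "%H:%M:%S")$hour), stringsAsFactors = FALSE)
# Turn each hour into an "HH:00:00-HH:00:00" interval label
tab[, 1] <- sprintf("%02d:00:00-%02d:00:00", as.integer(tab[, 1]), as.integer(tab[, 1]) + 1)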
tab
# Var1 Freq
# 1 10:00:00-11:00:00 2
# 2 11:00:00-12:00:00 1
# 3 12:00:00-13:00:00 1
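To make the key step above easier to follow, here is a minimal sketch of the hour extraction and counting on their own, using the times from the question. The variable names `v2`, `hours`, and `counts` are only illustrative.

```r
# Sample times from the question
v2 <- c("10:37:38", "10:38:02", "11:11:57", "12:42:02")

# strptime() returns a POSIXlt object; its $hour component is the hour of day
hours <- strptime(v2, "%H:%M:%S")$hour

# table() counts how many entries fall in each distinct hour
counts <- table(hours)
counts
```

The names of `counts` are the hours as character strings, which is why the answer converts them back with `as.integer()` before building the interval labels.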
This is quite straightforward using dplyr.
## sample data
dat <- data.frame(time = c("10:37:38", "10:38:02", "11:11:57", "12:42:02"),
                  value = c("adakjl", "sdjfisaofj", "asdhad", "asjla"))
## count hourly observations
library(dplyr)
dat %>%
  mutate(time = substr(time, 1, 2)) %>%
  count(time) %>%
  mutate(time = as.integer(time),
         time = paste0(time, ":00:00-", time + 1, ":00:00"))
And here is the console output.
Source: local data frame [3 x 2]
time n
(chr) (int)
1 10:00:00-11:00:00 2
2 11:00:00-12:00:00 1
3 12:00:00-13:00:00 1
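One caveat, which the sample data does not exercise: because the hour is converted to an integer before pasting, a time before 10:00 would produce a label such as "9:00:00-10:00:00" without the leading zero. A minimal base R sketch of a zero-padded label follows; `hour_label` is a hypothetical helper, not part of the answer above.

```r
# Hypothetical helper: build a zero-padded interval label from the first
# two characters of an "HH:MM:SS" string
hour_label <- function(time) {
  h <- as.integer(substr(time, 1, 2))
  sprintf("%02d:00:00-%02d:00:00", h, h + 1L)
}

hour_label(c("09:15:00", "10:37:38"))
# [1] "09:00:00-10:00:00" "10:00:00-11:00:00"
```

Presumably the same `sprintf()` call could replace `paste0()` inside the final `mutate()` step.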
This solution uses the zoo package.
1) Create a function, toInterval, which given a time produces the corresponding time interval. Then have zoo read the data in, using that function to convert v2 and using aggregate = length to perform the count. Omit the fortify.zoo statement if you prefer to leave the result as a zoo object. (DF is defined in the Note at the end.)
library(zoo)
toInterval <- function(x) {
  hr <- as.POSIXct(x, format = "%H:%M:%S")
  h00 <- "%H:00:00"
  paste(format(hr, h00), format(hr + 3600, h00), sep = "-")
}
z <- read.zoo(DF, header = TRUE, FUN = toInterval, aggregate = length)
fortify.zoo(z)
giving:
Index z
1 10:00:00-11:00:00 2
2 11:00:00-12:00:00 1
3 12:00:00-13:00:00 1
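To see what the conversion function does by itself, here is a small sketch that applies it to the sample times without zoo. The function is repeated so the snippet runs on its own; the `labels` variable is only illustrative.

```r
# Same definition as in the answer, repeated so this snippet is self-contained
toInterval <- function(x) {
  hr <- as.POSIXct(x, format = "%H:%M:%S")
  h00 <- "%H:00:00"
  paste(format(hr, h00), format(hr + 3600, h00), sep = "-")
}

# Each time maps to the label of the hour that contains it
labels <- toInterval(c("10:37:38", "10:38:02", "11:11:57", "12:42:02"))
labels

# Counting equal labels gives the same per-hour counts as the zoo result
table(labels)
```

Times that share a label end up with the same index value, so read.zoo collapses them with the aggregate function, here length.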
2) Here is a variation that might be preferable if you want to manipulate the result later. It makes use of the "times" class in chron, like this (or omit the + 1/24 to use the starting rather than the ending time of each hour):
library(chron)
toHour <- function(x) trunc(times(x), "hour") + 1/24
z2 <- read.zoo(DF, header = TRUE, FUN = toHour, aggregate = length)
fortify.zoo(z2)
giving:
Index z2
1 11:00:00 2
2 12:00:00 1
3 13:00:00 1
Note: We used this data.frame as input:
Lines <- "v2 v3
10:37:38 adakjl
10:38:02 sdjfisaofj
11:11:57 asdhad
12:42:02 asjla"
DF <- read.table(text = Lines, header = TRUE)