Creating a new column using summary data in R

Data:

data <- structure(list(datetime = structure(c(6L, 2L, 4L, 5L, 1L, 3L), .Names = c("V1", 
"V2", "V3", "V4", "V5", "V6"), .Label = c(" 2016-12-01 00:00:30", 
" 2016-12-01 00:02:17", " 2016-12-01 00:06:17", " 2016-12-01 00:28:10", 
" 2016-12-01 01:17:02", "2016-12-01 00:00:00"), class = "factor")), .Names = "datetime", row.names = c("V1", 
"V2", "V3", "V4", "V5", "V6"), class = "data.frame")

Code

library(lubridate)
library(dplyr)

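# Parse the factor column into POSIXct date-times
data$datetime <- ymd_hms(as.character(data$datetime))
data <- dplyr::arrange(data, datetime)
# Bin each timestamp into the hour it falls in (returns a factor)
data$hour <- cut(data$datetime, breaks = "hour")
# Count the records in each hourly bin
data %>% group_by(hour) %>% summarize(count = n())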

Output:

# A tibble: 2 x 2
  hour                count
1 2016-12-01 00:00:00     5
2 2016-12-01 01:00:00     1

Output in the original dataset:

DateTime              Hour
2016-12-01 00:00:00   00
2016-12-01 00:00:01   00

Desired output

    DateTime     Hour   Count
               <fctr> <int>
1 2016-12-01   00:00:00     5
2 2016-12-01   01:00:00     1

I want to count the number of records in each hour and put those counts into a new column called Count, with the date and the hour shown in separate columns as above. How can I do this?

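One option is to add separate() from tidyr to the end of the %>% pipeline, so that the hour column is split at the space into a DateTime column and an Hour column: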

library(tidyr)
res <- data %>%
         group_by(hour) %>%
         summarize(count = n()) %>%
         separate(hour, into = c('DateTime', 'Hour'), sep=' ')
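
To see what the separate() step does in isolation, here is a minimal sketch on a small hand-built table. The name summary_tbl and its contents are hypothetical, made up only to mirror the shape of the summary above; the hour column is stored as plain strings here rather than as a factor.

```r
library(tidyr)

# Hypothetical summary table: one row per hourly bin, bin label stored as text
summary_tbl <- data.frame(
  hour  = c("2016-12-01 00:00:00", "2016-12-01 01:00:00"),
  count = c(5L, 1L),
  stringsAsFactors = FALSE
)

# Split "hour" at the single space: the date part goes to DateTime,
# the time part goes to Hour. The original "hour" column is dropped by default.
res <- separate(summary_tbl, hour, into = c("DateTime", "Hour"), sep = " ")
print(res)
```

Both new columns come back as character vectors, so convert them if date or time arithmetic is needed later.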

The group_by()/summarize() pair can be replaced with count(). Note that count() names the new column n by default rather than count:

res <- count(data, hour) %>%
          separate(hour, into = c('DateTime', 'Hour'), sep=' ')
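
If the goal is to match the desired output exactly, including the Count column name, the following is one possible end-to-end sketch rather than the method from the answer above. It swaps in lubridate's floor_date() for cut() and builds the two text columns with format() instead of separate(). It also assumes the timestamps are available as plain strings (for the factor in the question, wrap the column in as.character() first) and that the installed dplyr supports the name argument of count().

```r
library(dplyr)
library(lubridate)

# The six timestamps from the question, as strings (assumption: already
# converted from factor; leading spaces removed for readability)
data <- data.frame(
  datetime = c("2016-12-01 00:00:00", "2016-12-01 00:02:17",
               "2016-12-01 00:28:10", "2016-12-01 01:17:02",
               "2016-12-01 00:00:30", "2016-12-01 00:06:17"),
  stringsAsFactors = FALSE
)

res <- data %>%
  # Round each parsed timestamp down to the start of its hour
  mutate(hour = floor_date(ymd_hms(datetime), unit = "hour")) %>%
  # One row per hour; name the count column "Count" instead of the default "n"
  count(hour, name = "Count") %>%
  # Build the date and time-of-day columns explicitly as text
  mutate(DateTime = format(hour, "%Y-%m-%d"),
         Hour     = format(hour, "%H:%M:%S")) %>%
  select(DateTime, Hour, Count)

print(res)
```

Formatting the two parts explicitly avoids depending on how the hourly bin labels happen to be printed, at the cost of one extra mutate() step.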
