How do I split my time data into intervals in R?
I have some data that looks like this:
time author text day times timeblock dayblock
2019-08-02 12:16:40|"ab5c9c0a"|"This message was deleted" |2| "12:16:40"| "Cycle 1"| "No"
2019-08-02 12:36:40|"ab5c9c0a"|"Please take a survey" |2| "12:36:40"| "Cycle 1"| "No"
2019-08-02 13:29:40|"43cd8b94"|"Done :D" |2| "13:29:40"| "Cycle 1"| "No"
2019-08-02 17:41:40|"083fa508"|"<Media omitted>" |2| "17:41:40"| "Cycle 1"| "No"
str(chat)
Classes ‘data.table’ and 'data.frame': 16111 obs. of 7 variables:
$ time : POSIXct, format: "2019-08-02 12:16:40" "2019-08-02 12:35:40" "2019-08-02 12:36:40" ...
$ author : chr "ab5c9c0a" "ab5c9c0a" "ab5c9c0a" "43cd8b94" ...
$ text : chr "This message was deleted" "https://docs.google.com/forms/d/e/1FAIpQLSf4hE" "Please take a survey" "Done :D" ...
$ day : int 2 2 2 2 2 3 3 3 3 3 ...
$ times : chr "12:16:40" "12:35:40" "12:36:40" "13:29:40" ...
$ timeblock: Factor w/ 13 levels "Cycle 1","Cycle 2",..:
I wrote this to categorize the times into hourly blocks such as 7 AM, 10 PM, and so on:
chat <- chat %>%
mutate(
# Time Segements
dayblock = case_when(
time >= hms(070000) & time <= hms(080000) ~ "7 AM",
time >= hms(080000) & time <= hms(090000) ~ "8 AM",
time >= hms(090000) & time <= hms(100000) ~ "9 AM",
time >= hms(100000) & time <= hms(110000) ~ "10 AM",
time >= hms(110000) & time <= hms(120000) ~ "11 AM",
time >= hms(120000) & time <= hms(130000) ~ "12 PM",
time >= hms(130000) & time <= hms(140000) ~ "1 PM",
time >= hms(140000) & time <= hms(150000) ~ "2 PM",
time >= hms(150000) & time <= hms(160000) ~ "3 PM",
time >= hms(160000) & time <= hms(170000) ~ "4 PM",
time >= hms(170000) & time <= hms(180000) ~ "5 PM",
time >= hms(180000) & time <= hms(190000) ~ "6 PM",
time >= hms(190000) & time <= hms(200000) ~ "7 PM",
time >= hms(200000) & time <= hms(210000) ~ "8 PM",
time >= hms(210000) & time <= hms(220000) ~ "9 PM",
time >= hms(220000) & time <= hms(230000) ~ "10 PM",
time >= hms(230000) & time <= hms(000000) ~ "11 PM",
time >= hms(000000) & time <= hms(010000) ~ "12 AM",
time >= hms(010000) & time <= hms(020000) ~ "1 AM",
time >= hms(020000) & time <= hms(030000) ~ "2 AM",
time >= hms(030000) & time <= hms(040000) ~ "3 AM",
time >= hms(040000) & time <= hms(050000) ~ "4 AM",
time >= hms(050000) & time <= hms(060000) ~ "5 AM",
time >= hms(060000) & time <= hms(070000) ~ "6 AM",
T ~ "No")) %>%
mutate(dayblock = factor(dayblock))
The expected output is:
time author text day times timeblock dayblock
2019-08-02 12:16:40|"ab5c9c0a"|"This message was deleted" |2| "12:16:40"| "Cycle 1"| 12 PM
2019-08-02 12:36:40|"ab5c9c0a"|"Please take a survey" |2| "12:36:40"| "Cycle 1"| 12 PM
2019-08-02 13:29:40|"43cd8b94"|"Done :D" |2| "13:29:40"| "Cycle 1"| 1 PM
2019-08-02 17:41:40|"083fa508"|"<Media omitted>" |2| "17:41:40"| "Cycle 1"| 5 PM
But when I run it, every row is just filled with the value No. What am I doing wrong?
The current error is:
Problem with `mutate()` input `dayblock`.
i Some strings failed to parse, or all strings are NAs
i Input `dayblock` is `case_when(...)`.
Some strings failed to parse, or all strings are NAs
Problem with `mutate()` input `dayblock`.
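Edit: Although the accepted answer solves the problem, @Istrel's answer is a more elegant solution, and I recommend that users try it.
It seems you can achieve the same effect with the function format.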
library(tidyverse)
library(lubridate)
chat <- tibble(time = ymd_hms(c("2019-08-02 12:16:40", "2019-08-02 12:36:40",
"2019-08-02 13:29:40", "2019-08-02 3:29:40",
"2019-08-02 2:01:40")))
chat <- chat %>%
mutate(dayblock = format(time, "%I %p"))
# time dayblock
# <dttm> <chr>
# 1 2019-08-02 12:16:40 12 PM
# 2 2019-08-02 12:36:40 12 PM
# 3 2019-08-02 13:29:40 01 PM
# 4 2019-08-02 03:29:40 03 AM
# 5 2019-08-02 02:01:40 02 AM
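Note that %I pads the hour with a leading zero, so the labels come out as 01 PM rather than the 1 PM shown in the question's expected output. Below is a minimal sketch of one way to strip the padding, using only base R. The variable names are illustrative, and the Sys.setlocale() call is an assumption added here because %p is locale-dependent and may not produce AM/PM everywhere.

```r
# Assumption: force the "C" locale so that %p yields AM/PM.
invisible(Sys.setlocale("LC_TIME", "C"))

# Illustrative timestamps taken from the example above.
time <- as.POSIXct(c("2019-08-02 12:16:40", "2019-08-02 13:29:40",
                     "2019-08-02 03:29:40"), tz = "UTC")

# Format as zero-padded 12-hour labels, then drop a single leading zero.
dayblock <- sub("^0", "", format(time, "%I %p"))
dayblock
```

With these three inputs the labels should be 12 PM, 1 PM and 3 AM. The same sub() call can be wrapped around format() inside mutate().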
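One option is to change the format of the values passed to hms (quoted 'HH:MM:SS' strings instead of bare numbers) and to compare them against the times column after converting it with hms: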
library(dplyr)
library(lubridate)
chat %>%
mutate(times = hms(times),
dayblock = factor(case_when(
times >= hms('07:00:00') & times <= hms('08:00:00') ~ "7 AM",
times >= hms('08:00:00') & times <= hms('09:00:00') ~ "8 AM",
times >= hms('12:00:00') & times <= hms('13:00:00') ~ "12 PM",
TRUE ~ "No"))
)
Output:
# time author text day times timeblock dayblock
#1 2019-08-02 12:16:40 ab5c9c0a This message was deleted 2 12H 16M 40S Cycle 1 12 PM
#2 2019-08-02 12:36:40 ab5c9c0a Please take a survey 2 12H 36M 40S Cycle 1 12 PM
#3 2019-08-02 13:29:40 43cd8b94 Done :D 2 13H 29M 40S Cycle 1 No
#4 2019-08-02 17:41:40 083fa508 <Media omitted> 2 17H 41M 40S Cycle 1 No
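The output above only labels hours that have a case_when() branch, so 13:29:40 and 17:41:40 still fall through to "No" until the remaining branches are written out. As an alternative to listing all 24 branches, the label can be computed from the hour itself. This is a rough sketch in base R and not part of the original answer: hour_label() is a hypothetical helper, and it assumes times holds HH:MM:SS strings as in the question.

```r
# Hypothetical helper: turn "HH:MM:SS" strings into labels such as "12 PM".
hour_label <- function(times) {
  h <- as.integer(substr(times, 1, 2))        # hour on the 24-hour clock, 0-23
  h12 <- ifelse(h %% 12 == 0, 12, h %% 12)    # 0 and 12 are both shown as 12
  paste(h12, ifelse(h < 12, "AM", "PM"))
}

times <- c("12:16:40", "12:36:40", "13:29:40", "17:41:40")
hour_label(times)
```

For the four sample rows this should give 12 PM, 12 PM, 1 PM and 5 PM. One difference from the case_when() version: a time of exactly 08:00:00 lands in 8 AM here, whereas the overlapping <= conditions would assign it to the first matching branch, 7 AM.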
Data used:
chat <- structure(list(time = c("2019-08-02 12:16:40", "2019-08-02 12:36:40",
"2019-08-02 13:29:40", "2019-08-02 17:41:40"), author = c("ab5c9c0a",
"ab5c9c0a", "43cd8b94", "083fa508"), text = c("This message was deleted",
"Please take a survey", "Done :D", "<Media omitted>"), day = c(2L,
2L, 2L, 2L), times = c("12:16:40", "12:36:40", "13:29:40", "17:41:40"
), timeblock = c("Cycle 1", "Cycle 1", "Cycle 1", "Cycle 1"),
dayblock = c("No", "No", "No", "No")), class = "data.frame",
row.names = c(NA,
-4L))
We can also use strftime from base R:
chat <- tibble(time = c("2019-08-02 12:16:40", "2019-08-02 12:36:40",
"2019-08-02 13:29:40", "2019-08-02 17:41:40"))
chat$dayblock <- strftime(chat$time, "%I %p")
time dayblock
<chr> <chr>
1 2019-08-02 12:16:40 12 PM
2 2019-08-02 12:36:40 12 PM
3 2019-08-02 13:29:40 01 PM
4 2019-08-02 17:41:40 05 PM
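If dayblock is then converted with factor(), as the question does, the levels are sorted alphabetically by default, so 01 PM sorts ahead of 12 PM. Below is a small sketch of fixing the levels to clock order in base R. The block_levels vector, the explicit UTC time zone and the C locale are assumptions made for illustration, not something the answers above specify.

```r
# Assumption: force the "C" locale so that %p yields AM/PM.
invisible(Sys.setlocale("LC_TIME", "C"))

# Build all 24 labels in clock order from one arbitrary day.
hours <- seq(as.POSIXct("2019-08-02 00:00:00", tz = "UTC"),
             by = "hour", length.out = 24)
block_levels <- format(hours, "%I %p")   # "12 AM", "01 AM", ..., "11 PM"

time <- c("2019-08-02 12:16:40", "2019-08-02 13:29:40", "2019-08-02 17:41:40")

# Label each timestamp, then pin the factor levels to clock order.
dayblock <- factor(strftime(as.POSIXct(time, tz = "UTC"), "%I %p", tz = "UTC"),
                   levels = block_levels)
levels(dayblock)
```

Plots and summaries that respect factor order will then list the blocks from 12 AM through 11 PM instead of alphabetically.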