Extracting the date, hour and weekday from a UTC timestamp in R

I have a database with a column that stores the creation time in UTC, for example:

created_utc
1 1430438400
2 1430438410
3 1430438430
4 1430438455
5 1430438470
6 1430438480

I want to extract the date, the hour and an is.weekend flag into separate columns. I tried:

db_subset %>% mutate(hour = as.POSIXlt(created_utc, origin ='1970-01-01')$hour)

but it does not recognise the created_utc object. I then tried coercing the result to a data frame first:

df_comments <- db_subset %>%
  select(created_utc) %>%
  collect() %>%
  data.frame() %>%
  mutate(hour = as.POSIXlt(created_utc, origin = '1970-01-01')$hour)

but that fails with the error: invalid subscript type 'closure'

Can someone tell me where I am going wrong, and how to extract the hour, date and so on?

If we are using dplyr, one option is to convert to POSIXct (because the POSIXlt class is not supported) and use lubridate to extract the hour:

library(lubridate)
library(dplyr)
db_subset %>%
    mutate(hour=hour(as.POSIXct(created_utc, origin='1970-01-01')))
#   created_utc hour
#1  1430438400   20
#2  1430438410   20
#3  1430438430   20
#4  1430438455   20
#5  1430438470   20
#6  1430438480   20

Data

db_subset <- structure(list(created_utc = c(1430438400L, 1430438410L, 
1430438430L, 
1430438455L, 1430438470L, 1430438480L)), .Names = "created_utc", 
class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6"))
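The hour printed in the answer above (20) reflects the time zone of the R session that produced it, because as.POSIXct() is called without a tz argument. If the hour should be reported in UTC instead, a minimal base-R sketch could look like the following; the names ts_utc and hour_utc are made up for illustration and are not part of the original answer.

```r
# Sample epoch seconds from the question
created_utc <- c(1430438400, 1430438410, 1430438430,
                 1430438455, 1430438470, 1430438480)

# Interpret the numbers as seconds since 1970-01-01 and display them in UTC
ts_utc <- as.POSIXct(created_utc, origin = "1970-01-01", tz = "UTC")

# format() respects the tz attribute; "%H" is the hour on a 24-hour clock
hour_utc <- as.integer(format(ts_utc, "%H"))
hour_utc
```

Since 1430438400 corresponds to 2015-05-01 00:00:00 UTC, every element of hour_utc is 0 for this sample. The same tz = "UTC" argument can be passed to the as.POSIXct() call inside mutate().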

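I would first suggest converting your created_utc column to the POSIXct class (rather than POSIXlt) and then extracting everything you need from it. Here is a simple illustration using the data.table package: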

library(data.table)
setDT(df)[, created_utc := as.POSIXct(created_utc, origin = '1970-01-01')]
df[, `:=`(Date = as.Date(created_utc),
          Hour = hour(created_utc),
          isWeekend = wday(created_utc) %in% c(7L, 1L))]
df
#            created_utc       Date Hour isWeekend
# 1: 2015-05-01 03:00:00 2015-05-01    3     FALSE
# 2: 2015-05-01 03:00:10 2015-05-01    3     FALSE
# 3: 2015-05-01 03:00:30 2015-05-01    3     FALSE
# 4: 2015-05-01 03:00:55 2015-05-01    3     FALSE
# 5: 2015-05-01 03:01:10 2015-05-01    3     FALSE
# 6: 2015-05-01 03:01:20 2015-05-01    3     FALSE
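The weekend test above uses c(7L, 1L) because data.table's wday() numbers the days from 1 (Sunday) to 7 (Saturday). Base R's POSIXlt uses a different convention, 0 (Sunday) to 6 (Saturday). The sketch below illustrates the base-R numbering on four consecutive days; the vector of timestamps is invented for the example by adding whole days to the first value in the question, and UTC is assumed as the display time zone.

```r
# Friday, Saturday, Sunday and Monday at 00:00:00 UTC, as epoch seconds
# (the first value is from the question; the others add whole days)
secs <- 1430438400 + 86400 * 0:3
ts <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")

# POSIXlt stores the weekday as 0 (Sunday) to 6 (Saturday)
wd <- as.POSIXlt(ts)$wday
is_weekend <- wd %in% c(0L, 6L)

data.frame(ts, wd, is_weekend)
```

All six sample rows in the question fall on the same Friday in UTC, which is why isWeekend is FALSE for every row.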

All of this can be done in base R:

R> df <- data.frame(created_utc=c(1430438400, 1430438410, 1430438430,
+                                 1430438455, 1430438470, 1430438480))
R> df
  created_utc
1  1430438400
2  1430438410
3  1430438430
4  1430438455
5  1430438470
6  1430438480
R> 
R> # so far so good -- we just have the data
R> # so let's make it a date time object 
R> 
R> df[,1] <- as.POSIXct(df[,1], origin="1970-01-01")
R> df
          created_utc
1 2015-04-30 19:00:00
2 2015-04-30 19:00:10
3 2015-04-30 19:00:30
4 2015-04-30 19:00:55
5 2015-04-30 19:01:10
6 2015-04-30 19:01:20
R> 
R> ## we can use this to extract Date, Hour and Weekend computations
R> 
R> df[,"date"] <- as.Date(df[,1])
R> df[,"hour"] <- as.POSIXlt(df[,1])$hour
R> ## %in% is vectorised; `||` looks at a single element only and errors on longer vectors in R >= 4.3
R> df[,"isWeekend"] <- as.POSIXlt(df[,1])$wday %in% c(0, 6)
R> df
          created_utc       date hour isWeekend
1 2015-04-30 19:00:00 2015-05-01   19     FALSE
2 2015-04-30 19:00:10 2015-05-01   19     FALSE
3 2015-04-30 19:00:30 2015-05-01   19     FALSE
4 2015-04-30 19:00:55 2015-05-01   19     FALSE
5 2015-04-30 19:01:10 2015-05-01   19     FALSE
6 2015-04-30 19:01:20 2015-05-01   19     FALSE
R> 
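The base-R steps above can also be collected into a small reusable function. This is only a sketch: the function name split_utc_timestamp and its col argument are hypothetical, and it assumes that the date, hour and weekend flag should all be computed in UTC rather than in the session's local time zone.

```r
# Hypothetical helper: adds date, hour and is_weekend columns, all in UTC
split_utc_timestamp <- function(df, col = "created_utc") {
  ts <- as.POSIXct(df[[col]], origin = "1970-01-01", tz = "UTC")
  lt <- as.POSIXlt(ts)
  df$date       <- as.Date(ts, tz = "UTC")   # calendar date in UTC
  df$hour       <- lt$hour                   # 0-23
  df$is_weekend <- lt$wday %in% c(0L, 6L)    # 0 = Sunday, 6 = Saturday
  df
}

df <- data.frame(created_utc = c(1430438400, 1430438410, 1430438430,
                                 1430438455, 1430438470, 1430438480))
split_utc_timestamp(df)
```

Keeping the original numeric column and adding new columns, instead of overwriting created_utc, leaves the raw epoch values available for later use.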
