
Error when converting to POSIXct: some dates return NA

I have this df:

df <- structure(list(date = structure(c(4L, 6L, 7L, 8L, 9L, 10L, 11L, 
                                      1L, 2L, 5L, 3L), .Label = c("2018-03-24 00:24:14", "2018-03-24 00:54:00", 
                                                                  "2018-03-24 12:19:00", "2018-03-24 14:04:01", "2018-03-24 17:12:35", 
                                                                  "2018-03-24 18:58:57", "2018-03-24 20:48:50", "2018-03-24 21:37:42", 
                                                                  "2018-03-25 01:55:40", "2018-03-25 02:47:58", "2018-03-25 03:35:11"
                                      ), class = "factor")), row.names = c(NA, -11L), class = "data.frame")

I want to convert the date column to POSIXct:

library(dplyr)  # provides %>% and mutate()

df  <- df %>%
  mutate(date=as.POSIXct(date, format="%Y-%m-%d %H:%M:%OS"))

It seems to have worked:

> class(df$date)
[1] "POSIXct" "POSIXt" 

But... as you can see, one date came back as NA:

df
                  date
1  2018-03-24 14:04:01
2  2018-03-24 18:58:57
3  2018-03-24 20:48:50
4  2018-03-24 21:37:42
5  2018-03-25 01:55:40
6                 <NA>
7  2018-03-25 03:35:11
8  2018-03-24 00:24:14
9  2018-03-24 00:54:00
10 2018-03-24 17:12:35
11 2018-03-24 12:19:00

Why?

Session info:

> sessionInfo()
R version 3.6.3 (2020-02-29)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)

Matrix products: default

locale:
[1] LC_COLLATE=English_Switzerland.1252  LC_CTYPE=English_Switzerland.1252   
[3] LC_MONETARY=English_Switzerland.1252 LC_NUMERIC=C                        
[5] LC_TIME=English_Switzerland.1252    

Thanks

As @DirkEddelbuettel mentioned in the comments, this is a daylight saving time problem.

df$date
# [1] "2018-03-24 14:04:01 CET" 
# [2] "2018-03-24 18:58:57 CET" 
# [3] "2018-03-24 20:48:50 CET" 
# [4] "2018-03-24 21:37:42 CET" 
# [5] "2018-03-25 01:55:40 CET" 
# [6] "2018-03-25 02:47:58"        ##
# [7] "2018-03-25 03:35:11 CEST"
# [8] "2018-03-24 00:24:14 CET" 
# [9] "2018-03-24 00:54:00 CET" 
# [10] "2018-03-24 17:12:35 CET" 
# [11] "2018-03-24 12:19:00 CET"

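as.POSIXct appears to be right to refuse the sixth conversion, because that local time most likely never existed: on 2018-03-25 clocks in the CET/CEST zone jumped from 02:00 straight to 03:00.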

as.POSIXct("2018-03-25 02:47:58", format="%Y-%m-%d %H:%M:%S")
# [1] NA
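
To make the gap visible, here is a minimal sketch that steps one second across the switch. The zone name "Europe/Zurich" is an assumption inferred from the Swiss locale in the question; the original session simply used the system time zone.

```r
# Assumption: "Europe/Zurich" stands in for the asker's system time zone.
before_gap <- as.POSIXct("2018-03-25 01:59:59", tz = "Europe/Zurich")
after_gap  <- before_gap + 1  # one second later

format(before_gap, "%H:%M:%S %Z")
# "01:59:59 CET"
format(after_gap, "%H:%M:%S %Z")
# "03:00:00 CEST"
```

No instant in that zone carries a wall-clock time between 02:00:00 and 02:59:59 on that day, which is why 02:47:58 cannot be mapped to a POSIXct value there.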

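If you still want to keep that time, you can use strptime, which returns a POSIXlt object and keeps the fields as they were written: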

strptime("2018-03-25 02:47:58", format="%Y-%m-%d %H:%M:%S")
# [1] "2018-03-25 02:47:58"
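
The reason strptime can keep the value is that POSIXlt stores a broken-down time (separate hour, minute and second fields) rather than a count of seconds since the epoch. A small sketch, again assuming "Europe/Zurich" as a stand-in for the system time zone:

```r
# Assumption: "Europe/Zurich" stands in for the asker's system time zone.
lt <- strptime("2018-03-25 02:47:58", format = "%Y-%m-%d %H:%M:%S",
               tz = "Europe/Zurich")

class(lt)                      # POSIXlt comes first in the class vector
c(lt$hour, lt$min, lt$sec)     # the parsed fields are kept as written
```

Converting such a value back to POSIXct may still give NA or a shifted time, depending on the platform, so this keeps the label rather than a real instant.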

The whole thing:

df <- transform(df, date=strptime(df$date, format="%Y-%m-%d %H:%M:%S"))
df
#                   date
# 1  2018-03-24 14:04:01
# 2  2018-03-24 18:58:57
# 3  2018-03-24 20:48:50
# 4  2018-03-24 21:37:42
# 5  2018-03-25 01:55:40
# 6  2018-03-25 02:47:58
# 7  2018-03-25 03:35:11
# 8  2018-03-24 00:24:14
# 9  2018-03-24 00:54:00
# 10 2018-03-24 17:12:35
# 11 2018-03-24 12:19:00
str(df)
# 'data.frame': 11 obs. of  1 variable:
#  $ date: POSIXlt, format:  ...

This also works with dplyr:

df %>% mutate(date=strptime(date, format="%Y-%m-%d %H:%M:%S"))
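
If a POSIXct column is needed, one alternative (not part of the answer above) is to parse in a zone without daylight saving time, such as UTC. This is only a sketch, and it assumes the timestamps can reasonably be treated as UTC; the question does not say how they were recorded.

```r
# Three of the timestamps from the question, around the DST switch.
raw <- c("2018-03-25 01:55:40", "2018-03-25 02:47:58", "2018-03-25 03:35:11")

# Assumption: interpreting the values as UTC is acceptable for this data.
parsed_utc <- as.POSIXct(raw, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

anyNA(parsed_utc)   # FALSE: UTC has no skipped hour
class(parsed_utc)   # "POSIXct" "POSIXt"
```

The trade-off is that the values no longer carry the local CET/CEST offset, so differences to real local events can be off by one or two hours.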
