strptime, as.POSIXct and as.Date return unexpected NA

When I try to parse a timestamp in the following format, "Thu Nov 8 15:41:45 2012", only NA is returned.

I am using Mac OS X, R 2.15.2 and RStudio 0.97.237. The language of my OS is Dutch, and I presume this has something to do with it.

When I try strptime, NA is returned:

var <- "Thu Nov 8 15:41:45 2012"
strptime(var, "%a %b %d %H:%M:%S %Y")
# [1] NA

as.POSIXct does not work either:

as.POSIXct(var, format = "%a %b %d %H:%M:%S %Y")
# [1] NA

I also tried as.Date on the date part of the string above, without the %H:%M:%S components:

as.Date("Thu Nov 8 2012", "%a %b %d %Y")
# [1] NA

Any ideas what I could be doing wrong?

I think it is exactly as you guessed: strptime fails to parse your date-time string because of your locale. Your string contains both an abbreviated weekday name (%a) and an abbreviated month name (%b). These conversion specifications are described in ?strptime:

Details

%a: Abbreviated weekday name in the current locale on this platform.

%b: Abbreviated month name in the current locale on this platform.

"Note that abbreviated names are platform-specific (although the standards specify that in the C locale they must be the first three letters of the capitalized English name:"

"Knowing what the abbreviations are is essential if you wish to use %a, %b or %h as part of an input format: see the examples for how to check."

See also

[...] locales to query or set a locale.
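As a rough sketch of the check the documentation alludes to, the abbreviations that the current locale expects can be listed by formatting known dates. The variable names month_abbr and weekday_abbr are only illustrative, and the output depends on your own LC_TIME setting, so none is shown here.

```r
# Abbreviated month names (%b) as the current LC_TIME locale spells them
month_abbr <- format(ISOdate(2000, 1:12, 1), "%b")

# Abbreviated weekday names (%a); 3 to 9 January 2000 ran Monday to Sunday
weekday_abbr <- format(ISOdate(2000, 1, 3:9), "%a")

Sys.getlocale("LC_TIME")
month_abbr
weekday_abbr

# TRUE only if the current locale would recognise the names used in the question
"Nov" %in% month_abbr && "Thu" %in% weekday_abbr
```

If the last line returns FALSE, strings such as "Thu Nov 8 15:41:45 2012" cannot be matched by %a and %b in that locale.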

The locale issue is also relevant for as.POSIXct, as.POSIXlt and as.Date.

From ?as.POSIXct:

Details

If format is specified, remember that some of the format specifications are locale-specific, and you may need to set the LC_TIME category appropriately via Sys.setlocale. This most often affects the use of %b, %B (month names) and %p (AM/PM).

From ?as.Date:

Details

Locale-specific conversions to and from character strings are used where appropriate and available. This affects the names of the days and months.


Thus, if the weekday and month names in the string differ from those in the current locale, strptime, as.POSIXct and as.Date fail to parse the string and return NA.

However, you can solve this by changing the locale:

# First save your current locale
loc <- Sys.getlocale("LC_TIME")

# Set the correct locale for the strings to be parsed
# (in this particular case: English)
# so that the weekday (e.g. "Thu") and abbreviated month (e.g. "Nov") are recognized
Sys.setlocale("LC_TIME", "en_GB.UTF-8")
# or
Sys.setlocale("LC_TIME", "C")

# Then proceed as you intended
x <- "Thu Nov 8 15:41:45 2012"
strptime(x, "%a %b %d %H:%M:%S %Y")
# [1] "2012-11-08 15:41:45"

# Then set back to your old locale
Sys.setlocale("LC_TIME", loc) 
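If you need this in more than one place, the save, set and restore steps can be wrapped in a function. The following is only a sketch: strptime_in_locale is a hypothetical helper name, not part of base R, and tz = "UTC" in the usage is an assumption made to keep the result reproducible.

```r
# Hypothetical helper (not part of base R): parse `x` with `format` while
# LC_TIME is temporarily set to `locale`, then restore the previous setting.
strptime_in_locale <- function(x, format, locale = "C", tz = "") {
  old <- Sys.getlocale("LC_TIME")
  # on.exit() restores the old locale even if strptime() stops with an error
  on.exit(Sys.setlocale("LC_TIME", old), add = TRUE)
  Sys.setlocale("LC_TIME", locale)
  strptime(x, format, tz = tz)
}

x <- "Thu Nov 8 15:41:45 2012"
parsed <- strptime_in_locale(x, "%a %b %d %H:%M:%S %Y", tz = "UTC")
format(parsed, "%Y-%m-%d %H:%M:%S")
# [1] "2012-11-08 15:41:45"
```

Using on.exit means the session locale is put back automatically, so a forgotten final Sys.setlocale call cannot leave LC_TIME changed.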

With my own locale I can reproduce your error:

Sys.setlocale("LC_TIME", loc)
# [1] "fr_FR.UTF-8"

strptime(var, "%a %b %d %H:%M:%S %Y")
# [1] NA
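If changing the locale is not an option, one possible workaround (an illustration only, not something the documentation quoted above prescribes) is to avoid %a and %b altogether. The sketch below looks the month up in month.abb, the base R constant holding the English abbreviations, and then parses a purely numeric string. The names parts, month_num and iso are made up for this example, and tz = "UTC" is an assumption.

```r
x <- "Thu Nov 8 15:41:45 2012"

# Split on whitespace: weekday, month, day, time, year
parts <- strsplit(x, " +")[[1]]

# month.abb always holds the English abbreviations,
# so this lookup does not depend on LC_TIME
month_num <- match(parts[2], month.abb)

# Rebuild a purely numeric string; the weekday is redundant and is dropped
iso <- sprintf("%s-%02d-%02d %s",
               parts[5], month_num, as.integer(parts[3]), parts[4])

as.POSIXct(iso, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
# [1] "2012-11-08 15:41:45 UTC"
```

This only works when the input is known to use English month names; for other languages the locale-based approach above remains the more general one.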

I was just wrestling with the same problem and found this solution much cleaner: there is no need to change any system settings manually, because the lubridate package provides a wrapper function that does the job. All you have to do is set the locale argument:

library(lubridate)

date <- c("23. juni 2014", "1. november 2014", "8. marts 2014", "16. juni 2014", "12. december 2014", "13. august 2014")
# "Danish" is a Windows-style locale name; on Linux or macOS a name such as "da_DK.UTF-8" is typically needed
dmy(date, locale = "Danish")
# [1] "2014-06-23" "2014-11-01" "2014-03-08" "2014-06-16" "2014-12-12" "2014-08-13"
