as.Date(as.POSIXct()) gives the wrong date?

I was trying to search a data frame for all rows where the date component of a POSIXct column matched a certain value. I came across the following, which confuses me mightily: as.Date(as.POSIXct(...)) doesn't always return the correct date.

> dt <- as.POSIXct('2012-08-06 09:35:23')
> dt
[1] "2012-08-06 09:35:23 EST"
> as.Date(dt)
[1] "2012-08-05"

Why is the date of '2012-08-06 09:35:23' equal to '2012-08-05'?

I suspect it has something to do with different time zones being used, so, noting that the time zone of dt was 'EST', I gave this to as.Date:

> as.Date(as.POSIXct('2012-08-06 09:35:23'), tz='EST')
[1] "2012-08-05"

But it still returns 2012-08-05.

Why is this? How can I find all datetimes in my data frame that fall on the date 2012-08-06? (subset(my.df, as.character(as.Date(datetime), tz='EST') == '2012-08-06') does not return the row with datetime dt, even though it did occur on 2012-08-06.)

Added details: Linux 64-bit (though I can reproduce it on 32-bit), on both R 3.0.1 and 3.0.0, and my current time zone is AEST (Australian Eastern Standard Time).

The safe way to do this is to pass the date value through format. This does create an additional step, but as.Date will accept the character result if it is formatted with "-" or "/" separators:

as.Date( format( as.POSIXct('2019-03-11 23:59:59'), "%Y-%m-%d") )
[1] "2019-03-11"

as.Date(  as.POSIXct('2019-03-11 23:59:59') ) # I'm in a locale where the problem might exist
[1] "2019-03-12"
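
For the subsetting problem in the question, the same idea can be applied to a whole column. The following is a minimal sketch rather than a definitive solution: the data frame my.df and its values are made up for illustration (only the column name datetime comes from the question), and "Australia/Brisbane" is an assumed choice for the zone in which the dates should be read.

```r
# Hypothetical data; only the column name `datetime` is taken from the question.
my.df <- data.frame(
  id = 1:3,
  datetime = as.POSIXct(
    c("2012-08-05 23:59:59", "2012-08-06 09:35:23", "2012-08-07 00:00:00"),
    tz = "Australia/Brisbane"
  )
)

# Render each instant as a calendar date in the zone of interest,
# then compare plain strings so no hidden UTC conversion takes place.
local_day <- format(my.df$datetime, "%Y-%m-%d", tz = "Australia/Brisbane")

# Keep only the rows whose local date is 2012-08-06.
hits <- my.df[local_day == "2012-08-06", ]
print(hits)
```

Passing tz explicitly to format means the result does not depend on the session's time zone setting.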

The documentation for time zones is confusing to me too. In some cases (including this one, as it turned out) "EST" is ambiguous and may actually refer to a tz in Australia. Try "EST5EDT" or "America/New_York" if you happen to be in North America.

In this case it could also relate to differences in how your unstated OS handles the 'tz' argument, since I get "2012-08-06". (I'm in the US PDT tz at the moment, although I'm not sure that should matter.) Changing which function gets the tz argument may clarify (or not):

> as.Date(as.POSIXct('2012-08-06 19:35:23', tz='EST'))
[1] "2012-08-07"
> as.Date(as.POSIXct('2012-08-06 17:35:23', tz='EST'))
[1] "2012-08-06"


> as.Date(as.POSIXct('2012-08-06 21:35:23'), tz='EST')
[1] "2012-08-06"
> as.Date(as.POSIXct('2012-08-06 22:35:23'), tz='EST')
[1] "2012-08-07"

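If you omit tz from as.POSIXct, the string is interpreted in the session's local time zone (PDT for me in the second pair above), not in UTC. as.Date then reports the date of that instant in the tz it is given, which is UTC unless you say otherwise.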

These are the unambiguous names of the Aussie TZs (at least on my Mac):

tzfile <- "/usr/share/zoneinfo/zone.tab"
tzones <- read.delim(tzfile, row.names = NULL, header = FALSE,
    col.names = c("country", "coords", "name", "comments"),
    as.is = TRUE, fill = TRUE, comment.char = "#")
grep("^Aus", tzones$name, value=TRUE)
 [1] "Australia/Lord_Howe"   "Australia/Hobart"     
 [3] "Australia/Currie"      "Australia/Melbourne"  
 [5] "Australia/Sydney"      "Australia/Broken_Hill"
 [7] "Australia/Brisbane"    "Australia/Lindeman"   
 [9] "Australia/Adelaide"    "Australia/Darwin"     
[11] "Australia/Perth"       "Australia/Eucla" 
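
Reading zone.tab depends on where the OS keeps its zoneinfo files. As a possibly more portable sketch, base R's OlsonNames() returns the zone names known to the R installation in current versions of R. The variable names all_zones and aus_zones are illustrative only, and the exact list depends on the installed time zone database.

```r
# Names of all time zones known to this R installation.
all_zones <- OlsonNames()

# Keep only the names in the Australia/ region.
aus_zones <- grep("^Australia/", all_zones, value = TRUE)
print(aus_zones)

# A name can be checked before it is passed as tz = ...
"Australia/Brisbane" %in% all_zones
```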

Fellow Australian chiming in here (Brisbane location, Win7 Enterprise 64-bit, R 3.0.1):

I can replicate your issue:

> dt <- as.POSIXct('2012-08-06 09:35:23')
> dt
[1] "2012-08-06 09:35:23 EST"
> as.Date(dt)
[1] "2012-08-05"

This happens because as.Date defaults to UTC (GMT) for POSIXct input, as listed in ?as.Date:

## S3 method for class 'POSIXct'
as.Date(x, tz = "UTC", ...) 

Forcing the POSIXct representation to UTC then works as expected:

> dt <- as.POSIXct('2012-08-06 09:35:23',tz="UTC")
> as.Date(dt)
[1] "2012-08-06"

Alternatively, matching them both to my local tz works fine too:

> dt <- as.POSIXct('2012-08-06 09:35:23',tz="Australia/Brisbane")
> as.Date(dt,tz="Australia/Brisbane")
[1] "2012-08-06"
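
Both fixes come down to making the tz used to build the POSIXct agree with the tz used to extract the date. A small sketch of the mismatch, with the time zones passed explicitly so that it does not depend on the session's settings (the names d_utc and d_local are illustrative):

```r
# 09:35 in Brisbane (UTC+10) is 23:35 on the previous day in UTC.
dt <- as.POSIXct("2012-08-06 09:35:23", tz = "Australia/Brisbane")

d_utc   <- as.Date(dt, tz = "UTC")                 # date of the instant in UTC
d_local <- as.Date(dt, tz = "Australia/Brisbane")  # date of the instant in Brisbane

# The same instant printed in UTC, to show where the earlier date comes from.
print(format(dt, "%Y-%m-%d %H:%M:%S", tz = "UTC"))
print(d_utc)
print(d_local)
```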

Edit: Ambiguity with the EST specification seems to be an issue for me:

Default use of as.POSIXct

> dt.def <- as.POSIXct("2012-01-01 22:00:00")
> dt.def
[1] "2012-01-01 22:00:00 EST"
> as.numeric(dt.def)
[1] 1325419200
> 

Ambiguous EST - should be the same as default

> dt.est <- as.POSIXct("2012-01-01 22:00:00",tz="EST")
> dt.est
[1] "2012-01-01 22:00:00 EST"
> as.numeric(dt.est)
[1] 1325473200
> 

Unambiguous Brisbane, Australia timezone

> dt.bris <- as.POSIXct("2012-01-01 22:00:00",tz="Australia/Brisbane")
> dt.bris
[1] "2012-01-01 22:00:00 EST"
> as.numeric(dt.bris )
[1] 1325419200
> 

Differences

> dt.est - dt.def
Time difference of 15 hours
> dt.est - dt.bris
Time difference of 15 hours
> dt.bris - dt.def
Time difference of 0 secs
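
The 15-hour gap is consistent with tz="EST" being read as US Eastern Standard Time (UTC-5) rather than Australian Eastern Standard Time (UTC+10). A sketch that checks this arithmetic directly, using explicit zones so it gives the same answer regardless of the session's time zone (dt.est and dt.bris follow the examples above; gap_hours is a new, illustrative name):

```r
# Same wall-clock string, two different zones.
dt.est  <- as.POSIXct("2012-01-01 22:00:00", tz = "EST")
dt.bris <- as.POSIXct("2012-01-01 22:00:00", tz = "Australia/Brisbane")

# Difference between the two instants, in hours.
gap_hours <- as.numeric(difftime(dt.est, dt.bris, units = "hours"))
print(gap_hours)

# Both instants shown in UTC, which makes the two offsets visible.
print(format(dt.est,  "%Y-%m-%d %H:%M", tz = "UTC"))
print(format(dt.bris, "%Y-%m-%d %H:%M", tz = "UTC"))
```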
