
Formatting Dates in R (Non-standard format)

I'm not new to R or to formatting dates in R, and I wouldn't normally ask this question, but I'm seeing seriously strange behavior and after 2 hours I'm no closer to resolving it.

I have imported a dataset and want to format the date/time column using as.POSIXct. The date is in a non-standard format, and I've applied what I know to be the proper format string. Here is a small part of the data that I am having trouble with; the code follows. The problem is that there are 4 NAs starting at "2015-03-08 02:00:00 PST". What gives? This seems completely random, as it happens nowhere else in any of the other 55K observations.

bad.Dates<-c("3/7/2015 14:15", "3/7/2015 14:30", "3/7/2015 14:45", "3/7/2015 15:00", 
         "3/7/2015 15:15", "3/7/2015 15:30", "3/7/2015 15:45", "3/7/2015 16:00", 
         "3/7/2015 16:15", "3/7/2015 16:30", "3/7/2015 16:45", "3/7/2015 17:00", 
         "3/7/2015 17:15", "3/7/2015 17:30", "3/7/2015 17:45", "3/7/2015 18:00", 
         "3/7/2015 18:15", "3/7/2015 18:30", "3/7/2015 18:45", "3/7/2015 19:00", 
         "3/7/2015 19:15", "3/7/2015 19:30", "3/7/2015 19:45", "3/7/2015 20:00", 
         "3/7/2015 20:15", "3/7/2015 20:30", "3/7/2015 20:45", "3/7/2015 21:00", 
         "3/7/2015 21:15", "3/7/2015 21:30", "3/7/2015 21:45", "3/7/2015 22:00", 
         "3/7/2015 22:15", "3/7/2015 22:30", "3/7/2015 22:45", "3/7/2015 23:00", 
         "3/7/2015 23:15", "3/7/2015 23:30", "3/7/2015 23:45", "3/8/2015 0:00", 
         "3/8/2015 0:15", "3/8/2015 0:30", "3/8/2015 0:45", "3/8/2015 1:00", 
         "3/8/2015 1:15", "3/8/2015 1:30", "3/8/2015 1:45", "3/8/2015 2:00", 
         "3/8/2015 2:15", "3/8/2015 2:30", "3/8/2015 2:45", "3/8/2015 3:00", 
         "3/8/2015 3:15", "3/8/2015 3:30", "3/8/2015 3:45", "3/8/2015 4:00", 
         "3/8/2015 4:15", "3/8/2015 4:30", "3/8/2015 4:45", "3/8/2015 5:00", 
         "3/8/2015 5:15", "3/8/2015 5:30", "3/8/2015 5:45", "3/8/2015 6:00", 
         "3/8/2015 6:15", "3/8/2015 6:30", "3/8/2015 6:45", "3/8/2015 7:00", 
         "3/8/2015 7:15", "3/8/2015 7:30", "3/8/2015 7:45", "3/8/2015 8:00", 
         "3/8/2015 8:15", "3/8/2015 8:30", "3/8/2015 8:45", "3/8/2015 9:00", 
         "3/8/2015 9:15", "3/8/2015 9:30", "3/8/2015 9:45", "3/8/2015 10:00", 
         "3/8/2015 10:15", "3/8/2015 10:30", "3/8/2015 10:45", "3/8/2015 11:00", 
         "3/8/2015 11:15", "3/8/2015 11:30", "3/8/2015 11:45", "3/8/2015 12:00", 
         "3/8/2015 12:15", "3/8/2015 12:30", "3/8/2015 12:45", "3/8/2015 13:00", 
         "3/8/2015 13:15", "3/8/2015 13:30", "3/8/2015 13:45", "3/8/2015 14:00", 
         "3/8/2015 14:15", "3/8/2015 14:30", "3/8/2015 14:45", "3/8/2015 15:00", 
         "3/8/2015 15:15") 

as.POSIXct(strptime(bad.Dates,"%m/%d/%Y %H:%M"))

To make this example reproducible/solvable regardless of location, specify the time zone explicitly via tz=:

bad.Dates <- c("3/8/2015 1:45", "3/8/2015 2:00", "3/8/2015 2:15",
               "3/8/2015 2:30", "3/8/2015 2:45", "3/8/2015 3:00")
as.POSIXct(bad.Dates, format="%m/%d/%Y %H:%M", tz="US/Pacific")

#[1] "2015-03-08 01:45:00 PST"
#[2] NA                       
#[3] NA                       
#[4] NA                       
#[5] NA                       
#[6] "2015-03-08 03:00:00 PDT"

You get NAs because those times don't exist in the modern-day timekeeping of the US Pacific region.

Most of the United States, Canada, and Mexico's northern border cities will begin Daylight Saving Time (DST) on Sunday, March 8, 2015. People in areas that observe DST will spring forward one hour from 2am (02:00) to 3am (03:00), local time.
Source: http://www.timeanddate.com/news/time/usa-canada-start-dst-2015.html
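
To see which raw strings fall into the skipped hour, one option is to parse the same vector twice, once in the DST-observing zone and once in UTC, and compare the results. This is only a minimal sketch: the names fmt, local, utc, and in_gap are illustrative, and how nonexistent local times are handled may depend on the platform and R version, so NA for the gap times is what the output above shows rather than a guarantee.

```r
bad.Dates <- c("3/8/2015 1:45", "3/8/2015 2:00", "3/8/2015 2:15",
               "3/8/2015 2:30", "3/8/2015 2:45", "3/8/2015 3:00")
fmt <- "%m/%d/%Y %H:%M"

# Parse once in the DST-observing zone and once in UTC (which has no DST).
local <- as.POSIXct(bad.Dates, format = fmt, tz = "US/Pacific")
utc   <- as.POSIXct(bad.Dates, format = fmt, tz = "UTC")

# Strings that parse in UTC but not in the local zone are well-formed
# wall-clock times that the local zone skipped, not malformed input.
in_gap <- is.na(local) & !is.na(utc)
bad.Dates[in_gap]
```

Running the same comparison on the full 55K-row column would separate DST-gap rows from rows that are NA because the string itself is malformed (those would be NA in both parses).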

Specifying a time zone like "UTC" that doesn't observe daylight saving time will get around this issue.

as.POSIXct(bad.Dates, format="%m/%d/%Y %H:%M", tz="UTC")
#[1] "2015-03-08 01:45:00 UTC"
#[2] "2015-03-08 02:00:00 UTC"
#[3] "2015-03-08 02:15:00 UTC"
#[4] "2015-03-08 02:30:00 UTC"
#[5] "2015-03-08 02:45:00 UTC"
#[6] "2015-03-08 03:00:00 UTC"
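
Note that tz="UTC" labels the wall-clock values as UTC, so the resulting instants are offset from true Pacific time. If the data were actually logged in Pacific Standard Time all year round (an assumption about the data source, not something stated in the question), a fixed-offset zone is one possible alternative. The sketch below uses "Etc/GMT+8", which is UTC-8 (the sign in Etc/ zone names is inverted); the variable name fixed is illustrative.

```r
bad.Dates <- c("3/8/2015 1:45", "3/8/2015 2:00", "3/8/2015 2:15",
               "3/8/2015 2:30", "3/8/2015 2:45", "3/8/2015 3:00")

# Assumption: the logger never switched to DST, so every timestamp is UTC-8.
# "Etc/GMT+8" is a fixed UTC-8 zone with no DST transitions.
fixed <- as.POSIXct(bad.Dates, format = "%m/%d/%Y %H:%M", tz = "Etc/GMT+8")

# A fixed-offset zone has no skipped hour, so nothing is lost to the gap.
anyNA(fixed)

# The same instants can still be displayed in DST-aware local time.
format(fixed, tz = "US/Pacific", usetz = TRUE)
```

Whether UTC or a fixed offset is the right choice depends on how the timestamps were recorded, which only the data source can confirm.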
