简体   繁体   中英

Formatting Dates in R (Non-standard format)

Not new to R or formatting dates in R and wouldn't be asking this question but I am having seriously strange behavior and in the last 2 hours am no closer to resolving it.

I have a dataset which I have imported and want to format the date/time column using as.POSIXct . The date is a non-standard format and I've applied what I know to be the proper formatting. Here is a small part of the data that I am having trouble with. Code just after. Problem is that there are 4 NA's starting at "2015-03-08 02:00:00 PST" . What gives? This seems completely random as it happens no where else in any of the other 55K observations.

bad.Dates<-c("3/7/2015 14:15", "3/7/2015 14:30", "3/7/2015 14:45", "3/7/2015 15:00", 
         "3/7/2015 15:15", "3/7/2015 15:30", "3/7/2015 15:45", "3/7/2015 16:00", 
         "3/7/2015 16:15", "3/7/2015 16:30", "3/7/2015 16:45", "3/7/2015 17:00", 
         "3/7/2015 17:15", "3/7/2015 17:30", "3/7/2015 17:45", "3/7/2015 18:00", 
         "3/7/2015 18:15", "3/7/2015 18:30", "3/7/2015 18:45", "3/7/2015 19:00", 
         "3/7/2015 19:15", "3/7/2015 19:30", "3/7/2015 19:45", "3/7/2015 20:00", 
         "3/7/2015 20:15", "3/7/2015 20:30", "3/7/2015 20:45", "3/7/2015 21:00", 
         "3/7/2015 21:15", "3/7/2015 21:30", "3/7/2015 21:45", "3/7/2015 22:00", 
         "3/7/2015 22:15", "3/7/2015 22:30", "3/7/2015 22:45", "3/7/2015 23:00", 
         "3/7/2015 23:15", "3/7/2015 23:30", "3/7/2015 23:45", "3/8/2015 0:00", 
         "3/8/2015 0:15", "3/8/2015 0:30", "3/8/2015 0:45", "3/8/2015 1:00", 
         "3/8/2015 1:15", "3/8/2015 1:30", "3/8/2015 1:45", "3/8/2015 2:00", 
         "3/8/2015 2:15", "3/8/2015 2:30", "3/8/2015 2:45", "3/8/2015 3:00", 
         "3/8/2015 3:15", "3/8/2015 3:30", "3/8/2015 3:45", "3/8/2015 4:00", 
         "3/8/2015 4:15", "3/8/2015 4:30", "3/8/2015 4:45", "3/8/2015 5:00", 
         "3/8/2015 5:15", "3/8/2015 5:30", "3/8/2015 5:45", "3/8/2015 6:00", 
         "3/8/2015 6:15", "3/8/2015 6:30", "3/8/2015 6:45", "3/8/2015 7:00", 
         "3/8/2015 7:15", "3/8/2015 7:30", "3/8/2015 7:45", "3/8/2015 8:00", 
         "3/8/2015 8:15", "3/8/2015 8:30", "3/8/2015 8:45", "3/8/2015 9:00", 
         "3/8/2015 9:15", "3/8/2015 9:30", "3/8/2015 9:45", "3/8/2015 10:00", 
         "3/8/2015 10:15", "3/8/2015 10:30", "3/8/2015 10:45", "3/8/2015 11:00", 
         "3/8/2015 11:15", "3/8/2015 11:30", "3/8/2015 11:45", "3/8/2015 12:00", 
         "3/8/2015 12:15", "3/8/2015 12:30", "3/8/2015 12:45", "3/8/2015 13:00", 
         "3/8/2015 13:15", "3/8/2015 13:30", "3/8/2015 13:45", "3/8/2015 14:00", 
         "3/8/2015 14:15", "3/8/2015 14:30", "3/8/2015 14:45", "3/8/2015 15:00", 
         "3/8/2015 15:15") 

as.POSIXct(strptime(bad.Dates,"%m/%d/%Y %H:%M"))

To make this example reproducible/solvable regardless of location, specify the timezones via tz= explicitly:

bad.Dates <- c("3/8/2015 1:45", "3/8/2015 2:00", "3/8/2015 2:15",
               "3/8/2015 2:30", "3/8/2015 2:45", "3/8/2015 3:00")
as.POSIXct(bad.Dates, format="%m/%d/%Y %H:%M", tz="US/Pacific")

#[1] "2015-03-08 01:45:00 PST"
#[2] NA                       
#[3] NA                       
#[4] NA                       
#[5] NA                       
#[6] "2015-03-08 03:00:00 PDT"

You get NA s because those times don't exist in the modern-day timekeeping of the US Pacific region.

Most of the United States, Canada, and Mexico's northern border cities will begin Daylight Saving Time (DST) on Sunday, March 8, 2015. People in areas that observe DST will spring forward one hour from 2am (02:00) to 3am (03:00), local time.
Source: http://www.timeanddate.com/news/time/usa-canada-start-dst-2015.html

Specifying a timezone like "UTC" that doesn't observe daylight savings will get around this issue.

as.POSIXct(bad.Dates, format="%m/%d/%Y %H:%M", tz="UTC")
#[1] "2015-03-08 01:45:00 UTC"
#[2] "2015-03-08 02:00:00 UTC"
#[3] "2015-03-08 02:15:00 UTC"
#[4] "2015-03-08 02:30:00 UTC"
#[5] "2015-03-08 02:45:00 UTC"
#[6] "2015-03-08 03:00:00 UTC"

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM