
Conversion of raw data to R date/time class

I've successfully scraped a bunch of tables that I now want to manipulate in R. I'm new to R, but this seems like a good way to figure it out.

I currently have data stored in a pipe-delimited (|) CSV that looks roughly like this:

FIRST|LAST|9812036311|string-string|1999-07-06 00:00:00|2000-07-06 00:00:00|12345|1999-07-27 00:00:00|2,518.50

I can read it in with:

j <- read.table('my_data.csv', header = FALSE, sep = "|")

but ... how do I transform those date columns into dates that R can read?

Do I need to define a data frame first?

Check out the colClasses argument of read.table to tell R what type of data to expect in each column of your CSV. Also look at the lubridate package for dealing with date formats.
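A minimal sketch of the colClasses idea, assuming the file has the nine columns shown in the question's sample row. The sample text, the date column positions (5, 6, 8) and the default V1..V9 column names are assumptions made for illustration. Everything is read as character first, so the long ID and the comma-formatted amount are not mangled, and the date columns are then parsed with base R's as.POSIXct:

```r
# Hypothetical one-row sample in the same layout as the question's file
raw <- "FIRST|LAST|9812036311|string-string|1999-07-06 00:00:00|2000-07-06 00:00:00|12345|1999-07-27 00:00:00|2,518.50"

# Read every column as character; colClasses is recycled across columns
j <- read.table(text = raw, header = FALSE, sep = "|",
                colClasses = "character", quote = "", comment.char = "")

# Parse the date-time columns (positions assumed from the sample row)
date_cols <- c(5, 6, 8)
j[date_cols] <- lapply(j[date_cols], as.POSIXct,
                       format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Strip the thousands separator before converting the amount to numeric
j$V9 <- as.numeric(gsub(",", "", j$V9, fixed = TRUE))

str(j)
```

For a real file, replace text = raw with the file name. The tz = "UTC" choice is also an assumption; use the time zone the timestamps were recorded in.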

To convert date-time strings like yours, I suggest using the as.POSIXct function. (Setting the class with structure is not enough for strings: it only relabels the character vector without parsing it.)

> # begin with date-time values stored as character vectors ("strings"):

> w
   [1] "2000-07-06 00:00:00"

> typeof(w)
   [1] "character"

> class(w)
   [1] "character"

> # use as.POSIXct to parse them
> w = as.POSIXct(w, format="%Y-%m-%d %H:%M:%S", tz="UTC")

> # verify that they are indeed R-recognizable date-time objects:
> class(w)
   [1] "POSIXct" "POSIXt"

> typeof(w)
   [1] "double"

As for the mechanics: these R functions are vectorized, so you can pass in a whole column and assign the result back to the same column. If the column already holds numeric epoch seconds, setting the class with structure is enough, like so:

> j$day
    [1] 1353967580 1353967581 1353967583 1353967584 1353967585 1353967586 
    [7] 1353967587 1353967588 1353967589 1353967590

> j$day = structure(j$day, class=c("POSIXct", "POSIXt"))

> # values print in the session's local time zone (PST here)
> j$day
    [1] "2012-11-26 14:06:20 PST" "2012-11-26 14:06:21 PST" "2012-11-26 14:06:23 PST"
    [4] "2012-11-26 14:06:24 PST" "2012-11-26 14:06:25 PST" "2012-11-26 14:06:26 PST"
    [7] "2012-11-26 14:06:27 PST" "2012-11-26 14:06:28 PST" "2012-11-26 14:06:29 PST"
   [10] "2012-11-26 14:06:30 PST"
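As a cross-check on the epoch-seconds example above, here is a small sketch comparing the class-setting approach with as.POSIXct and an explicit origin. The variable names are made up for illustration, and only the first three values are used:

```r
# First three epoch-second values from the example above
secs <- c(1353967580, 1353967581, 1353967583)

# Relabel the numbers as date-times by setting the class
via_structure <- structure(secs, class = c("POSIXct", "POSIXt"))

# Same conversion through the documented constructor
via_as_posixct <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")

# Both hold the same underlying instants
all(as.numeric(via_structure) == as.numeric(via_as_posixct))
# [1] TRUE

# Format explicitly in UTC so the result does not depend on the session
format(via_as_posixct, "%Y-%m-%d %H:%M:%S", tz = "UTC")
# [1] "2012-11-26 22:06:20" "2012-11-26 22:06:21" "2012-11-26 22:06:23"
```

The UTC times are eight hours ahead of the PST values printed above, which is why passing tz explicitly is usually the safer habit.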

It turns out I only needed as.Date. As a sanity check, I created a new column holding the converted dates.

j$better.date <- as.Date(j$original.date, format="%Y-%m-%d")
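A sketch of what such a sanity check might look like, assuming the values are in year-month-day order as in the question's sample row. The two-row data frame below is a hypothetical stand-in for the real data:

```r
# Hypothetical stand-in for the question's data frame
j <- data.frame(original.date = c("1999-07-06 00:00:00", "1999-07-27 00:00:00"),
                stringsAsFactors = FALSE)

# as.Date parses the leading date and ignores the trailing time text
j$better.date <- as.Date(j$original.date, format = "%Y-%m-%d")

# Check 1: no value failed to parse (failures come back as NA)
sum(is.na(j$better.date))
# [1] 0

# Check 2: the converted date matches the first 10 characters of the original
all(format(j$better.date, "%Y-%m-%d") == substr(j$original.date, 1, 10))
# [1] TRUE

# A day/month mix-up in the format string shows up as NA for days above 12
is.na(as.Date("1999-07-27 00:00:00", format = "%Y-%d-%m"))
# [1] TRUE
```

Keeping the original column next to the converted one makes it easy to spot rows where the format string did not match.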
