Using strptime on NA values

Question

I need to use the strptime function to convert timestamps that look like the following:

Tue Feb 11 12:18:36 +0000 2014
Tue Feb 11 12:23:22 +0000 2014
Tue Feb 11 12:26:26 +0000 2014
Tue Feb 11 12:28:02 +0000 2014

As required, I have copied these into a CSV file and read it into R:

timestamp_data <- read.table('timestamp_data.csv')

I then tried to convert it to recognized date-times using:

timestamp_data_formatted <- strptime(timestamp_data[,1], format ="%a %b %d %H:%M:%S %z %Y")

I still get NA values when I try to view the formatted data in R. I think the problem is that when I view my imported CSV data in R, instead of showing '+0000' it simply shows 0. How can I fix this?

Answer 1

You're using read.table, not read.csv. The former splits on whitespace by default, so each datetime is split into multiple columns, and the '+0000' field is read as the integer 0:

df <- read.table(text = 'Tue Feb 11 12:18:36 +0000 2014
Tue Feb 11 12:23:22 +0000 2014
Tue Feb 11 12:26:26 +0000 2014
Tue Feb 11 12:28:02 +0000 2014')

df
#>    V1  V2 V3       V4 V5   V6
#> 1 Tue Feb 11 12:18:36  0 2014
#> 2 Tue Feb 11 12:23:22  0 2014
#> 3 Tue Feb 11 12:26:26  0 2014
#> 4 Tue Feb 11 12:28:02  0 2014

str(df)
#> 'data.frame':    4 obs. of  6 variables:
#>  $ V1: Factor w/ 1 level "Tue": 1 1 1 1
#>  $ V2: Factor w/ 1 level "Feb": 1 1 1 1
#>  $ V3: int  11 11 11 11
#>  $ V4: Factor w/ 4 levels "12:18:36","12:23:22",..: 1 2 3 4
#>  $ V5: int  0 0 0 0
#>  $ V6: int  2014 2014 2014 2014
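
As a quick check that the separator is the cause, the two behaviours can be compared on the same text. This is a minimal sketch; the names txt, split_df and whole_df are made up for illustration, and passing sep = "," to read.table is just one way to keep each line as a single field.

```r
# Two of the sample timestamps from the question
txt <- "Tue Feb 11 12:18:36 +0000 2014
Tue Feb 11 12:23:22 +0000 2014"

# Default sep = "" splits on any whitespace, so each line becomes six fields
split_df <- read.table(text = txt)
ncol(split_df)

# An explicit comma separator keeps each line as one character field
whole_df <- read.table(text = txt, sep = ",", stringsAsFactors = FALSE)
ncol(whole_df)
whole_df$V1
```

With the whole line kept as one string, the '+0000' offset is still present for %z to match.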

If you use read.csv (with sensible arguments), it works:

df <- read.csv(text = 'Tue Feb 11 12:18:36 +0000 2014
Tue Feb 11 12:23:22 +0000 2014
Tue Feb 11 12:26:26 +0000 2014
Tue Feb 11 12:28:02 +0000 2014', header = FALSE, stringsAsFactors = FALSE)

df$datetime <- as.POSIXct(df$V1, format = '%a %b %d %H:%M:%S %z %Y', tz = 'UTC')

df
#>                               V1            datetime
#> 1 Tue Feb 11 12:18:36 +0000 2014 2014-02-11 12:18:36
#> 2 Tue Feb 11 12:23:22 +0000 2014 2014-02-11 12:23:22
#> 3 Tue Feb 11 12:26:26 +0000 2014 2014-02-11 12:26:26
#> 4 Tue Feb 11 12:28:02 +0000 2014 2014-02-11 12:28:02

str(df)
#> 'data.frame':    4 obs. of  2 variables:
#>  $ V1      : chr  "Tue Feb 11 12:18:36 +0000 2014" "Tue Feb 11 12:23:22 +0000 2014" "Tue Feb 11 12:26:26 +0000 2014" "Tue Feb 11 12:28:02 +0000 2014"
#>  $ datetime: POSIXct, format: "2014-02-11 12:18:36" "2014-02-11 12:23:22" ...

I'm using as.POSIXct here instead of strptime because the former is usually what you'll need, but strptime works now, too.
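
To illustrate the difference between the two, here is a small sketch that parses the same strings both ways. strptime returns a POSIXlt object (a list of date-time components), while as.POSIXct returns a POSIXct object (a single number of seconds per element), which tends to fit better in a data frame column. The variable names are illustrative, and the Sys.setlocale call is an assumption added so that %a and %b match English day and month names regardless of the session locale.

```r
# Assumption: force the C locale so "Tue" and "Feb" are recognised by %a and %b
Sys.setlocale("LC_TIME", "C")

x <- c("Tue Feb 11 12:18:36 +0000 2014",
       "Tue Feb 11 12:23:22 +0000 2014")
fmt <- "%a %b %d %H:%M:%S %z %Y"

lt <- strptime(x, format = fmt, tz = "UTC")    # list-based representation
ct <- as.POSIXct(x, format = fmt, tz = "UTC")  # numeric representation

class(lt)  # "POSIXlt" "POSIXt"
class(ct)  # "POSIXct" "POSIXt"
```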

Answer 2

I find the lubridate package makes date handling a lot easier, and read_csv from readr / tidyverse doesn't convert strings to factors automatically.

library(lubridate)
library(tidyverse)

timestamp_data <- read_csv('timestamp_data.csv', col_names = FALSE)
timestamp_data$parsed_date <- parse_date_time(timestamp_data$X1, "%a %b %d %H:%M:%S %z %Y")
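
Whichever parser is used, it can help to check for values that failed to parse rather than discovering the NAs later. The following is a base R sketch under stated assumptions: the vector raw is invented for illustration, with a second element "0" standing in for a mangled field like the one in the question, and the locale is set to C so the English day and month abbreviations match.

```r
# Assumption: C locale so that %a and %b match English abbreviations
Sys.setlocale("LC_TIME", "C")

# Hypothetical input: one well-formed timestamp and one mangled value
raw <- c("Tue Feb 11 12:18:36 +0000 2014", "0")

parsed <- as.POSIXct(raw, format = "%a %b %d %H:%M:%S %z %Y", tz = "UTC")

# Elements that did not match the format come back as NA
failed <- raw[is.na(parsed)]
failed
```

Inspecting failed shows the raw strings that need attention, which usually points back to how the file was read.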

2 answers

Answer 1
0, accepted, 2019-01-13 02:40:42

Answer 2
0, 2019-01-13 15:45:24
