
R date parsing using read_excel function

When I use the read_excel function, the dates in the spreadsheet column FuelEventDateTime, which are in the format "dd/mm/yyyy hh:mm:ss" (example: 03/05/2019 9:19:00 AM), are parsed as character strings that look like 43588.849xxxxx (x being any digit). I cannot set this column to the correct date class, and I don't know what that number means, although I have seen it several times in Excel.

I tried splitting the character string at the ".", converting the column with as.numeric, and several functions from lubridate, base R and the anydate library, since the number may be a date in epoch format with origin "1900-01-01".

Read data:

sys_raw <- read_excel("Advanced Fill-Ups Report 15052019_165240.xlsx", sheet = "Data", col_names = FALSE) 

col_names_sys <- sys_raw[11,] 

sys_tidy <- sys_raw[12:ncol(sys_raw),] %>% 
  setNames(col_names_sys) %>% 
  select(DeviceName, FuelEventDateTime,FuelUsedEventDistance)

I noticed the dates came through as character strings of numbers, so I tried splitting them at the "." and converting to numeric:

sys_tidy <- sys_tidy %>% 
  mutate(FuelEventDateTime = str_split(FuelEventDateTime, "\\.")) %>% 
  separate(FuelEventDateTime, c("c","date","time")) %>% 
  separate(DeviceName, c("Device"), sep = "\\s") %>% 
  select(Device, date, FuelUsedEventDistance) %>% 
  mutate(date = as.numeric(date)) 

sys_tidy <- sys_tidy %>% 
  as.Date(date, origin = "1900-01-01") 

This produces errors. The expected result is a column date of class Date in the format "dd/mm/yyyy"; I don't need the time.

Example of error messages:

Error in as.Date.default(., date, origin = "1900-01-01") : do not know how to convert '.' to class “Date”

Error in as.POSIXct.default(., date, origin = "1900-01-01") : do not know how to convert '.' to class “POSIXct”

sys_tidy <- sys_tidy %>% 
   as.Date(date, origin = "1900-01-01") 

You probably mean

sys_tidy <- sys_tidy %>% 
   mutate(date = as.Date(date, origin = "1900-01-01"))

Otherwise you are passing a data frame as the first argument of as.Date, and R doesn't know what to do with that. From ?as.Date: The as.Date methods accept character strings, factors, logical NA and objects of classes "POSIXlt" and "POSIXct".

mutate, from dplyr, understands that you are working with one or more columns of the data frame (sys_tidy) that was fed into it with the %>% pipe, and assigns the output to the column called date in that data frame.
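To make the difference concrete, here is a minimal sketch on a made-up two-row data frame (toy and its values are hypothetical stand-ins for sys_tidy after the as.numeric step). It assumes dplyr is installed, and it uses origin "1899-12-30", the value ?as.Date gives for Windows Excel serial numbers, rather than the "1900-01-01" from the question.

```r
library(dplyr)

# Hypothetical stand-in for sys_tidy once the date column is numeric
toy <- data.frame(
  Device = c("A1", "B2"),
  date   = c(43588, 43589)
)

# Piping the data frame straight into as.Date() makes the whole data frame
# the first argument, so the call fails; the message is captured as text
piped <- tryCatch(
  toy %>% as.Date(date, origin = "1899-12-30"),
  error = function(e) conditionMessage(e)
)

# Inside mutate(), as.Date() only sees the date column
# Assumption: the workbook was written by Excel on Windows (1900 date system)
fixed <- toy %>%
  mutate(date = as.Date(date, origin = "1899-12-30"))

class(fixed$date)
```

Here piped ends up holding an error message of the same kind as the ones quoted in the question, while fixed$date is a proper Date column.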

The base R equivalent is similar, but the input and the output must both specify where the date column lives, namely within the sys_tidy data frame.

sys_tidy$date = as.Date(sys_tidy$date, origin = "1900-01-01")
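One caveat about the origin: ?as.Date notes that for dates coming from Excel on Windows the origin to use is "1899-12-30", not "1900-01-01", because of how Excel numbers its days and treats 1900 as a leap year. Below is a rough base R sketch under that assumption. The data frame and its values are made up to mimic the text that read_excel returned, and date_label is a hypothetical column name.

```r
# Hypothetical values, stored as text the way read_excel() returned them
sys_tidy <- data.frame(
  Device = c("A1", "B2"),
  FuelEventDateTime = c("43588.388194", "43589.75"),
  stringsAsFactors = FALSE
)

# The integer part counts days and the fraction is the time of day,
# so floor() drops the time once the text is converted to numeric
serial <- floor(as.numeric(sys_tidy$FuelEventDateTime))

# Assumption: the workbook uses the Windows Excel 1900 date system
sys_tidy$date <- as.Date(serial, origin = "1899-12-30")

# A Date column prints as yyyy-mm-dd; format() gives dd/mm/yyyy text
sys_tidy$date_label <- format(sys_tidy$date, "%d/%m/%Y")

sys_tidy[, c("Device", "date", "date_label")]
```

With this origin, serial 43588 becomes 03/05/2019, the date in the question's example. Using "1900-01-01" instead would land two days later. Note that date_label is character, so keep the Date column for sorting or date arithmetic and use the formatted one only for display.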
