R data table recommended way to deal with date time

I have a CSV file with one column of timestamps such as "2000-01-01 12:00:00.123456". What's the recommended way to deal with it in data.table? I need to handle grouping, matching/rolling joins with an IDate column from another table, time series plotting, etc.

IDateTime("2000-01-01 12:00:00.123456")

Error in if (any(neg)) res[neg] = paste("-", res[neg], sep = "") :
missing value where TRUE/FALSE needed

I see this answer in the possible duplicate question, in which Matthew suggested manually casting dates into integers. But that's 3 years old, and I wonder if there now exists a better way?

IDateTime requires a POSIXct class object in order to work properly (it seems to work properly with a factor conversion too, not sure why). I agree it isn't documented very well, and it may be worth opening an FR/PR on GitHub regarding the documentation, though there is already an open request regarding an IDateTime vignette. There is also already an FR about allowing it to work with the character class.

IDateTime(as.POSIXct("2000-01-01 12:00:00.123456"))
#         idate    itime
# 1: 2000-01-01 12:00:00
## IDateTime(factor("2000-01-01 12:00:00.123456")) ## will also work

Pay attention to the tz parameter of as.POSIXct if you want to avoid unexpected behaviour.
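As a minimal sketch of the tz point, the snippet below parses the question's timestamp with an explicit tz = "UTC". The zone is an arbitrary choice for illustration (use whatever zone your data was recorded in), and "America/New_York" is likewise only an example name.

```r
library(data.table)

ts_chr <- "2000-01-01 12:00:00.123456"

# Explicit tz: the parsed instant no longer depends on the session time zone.
x_utc <- as.POSIXct(ts_chr, tz = "UTC")

# Split into integer-backed date and time-of-day columns.
dt <- IDateTime(x_utc)
print(dt)

# The same wall-clock string read in another zone is a different instant,
# which is the kind of surprise an implicit tz can cause.
x_ny <- as.POSIXct(ts_chr, tz = "America/New_York")
offset_secs <- as.numeric(x_ny) - as.numeric(x_utc)
print(offset_secs)
```

Fixing the zone at parse time keeps the idate and itime columns stable regardless of the machine the script runs on.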


Regardless, it seems the error is actually caused by the print method of ITime, which calls format.ITime (see here and here). E.g., if you run res <- IDateTime("2015-09-29 08:22:00"), this will not produce an error, though res will be NA due to a wrong conversion (I believe) in here (the format is only "%H:%M:%OS"). It seems like a bug to me, and I am still uncertain why the factor class works correctly when there is no factor method in methods(as.ITime). Maybe it is due to its integer internal storage mode, which calls another related method.

Depending on the precision required for your time fields, you may need to use POSIXct instead of IDateTime.
The timestamp format stored in your source file can be reproduced in R by format(Sys.time(), "%Y-%m-%d %H:%M:%OS6").
When using IDateTime you will lose the subseconds; you can play with ITime and see if it fits your needs.
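To illustrate the precision trade-off, here is a small sketch comparing what POSIXct and ITime retain for the question's timestamp. tz = "UTC" is an assumption made only to keep the example reproducible.

```r
library(data.table)

x <- as.POSIXct("2000-01-01 12:00:00.123456", tz = "UTC")

# POSIXct is a double (seconds since the epoch), so the fraction survives
# up to floating-point precision.
frac <- as.numeric(x) %% 1

# Print with microseconds; the last digit may be off by one because of
# the double representation.
stamp <- format(x, "%Y-%m-%d %H:%M:%OS6")

# ITime stores whole seconds since midnight as an integer,
# so the fractional part is dropped.
it <- as.ITime(x)
it_type <- typeof(it)
it_secs <- as.integer(it)

print(list(frac = frac, stamp = stamp, it_type = it_type, it_secs = it_secs))
```

If sub-second ordering matters for your joins or plots, this suggests keeping the POSIXct column (possibly alongside an IDate column for date-level grouping).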
If you stick with POSIXct, be aware of the ?setNumericRounding function, which can sometimes be important as it affects ordering and joining on POSIXct's underlying numeric data type.
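A rough sketch of how the rounding setting can interact with sub-second POSIXct values follows. The two timestamps are hypothetical readings one microsecond apart. With rounding level 2, data.table ignores the last two bytes of each double, so at this magnitude the two values are expected to collapse into one group.

```r
library(data.table)

old <- getNumericRounding()  # remember the current setting (0, 1 or 2)

base <- as.POSIXct("2000-01-01 12:00:00", tz = "UTC")
# Two hypothetical readings one microsecond apart.
DT <- data.table(ts = base + c(0.123456, 0.123457), v = 1:2)

setNumericRounding(0)               # compare the doubles exactly
n_exact <- uniqueN(DT, by = "ts")

setNumericRounding(2)               # ignore the last 2 bytes of each double
n_rounded <- uniqueN(DT, by = "ts")

setNumericRounding(old)             # restore the previous setting

print(c(exact = n_exact, rounded = n_rounded))
```

Checking getNumericRounding() before keying or joining on a POSIXct column is a cheap way to rule out this source of unexpected matches.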
