R apply.weekly() returns incorrect period when converting from daily to weekly time-series
I am working with time-series data and experiencing a problem with apply.weekly(). It appears that after a certain date, the weeks do not aggregate correctly.
library(xts)
value <- c(46.40269, 47.27100 ,47.73311, 46.12858, 44.54989 ,42.79287, 41.70017 ,41.22373, 40.16180, 38.48705 ,37.02111 ,35.95312, 37.47187, 42.59649 ,49.22880, 53.96820, 57.97346, 61.22755,61.79824, 65.05720, 65.30233 ,61.86191,58.03687, 55.17815, 52.88933, 51.47876, 50.31402, 48.91674, 47.47042)
DATE <- as.Date(c("2038-01-03", "2038-01-04", "2038-01-05", "2038-01-06", "2038-01-07" ,"2038-01-08", "2038-01-09", "2038-01-10", "2038-01-11", "2038-01-12", "2038-01-13" ,"2038-01-14", "2038-01-15" ,"2038-01-16" ,"2038-01-17", "2038-01-18", "2038-01-19", "2038-01-20", "2038-01-21", "2038-01-22", "2038-01-23", "2038-01-24" ,"2038-01-25", "2038-01-26", "2038-01-27", "2038-01-28", "2038-01-29", "2038-01-30", "2038-01-31"))
DF <- data.frame(DATE, value)
DF_daily <- xts(DF$value, order.by = DF$DATE)
DF_weekly <- apply.weekly(DF_daily, FUN=sum)
print(DF_weekly)
This generates the following output:
[,1]
2038-01-03 46.40269
2038-01-10 311.39935
2038-01-16 231.69144
2038-01-31 840.70198
Notice how the final period is 15 days long. Now, if I instead use dates from 2010, I get exactly what you'd expect. That is, using
DATE <- as.Date(c("2010-01-03", "2010-01-04", "2010-01-05", "2010-01-06", "2010-01-07" ,"2010-01-08" ,"2010-01-09" ,"2010-01-10", "2010-01-11", "2010-01-12" ,"2010-01-13" ,"2010-01-14" ,"2010-01-15" ,"2010-01-16", "2010-01-17", "2010-01-18", "2010-01-19" ,"2010-01-20" ,"2010-01-21" ,"2010-01-22", "2010-01-23", "2010-01-24", "2010-01-25" ,"2010-01-26","2010-01-27" ,"2010-01-28" ,"2010-01-29" ,"2010-01-30", "2010-01-31"))
in the above code generates the output:
[,1]
2010-01-03 46.40269
2010-01-10 311.39935
2010-01-17 280.92024
2010-01-24 427.18889
2010-01-31 364.28429
Is there something weird about the year 2038 that I don't know about?
I am running this code on 64-bit Windows 7 Enterprise; sessionInfo() returns the following output:
R version 3.2.3 (2015-12-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] xts_0.9-7 zoo_1.7-12
loaded via a namespace (and not attached):
[1] tools_3.2.3 grid_3.2.3 lattice_0.20-33
January 19, 2038 is a special date: at 03:14:08 UTC, a Unix timestamp stored as a signed 32-bit integer (counting the number of seconds since midnight UTC on January 1, 1970) will overflow. Signed 32-bit integers have a maximum value of 2,147,483,647, and many systems store timestamps this way. It is possible that a bug in timestamp handling causes the counter to break around this date.
This is called the "Year 2038 Problem", similar to the Y2K problem.
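To make the arithmetic concrete, here is a small base-R sketch that converts the largest signed 32-bit value into a timestamp. It assumes the counter is read as seconds since 1970-01-01 in UTC; the variable names are only illustrative.

```r
# Largest value a signed 32-bit integer can hold
max_int32 <- .Machine$integer.max          # 2147483647

# Interpret that value as seconds since the Unix epoch, in UTC
last_second <- as.POSIXct(max_int32, origin = "1970-01-01", tz = "UTC")
format(last_second, "%Y-%m-%d %H:%M:%S")   # "2038-01-19 03:14:07"

# One more second does not fit in R's 32-bit integer type:
# integer arithmetic overflows to NA (with a warning)
next_second <- suppressWarnings(max_int32 + 1L)
is.na(next_second)                         # TRUE
```

So 03:14:07 UTC is the last second that still fits, and the overflow happens one second later.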
However, the R Date type is stored as the number of days, not the number of seconds, since the Unix epoch, so Date values themselves are nowhere near a 32-bit limit in 2038. To me, this suggests that there is an issue with the xts package.
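A quick base-R check of the day-count representation is sketched below. The second half only illustrates what would happen if some code converted those day counts into whole seconds held in a 32-bit integer; whether xts or the underlying system library actually does this is an assumption, not something confirmed here.

```r
# A Date is just a count of days since 1970-01-01
d <- as.Date("2038-01-19")
days <- as.integer(unclass(d))       # 24855, tiny compared to 2^31 - 1

# Hypothetical conversion to seconds (midnight of each day)
secs_19 <- as.numeric(as.Date("2038-01-19")) * 86400
secs_20 <- as.numeric(as.Date("2038-01-20")) * 86400

secs_19 <= .Machine$integer.max      # TRUE: still fits in 32 bits
secs_20 >  .Machine$integer.max      # TRUE: would no longer fit
```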
You are not alone in this problem (there is a 2012 discussion of it on a mailing list), and it appears the bug comes from a bad handoff between system date handling and R date handling.
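If the weekly totals are needed regardless, one possible workaround is to compute the week boundaries from the Date day counts directly, which avoids any conversion to seconds. The sketch below uses only base R rather than xts, and it assumes weeks run Monday through Sunday (inferred from the 2010 output in the question, where each period ends on a Sunday). The names week_id, weekly_sum and week_end are illustrative.

```r
# Data from the question
value <- c(46.40269, 47.27100, 47.73311, 46.12858, 44.54989, 42.79287,
           41.70017, 41.22373, 40.16180, 38.48705, 37.02111, 35.95312,
           37.47187, 42.59649, 49.22880, 53.96820, 57.97346, 61.22755,
           61.79824, 65.05720, 65.30233, 61.86191, 58.03687, 55.17815,
           52.88933, 51.47876, 50.31402, 48.91674, 47.47042)
DATE <- seq(as.Date("2038-01-03"), as.Date("2038-01-31"), by = "day")

# 1970-01-01 (day 0) was a Thursday, so day -3 was a Monday.
# Shifting by 3 and dividing by 7 gives one index per Monday-Sunday week.
week_id <- (as.integer(DATE) + 3L) %/% 7L

# Sum the values within each week and label each week by its last date
weekly_sum <- tapply(value, week_id, sum)
week_end <- as.Date(as.vector(tapply(as.integer(DATE), week_id, max)),
                    origin = "1970-01-01")

result <- data.frame(week_end = week_end, value = as.numeric(weekly_sum))
print(result)
```

With the question's data this yields five periods ending on 2038-01-03, 2038-01-10, 2038-01-17, 2038-01-24 and 2038-01-31, with the same sums as the 2010 run.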