First data point not considered in R timeSeries when averaging using `aggregate()`; how to correctly employ the function?

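I want to build daily averages of hourly electricity prices for the NordPool market. I am using the `aggregate()` method from the timeSeries package to construct daily means from the hourly data, which I have converted to a `timeSeries` object. Here is the `dput()` of the first 72 hours: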

    > dput(tstSeries)
    new("timeSeries"
    , .Data = structure(c(31.05, 30.47, 28.92, 27.88, 26.96, 27.84, 28.79, 
    28.63, 28.44, 28.3, 30.65, 31.55, 32.16, 32.45, 32.63, 33.65, 
    34.9, 36.22, 36.65, 36.37, 35.49, 34.41, 34.66, 32.55, 33.15, 
    32.66, 31.83, 31.47, 32.56, 34.36, 36.28, 38.39, 39.09, 38.33, 
    38.42, 38.25, 37.96, 37.89, 37.88, 38.78, 39.83, 39.91, 39.32, 
    38.49, 37.46, 36.94, 36.37, 34.59, 33.11, 32.22, 31.46, 31.67, 
    32.05, 33.67, 34.93, 35.82, 36.38, 36.52, 36.71, 36.6, 36.51, 
    36.4, 36.42, 36.58, 36.94, 36.94, 36.81, 36.43, 35.91, 35.45, 
    34.77, 32.09), .Dim = c(72L, 1L), .Dimnames = list(NULL, "TS.1"))
    , units = "TS.1"
    , positions = c(1356998400, 1357002000, 1357005600, 1357009200, 1357012800, 
    1357016400, 1357020000, 1357023600, 1357027200, 1357030800, 1357034400, 
    1357038000, 1357041600, 1357045200, 1357048800, 1357052400, 1357056000, 
    1357059600, 1357063200, 1357066800, 1357070400, 1357074000, 1357077600, 
    1357081200, 1357084800, 1357088400, 1357092000, 1357095600, 1357099200, 
    1357102800, 1357106400, 1357110000, 1357113600, 1357117200, 1357120800, 
    1357124400, 1357128000, 1357131600, 1357135200, 1357138800, 1357142400, 
    1357146000, 1357149600, 1357153200, 1357156800, 1357160400, 1357164000, 
    1357167600, 1357171200, 1357174800, 1357178400, 1357182000, 1357185600, 
    1357189200, 1357192800, 1357196400, 1357200000, 1357203600, 1357207200, 
    1357210800, 1357214400, 1357218000, 1357221600, 1357225200, 1357228800, 
    1357232400, 1357236000, 1357239600, 1357243200, 1357246800, 1357250400, 
    1357254000)
    , format = "%Y-%m-%d %H:%M:%S"
    , FinCenter = "GMT"
    , recordIDs = structure(list(), .Names = character(0), row.names = integer(0), class = "data.frame")
    , title = "Time Series Object"
    , documentation = "Wed May 20 11:02:09 2015"
    )

To do the averaging, I run the following:

    library(timeSeries)

    ## daily averaging
    bydaily = timeSequence(from = start(tstSeries), to = end(tstSeries), by = "day")
    tstSeries.daily = aggregate(tstSeries, by = bydaily, FUN = mean)

The output I get is:

    > tstSeries.daily
    GMT
                   TS.1
    2013-01-01 31.05000
    2013-01-02 31.82167
    2013-01-03 36.67375

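Here, the first daily average is just the original first data point! I did the same calculation in Excel and confirmed that the first data point is not included in the averaging. Instead, the value reported for 2013-01-02 is the mean over 2013-01-01 01:00 to 2013-01-02 00:00.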

I have seen several examples demonstrating how to use `aggregate()`, but none that raise this issue. Has anyone seen this before, and is there a workaround?
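The reported numbers are consistent with each element of `by` closing an interval that is open on the left and closed on the right, so the 00:00 stamp of 2013-01-01 sits alone in the first group and everything after the last breakpoint is dropped. That is an inference from the output, not a statement about the timeSeries internals. A minimal base-R sketch of that grouping, where `prices`, `pos`, `breaks`, and `grp` are illustrative names and `breaks` is assumed to match what `bydaily` holds:

```r
# Hourly prices copied from the dput() above (72 values).
prices <- c(31.05, 30.47, 28.92, 27.88, 26.96, 27.84, 28.79, 28.63, 28.44,
            28.3, 30.65, 31.55, 32.16, 32.45, 32.63, 33.65, 34.9, 36.22,
            36.65, 36.37, 35.49, 34.41, 34.66, 32.55, 33.15, 32.66, 31.83,
            31.47, 32.56, 34.36, 36.28, 38.39, 39.09, 38.33, 38.42, 38.25,
            37.96, 37.89, 37.88, 38.78, 39.83, 39.91, 39.32, 38.49, 37.46,
            36.94, 36.37, 34.59, 33.11, 32.22, 31.46, 31.67, 32.05, 33.67,
            34.93, 35.82, 36.38, 36.52, 36.71, 36.6, 36.51, 36.4, 36.42,
            36.58, 36.94, 36.94, 36.81, 36.43, 35.91, 35.45, 34.77, 32.09)

# Epoch seconds: hourly stamps starting 2013-01-01 00:00 GMT, as in `positions`.
pos <- 1356998400 + 3600 * (0:71)
# Assumed breakpoints: 00:00 GMT on 2013-01-01, 2013-01-02 and 2013-01-03.
breaks <- 1356998400 + 86400 * (0:2)

# Group i collects the stamps in (breaks[i-1], breaks[i]]: open left, closed right.
grp <- findInterval(pos, breaks, left.open = TRUE) + 1
# Stamps after the last breakpoint belong to no group and are dropped.
keep <- grp <= length(breaks)

daily <- tapply(prices[keep], grp[keep], mean)
print(round(daily, 5))  # 31.05000 31.82167 36.67375, the values reported above
print(table(grp))       # group sizes: 1, 24, 24, and 23 dropped stamps
```

Under this reading, the first group contains only the 00:00 observation, which is why its "mean" equals the raw data point.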

Here is a solution that returns the desired output. It relies on the `apply.rolling` function from the PerformanceAnalytics package.

    library(timeSeries)
    library(PerformanceAnalytics)

    # Mean of each consecutive 24-hour interval.
    tstSeries.daily <- apply.rolling(tstSeries, width = 24, by = 24, FUN = "mean")
    # Remove rows with NAs.
    tstSeries.daily <- tstSeries.daily[complete.cases(tstSeries.daily), ]
    # Remove the time part of the index.
    rownames(tstSeries.daily) <- as.Date(rownames(tstSeries.daily))
    print(tstSeries.daily)
    GMT
                  calcs
    2013-01-01 31.73417
    2013-01-02 36.67542
    2013-01-03 35.09958
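For comparison, the same daily means can be obtained without extra packages by grouping on the calendar date of each timestamp. This is a base-R sketch rather than the method used above; `prices`, `pos`, and `day` are illustrative names, with the values and epoch-second positions taken from the question's `dput()`:

```r
# Hourly prices copied from the question's dput() (72 values).
prices <- c(31.05, 30.47, 28.92, 27.88, 26.96, 27.84, 28.79, 28.63, 28.44,
            28.3, 30.65, 31.55, 32.16, 32.45, 32.63, 33.65, 34.9, 36.22,
            36.65, 36.37, 35.49, 34.41, 34.66, 32.55, 33.15, 32.66, 31.83,
            31.47, 32.56, 34.36, 36.28, 38.39, 39.09, 38.33, 38.42, 38.25,
            37.96, 37.89, 37.88, 38.78, 39.83, 39.91, 39.32, 38.49, 37.46,
            36.94, 36.37, 34.59, 33.11, 32.22, 31.46, 31.67, 32.05, 33.67,
            34.93, 35.82, 36.38, 36.52, 36.71, 36.6, 36.51, 36.4, 36.42,
            36.58, 36.94, 36.94, 36.81, 36.43, 35.91, 35.45, 34.77, 32.09)

# Hourly timestamps starting 2013-01-01 00:00 GMT, rebuilt from the epoch seconds.
pos <- as.POSIXct(1356998400 + 3600 * (0:71), origin = "1970-01-01", tz = "GMT")

# Calendar date (in GMT) of each observation, used as the grouping key.
day <- format(pos, "%Y-%m-%d", tz = "GMT")

# One mean per calendar day; hours 00:00 through 23:00 fall in the same group.
daily <- tapply(prices, day, mean)
print(round(daily, 5))  # 31.73417 36.67542 35.09958, matching the output above
```

Grouping on the date string keeps the 00:00 observation with the rest of its day, which is the behaviour the question was after.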
