
First data point not considered in R timeSeries when averaging using `aggregate()`; how to correctly employ the function?

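I want to build daily averages of hourly electricity prices for the NordPool market. I am using the `aggregate()` method from the timeSeries package to construct daily means from the hourly data, which I have converted to a timeSeries object. Here is the `dput()` of the first 72 hours: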

    > dput(tstSeries)
    new("timeSeries"
        , .Data = structure(c(31.05, 30.47, 28.92, 27.88, 26.96, 27.84, 28.79, 
    28.63, 28.44, 28.3, 30.65, 31.55, 32.16, 32.45, 32.63, 33.65, 
    34.9, 36.22, 36.65, 36.37, 35.49, 34.41, 34.66, 32.55, 33.15, 
    32.66, 31.83, 31.47, 32.56, 34.36, 36.28, 38.39, 39.09, 38.33, 
    38.42, 38.25, 37.96, 37.89, 37.88, 38.78, 39.83, 39.91, 39.32, 
    38.49, 37.46, 36.94, 36.37, 34.59, 33.11, 32.22, 31.46, 31.67, 
    32.05, 33.67, 34.93, 35.82, 36.38, 36.52, 36.71, 36.6, 36.51, 
    36.4, 36.42, 36.58, 36.94, 36.94, 36.81, 36.43, 35.91, 35.45, 
    34.77, 32.09), .Dim = c(72L, 1L), .Dimnames = list(NULL, "TS.1"))
        , units = "TS.1"
        , positions = c(1356998400, 1357002000, 1357005600, 1357009200, 1357012800, 
    1357016400, 1357020000, 1357023600, 1357027200, 1357030800, 1357034400, 
    1357038000, 1357041600, 1357045200, 1357048800, 1357052400, 1357056000, 
    1357059600, 1357063200, 1357066800, 1357070400, 1357074000, 1357077600, 
    1357081200, 1357084800, 1357088400, 1357092000, 1357095600, 1357099200, 
    1357102800, 1357106400, 1357110000, 1357113600, 1357117200, 1357120800, 
    1357124400, 1357128000, 1357131600, 1357135200, 1357138800, 1357142400, 
    1357146000, 1357149600, 1357153200, 1357156800, 1357160400, 1357164000, 
    1357167600, 1357171200, 1357174800, 1357178400, 1357182000, 1357185600, 
    1357189200, 1357192800, 1357196400, 1357200000, 1357203600, 1357207200, 
    1357210800, 1357214400, 1357218000, 1357221600, 1357225200, 1357228800, 
    1357232400, 1357236000, 1357239600, 1357243200, 1357246800, 1357250400, 
    1357254000)
        , format = "%Y-%m-%d %H:%M:%S"
        , FinCenter = "GMT"
        , recordIDs = structure(list(), .Names = character(0), row.names = integer(0), class = "data.frame")
        , title = "Time Series Object"
        , documentation = "Wed May 20 11:02:09 2015"
    )

To do the averaging, I run the following:

    library(timeSeries)

    ## daily averaging
    bydaily = timeSequence(from = start(tstSeries), to = end(tstSeries), by = "day")
    tstSeries.daily = aggregate(tstSeries, by = bydaily, FUN = mean)

The output I get is:

    > tstSeries.daily
    GMT
                   TS.1
    2013-01-01 31.05000
    2013-01-02 31.82167
    2013-01-03 36.67375

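Here, the first daily average is just the original data point! I ran the same calculation in Excel and confirmed that the first data point is not considered in the averaging; instead, the average for 2013-01-02 is computed over 2013-01-01 01:00 to 2013-01-02 00:00.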
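The window shift described above can be checked by hand. The following is a minimal base R sketch (it does not use timeSeries, and the variable names are only illustrative) that takes the first 25 hourly prices from the `dput()` output and compares the shifted 24-hour window with the calendar-day window:

```r
# First 25 hourly prices from the dput() above
# (2013-01-01 00:00 through 2013-01-02 00:00).
first25 <- c(31.05, 30.47, 28.92, 27.88, 26.96, 27.84, 28.79,
             28.63, 28.44, 28.3, 30.65, 31.55, 32.16, 32.45, 32.63, 33.65,
             34.9, 36.22, 36.65, 36.37, 35.49, 34.41, 34.66, 32.55, 33.15)

# Window shifted by one hour: 2013-01-01 01:00 .. 2013-01-02 00:00.
shifted_mean <- mean(first25[2:25])

# Calendar-day window: 2013-01-01 00:00 .. 2013-01-01 23:00.
calendar_mean <- mean(first25[1:24])

print(shifted_mean)   # 31.82167, the value reported for 2013-01-02
print(calendar_mean)  # 31.73417, the daily mean that was expected
```

The shifted window reproduces the 31.82167 that `aggregate()` reported, which is consistent with each `by` position closing a window that ends at that timestamp rather than opening a new calendar day.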

I have seen several examples demonstrating how to use `aggregate()`, but none that raise this issue. Has anyone seen this behavior, and is there a workaround?

Here is a solution that returns the desired output. It relies on the `apply.rolling` function from the PerformanceAnalytics package.

    library(PerformanceAnalytics)

    # mean of each consecutive 24-hour block
    tstSeries.daily <- apply.rolling(tstSeries, width = 24, by = 24, FUN = "mean")
    # remove rows with NAs
    tstSeries.daily <- tstSeries.daily[complete.cases(tstSeries.daily), ]
    # drop the time part of the index
    rownames(tstSeries.daily) <- as.Date(rownames(tstSeries.daily))
    print(tstSeries.daily)
    GMT 
                  calcs
    2013-01-01 31.73417
    2013-01-02 36.67542
    2013-01-03 35.09958
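If an extra package is not wanted, the same daily means can also be obtained by grouping on the calendar date. This is a minimal base R sketch, not the method from the answer above: it rebuilds the prices and timestamps as plain vectors (`prices`, `times`, and `daily` are illustrative names) and assumes the series is hourly and in GMT, as in the `dput()` output.

```r
# The 72 hourly prices from the dput() in the question.
prices <- c(31.05, 30.47, 28.92, 27.88, 26.96, 27.84, 28.79,
            28.63, 28.44, 28.3, 30.65, 31.55, 32.16, 32.45, 32.63, 33.65,
            34.9, 36.22, 36.65, 36.37, 35.49, 34.41, 34.66, 32.55, 33.15,
            32.66, 31.83, 31.47, 32.56, 34.36, 36.28, 38.39, 39.09, 38.33,
            38.42, 38.25, 37.96, 37.89, 37.88, 38.78, 39.83, 39.91, 39.32,
            38.49, 37.46, 36.94, 36.37, 34.59, 33.11, 32.22, 31.46, 31.67,
            32.05, 33.67, 34.93, 35.82, 36.38, 36.52, 36.71, 36.6, 36.51,
            36.4, 36.42, 36.58, 36.94, 36.94, 36.81, 36.43, 35.91, 35.45,
            34.77, 32.09)

# Hourly timestamps starting at 2013-01-01 00:00:00 GMT
# (1356998400 is the first entry of `positions` in the dput()).
times <- as.POSIXct(1356998400 + 3600 * (0:71),
                    origin = "1970-01-01", tz = "GMT")

# Group each observation by its calendar date and average within each day.
daily <- tapply(prices, as.Date(times, tz = "GMT"), mean)
print(round(daily, 5))
```

Because the grouping key is the date of each observation, the 00:00 price belongs to the day it starts, and the three results match the `apply.rolling` output (31.73417, 36.67542, 35.09958). Unlike the fixed `width = 24` approach, grouping by date would also tolerate days with missing hours, at the cost of averaging over fewer points on those days.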
