简体   繁体   中英

R fitting and forecasting daily time series

I am working with a daily time serie and I need to build a forecast for 90 days (or maybe more) based on my history - The current time serie has roughly 298 data points.

The issue I have is the famous flat line in the final forecast - and yes I might not have a seasonality but I am trying to work this out. Another issue is how to find the best model and adapt it from here on for this kind of behaviour.

I created a test case to investigate this further and any help is appreciated.

Thanks,

To start with

x <- day_data  # My time serie
z <- 90        # Days to forecast

low_bound_date <- as.POSIXlt(min(x$time), format = "%m/%d/%Y") # oldest date in the DF.

> low_bound_date
[1] "2015-12-21 PST"

low_bound_date$yday 
> low_bound_date$yday  # Day in Julian
[1] 354

lbyear <- as.numeric(substr(low_bound_date, 1, 4))
> lbyear
[1] 2015

This is my time serie content

> ts
Time Series:
Start = c(2065, 4) 
End = c(2107, 7) 
Frequency = 7 
  [2] 20.73 26.19 27.51 26.11 26.28 27.58 26.84 27.00 26.30 28.75 28.43 39.03 41.36 45.42 44.80 45.33 47.79 44.70 45.17
 [20] 34.90 32.54 32.75 33.35 34.76 34.11 33.59 33.60 38.08 30.45 29.66 31.09 31.36 31.96 29.30 30.04 30.85 31.13 25.09
 [39] 17.88 23.73 25.31 31.30 35.18 34.13 34.96 35.12 27.36 38.33 38.59 38.14 38.54 41.72 37.15 35.92 37.37 32.39 30.64
 [58] 30.57 30.66 31.16 31.50 30.68 32.21 32.27 32.55 33.61 34.80 33.53 33.09 20.90  6.91  7.82 15.78  7.25  6.19  6.38
 [77] 38.06 39.82 35.53 38.63 41.91 39.76 37.26 38.79 37.74 35.61 39.70 35.79 35.36 29.63 22.07 35.39 35.99 37.35 38.82
 [96] 25.80 21.31 18.85  9.52 20.75 36.83 44.12 37.79 34.45 36.05 16.39 21.84 31.39 34.26 31.50 30.87 28.88 42.83 41.52
[115] 42.34 47.35 44.47 44.10 44.49 26.89 18.17 40.44 43.93 41.56 39.98 40.31 40.59 40.17 40.22 40.50 32.68 35.89 36.06
[134] 34.30 22.67 12.56 13.29 12.34 28.00 35.27 36.57 33.78 32.15 33.58 34.62 30.96 32.06 33.05 30.66 32.47 30.42 32.83
[153] 31.74 29.39 22.39 12.58 16.46  5.36  4.01 15.32 32.79 31.66 32.02 27.60 31.47 31.61 34.96 27.77 31.91 33.94 33.43
[172] 26.94 28.38 21.42 24.51 23.82 31.71 26.64 27.96 29.29 29.25 28.70 27.02 27.62 30.90 27.46 27.37 26.46 27.77 13.61
[191]  5.87 12.18  5.68  4.15  4.35  4.42 16.42 25.18 26.06 27.39 27.57 28.86 15.18  5.19  5.61  8.28  7.78  5.13  4.90
[210]  5.02  5.27 16.31 25.01 26.19 25.96 24.93 25.53 25.56 26.39 26.80 26.73 26.00 25.61 25.90 25.89 13.80  6.66  6.41
[229]  5.28  5.64  5.71  5.38  5.76  7.20  7.27  5.55  5.31  5.94  5.75  5.93  5.77  6.57  5.52  5.51  5.47  5.69 19.75
[248] 29.22 30.75 29.63 30.49 29.48 31.83 30.42 29.27 30.40 29.91 32.00 30.09 28.93 14.54  7.75  5.63 17.17 22.27 24.93
[267] 35.94 37.42 33.13 25.88 24.27 37.64 37.42 38.33 35.20 21.32  7.32  4.81  5.17 17.49 23.77 23.36 27.60 26.53 24.99
[286] 24.22 23.76 24.10 24.22 27.06 25.53 23.40 37.07 26.52 25.19 28.02 28.53 26.67

First step, I get my data in ts

day_data_ts <- ts(x$avg_day, start = c(lbyear,low_bound_date$yday), frequency=7)

plot(day_data_ts)

plot_ts

acf(day_data_ts)

acf_ts

Second step, I get my data in msts

day_data_msts <- msts(x$avg_day, seasonal.periods=c(7,365.25), start = c(lbyear,low_bound_date$yday))

plot(day_data_msts)

acf(day_data_msts)

I did several fitting iterations to try and figure out the best fit and forecast model.

First fitting test is with the ts only.

fit1 <- HoltWinters(day_data_ts)
> fit1
    Holt-Winters exponential smoothing with trend and additive seasonal component.
    Call: HoltWinters(x = day_data_ts)
    Smoothing parameters: alpha: 1   beta : 0.006757112  gamma: 0

    Coefficients:
             [,1]
    a  28.0922449
    b   0.1652477
    s1  0.6241837
    s2  1.9084694
    s3  0.9913265
    s4  0.8198980
    s5 -1.7015306
    s6 -1.2201020
    s7 -1.4222449


fit2 <- tbats(day_data_ts)
> fit2
    BATS(1, {0,0}, 0.8, -)
    Parameters:   Alpha: 1.309966     Beta: -0.3011143    Damping Parameter: 0.800001
    Seed States:
              [,1]
    [1,] 15.282259
    [2,]  2.177787
    Sigma: 5.501356     AIC: 2723.911


fit3 <- ets(day_data_ts)
> fit3
    ETS(A,N,N) 
      Smoothing parameters: alpha = 0.9999 
      Initial states:       l = 25.2275 
      sigma:  5.8506
         AIC     AICc      BIC 
    2756.597 2756.678 2767.688 


fit4 <- auto.arima(day_data_ts)
> fit4
    ARIMA(1,1,2)                    
    Coefficients:
             ar1      ma1      ma2
          0.7396  -0.6897  -0.2769
    s.e.  0.0545   0.0690   0.0621
    sigma^2 estimated as 30.47:  log likelihood=-927.9
    AIC=1863.81   AICc=1863.94   BIC=1878.58

Second test is using msts . I also changed the ets model to MAM .

fit5 <- tbats(day_data_msts)
> fit5
    BATS(1, {0,0}, 0.8, -)
    Parameters:   Alpha: 1.309966     Beta: -0.3011143    Damping Parameter: 0.800001
    Seed States:
              [,1]
    [1,] 15.282259
    [2,]  2.177787
    Sigma: 5.501356     AIC: 2723.911


fit6 <- ets(day_data_msts, model="MAN")
> fit6
    ETS(M,A,N) 
      Smoothing parameters:     alpha = 0.9999      beta  = 9e-04 
      Initial states:           l = 52.8658         b = 3.9184 
      sigma:  0.3459
         AIC     AICc      BIC 
    3042.744 3042.949 3061.229 


fit7 <- auto.arima(day_data_msts)
> fit7
    ARIMA(1,1,2)                    
    Coefficients:
             ar1      ma1      ma2
          0.7396  -0.6897  -0.2769
    s.e.  0.0545   0.0690   0.0621
    sigma^2 estimated as 30.47:  log likelihood=-927.9
    AIC=1863.81   AICc=1863.94   BIC=1878.58

You can forecast on previously estimated model as follows (use built in timeseries LakeHuron ):

library(forecast)
y <- LakeHuron
tsdisplay(y)
# estimate ARMA(1,1)
mod_2 <- Arima(y, order = c(1, 0, 1))
#make forecast for 5 periods (years in this case)
fHuron <- forecast(mod_2, h = 5)
#show results in table
fHuron
#plot results
plot(fHuron)

This will give you: 在此处输入图片说明 Pay attention that ARIMA model bases its forecast on previous values, so if we make prediction on many periods the model will use already predicted values to predict next. Which will reduce accuracy.

To fit optimal ARIMA model use this function:

library(R.utils) #for the function 'withTimeout'
fitARIMA<-function(timeseriesObject, timout)
{
    final.aic <- Inf
    final.order <- c(0,0,0)
    for (p in 0:5) for (q in 0:5) {
        if ( p == 0 && q == 0) {
        next
        }

        arimaFit = tryCatch( 
        withTimeout(arima(timeseriesObject
                            ,order=c(p, 0, q))
                    ,timeout = timeout)
        ,error=function( err ) FALSE
        ,warning=function( err ) FALSE )

        if( !is.logical( arimaFit ) ) {
        current.aic <- AIC(arimaFit)
        if (current.aic < final.aic) {
            final.aic <- current.aic
            final.order <- c(p, 0, q)
            final.arima <- arima(timeseriesObject, order=final.order)
        }
        } else {
        next
        }
    }
    final.order<-c(final.order,final.aic)
    final.order
}

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM