Predict Future values by using OPERA package in R

I have been trying to understand opera ("Online Prediction by Expert Aggregation") by Pierre Gaillard and Yannig Goude. I read two posts, one by Pierre Gaillard ( http://pierre.gaillard.me/opera.html ) and one by Rob Hyndman ( https://robjhyndman.com/hyndsight/forecast-combinations/ ). However, I do not understand how to predict future values. In Pierre's example, newY = Y is the test data set (Y <- data_test$Load), which contains weekly observations of the French electric load. As shown below, the data ends in December 2009. How can I forecast, say, the 2010 values? What would newY be in that case?

> tail(electric_load,5)
    Time Day Month Year   NumWeek     Load    Load1      Temp     Temp1  IPI IPI_CVS
727  727  30    11 2009 0.9056604 63568.79 58254.42  7.220536 10.163839 91.3    88.4
728  728   7    12 2009 0.9245283 63977.13 63568.79  6.808929  7.220536 90.1    87.7
729  729  14    12 2009 0.9433962 78046.85 63977.13 -1.671280  6.808929 90.1    87.7
730  730  21    12 2009 0.9622642 66654.69 78046.85  4.034524 -1.671280 90.1    87.7
731  731  28    12 2009 0.9811321 60839.71 66654.69  7.434115  4.034524 90.1    87.7

I noticed that multiplying the weights of MLpol0 by X gives the same values as the online predictions.

> weights <- predict(MLpol0, X, Y, type='weights')
> w<-weights[,1]*X[,1]+weights[,2]*X[,2]+weights[,3]*X[,3]
> predValues <- predict(MLpol0, newexpert = X, newY = Y, type='response')


    Test_Data predValues        w
620  65564.29   65017.11 65017.11
621  62936.07   62096.12 62096.12
622  64953.83   64542.44 64542.44
623  61580.44   60447.63 60447.63
624  71075.52   67622.97 67622.97
625  75399.88   72388.64 72388.64
626  65410.13   67445.63 67445.63
627  65815.15   62623.64 62623.64
628  65251.90   64271.97 64271.97
629  63966.91   61803.77 61803.77
630  64893.42   65793.14 65793.14
631  69226.32   67153.80 67153.80

But I am still not sure how to generate weights without newY. Maybe we can use the final coefficients that MLpol outputs to predict future values?

 (c<-summary(MLpol <- mixture(Y = Y, experts = X, model = "MLpol", loss.type = "square"))$coefficients)
[1] 0.585902 0.414098 0.000000

I am sorry I may be way off on this and my question may not make sense at all, but I really appreciate any help/insight.

The idea of the opera package is a bit different from classical batch machine learning methods with a training set and a testing set. The goal is to make sequential predictions:

At each round t = 1, ..., n:

1) the algorithm receives the experts' predictions for round t;
2) it makes a prediction for this time step by combining the experts' predictions with the current weights;
3) it observes the true output for round t and uses it to update the weights for the next round.
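The three steps above can be sketched in base R. This is only an illustration of the protocol, not opera's code: it uses a plain exponentially weighted average as a stand-in for MLpol, and the data, the expert names, and the learning rate `eta` are all made up for the example.

```r
# Toy sequential aggregation loop (exponentially weighted average,
# NOT the MLpol rule used by opera). All data here is synthetic.
set.seed(1)
n <- 50                                          # number of rounds
y <- sin(seq_len(n) / 5) + rnorm(n, sd = 0.1)    # observations, revealed one at a time
experts <- cbind(good   = y + rnorm(n, sd = 0.1),  # accurate expert
                 biased = y + 1,                   # expert with a constant bias
                 noisy  = y + rnorm(n, sd = 1))    # very noisy expert
K <- ncol(experts)
eta <- 1                                         # learning rate (arbitrary choice)

w <- rep(1 / K, K)                               # start from uniform weights
pred <- numeric(n)                               # aggregated predictions
W <- matrix(NA_real_, n, K,
            dimnames = list(NULL, colnames(experts)))  # weights used at each round

for (t in seq_len(n)) {
  W[t, ] <- w                                    # weights depend only on rounds 1..t-1
  pred[t] <- sum(w * experts[t, ])               # steps 1-2: combine expert predictions
  loss <- (experts[t, ] - y[t])^2                # step 3: y[t] is observed after predicting
  w <- w * exp(-eta * loss)                      # down-weight experts with large loss
  w <- w / sum(w)                                # renormalise so the weights sum to 1
}

round(tail(W, 3), 4)                             # weights used in the last rounds
```

Note that `pred[t]` is computed before `y[t]` is touched, which is why providing the observations to an online algorithm does not amount to cheating. After the loop, `w` holds the final weights, which are the only thing available for forecasting rounds whose outputs are not yet known.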

If you have out-of-sample forecasts (i.e., expert forecasts for future values without the observed outputs), the best you can do is to take the last coefficients and use them to make a prediction:

    newexperts %*% model$coefficients

In practice, you may also want to use the averaged coefficients. You can also obtain the same result by using

    predict(object,                  # for example, a model built with mixture(model = 'FS', loss.type = 'square')
            newexperts = newexperts, # matrix of out-of-sample expert predictions
            online = FALSE,
            type = 'response')

With online = FALSE, the model does not need any newY and is not updated. When you do provide newY, the algorithm does not cheat: it does not use the value at round t to make the prediction at round t. The values of newY are only used to update the coefficients step by step, as if the predictions were made sequentially.
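As a concrete illustration of the fixed-weight forecast, here is a minimal base R sketch. The coefficients are the ones printed in the question; the matrix `newexperts` and its column names are hypothetical, with made-up numbers standing in for the experts' forecasts of three future weeks.

```r
# Final coefficients reported in the question by summary(MLpol)$coefficients
coefs <- c(0.585902, 0.414098, 0)

# Hypothetical expert forecasts for 3 future weeks (one column per expert);
# these numbers are invented purely for illustration
newexperts <- matrix(c(64000, 65500, 63000,
                       66000, 64800, 67500,
                       61000, 62500, 60200),
                     nrow = 3, byrow = TRUE,
                     dimnames = list(NULL, c("expert1", "expert2", "expert3")))

# Out-of-sample forecast: a fixed convex combination of the expert forecasts
forecast <- drop(newexperts %*% coefs)
forecast
```

Because the third coefficient is zero and the other two sum to one, each forecast lies between the first two experts' values for that week. No newY is involved, and the weights stay fixed until new observations arrive and the model can be updated.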

I hope this helped.
