
How to get predictions for new data with an LSTM in Python

This is a univariate time series prediction problem. As the following code shows, I divide the initial data into a train dataset (trainX) and a test dataset (testX), then create an LSTM network with Keras and train the model on the train dataset. However, to get a prediction I have to pass in the test values, so my question is: why do I have to predict when I already know the true values, which are the test dataset in this problem? What I want is the predicted value for a future time. If I have misunderstood something about LSTM networks, please tell me.

Thank you!

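from keras.models import Sequential
from keras.layers import LSTM, Dense

# create and fit the LSTM network
# trainX is expected to have shape (samples, 1, look_back)
model = Sequential()
model.add(LSTM(4, input_shape=(1, look_back)))
model.add(Dense(1))  # single output: the next value of the series
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(trainX, trainY, epochs=100, batch_size=1, verbose=2)
# make predictions
trainPredict = model.predict(trainX)
testPredict = model.predict(testX)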

Since we don't have the future values while training the model, we divide the data into train and test sets and treat the test set as if it were the future values. We train the model on the train set (and usually also a validation set). After the model is trained, we run it on the test set and compare its predictions with the known values to check the model's performance.
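To make that evaluation step concrete, here is a minimal sketch with toy data. It assumes a chronological hold-out of the last three points, and it uses a naive "repeat the previous value" baseline as a stand-in for model.predict(testX); the names n_test and baseline_predict are only illustrative and are not from the question.

```python
import numpy as np

# Toy series standing in for the real data (assumption for illustration)
series = np.arange(1.0, 11.0)  # 1.0, 2.0, ..., 10.0

# Chronological split: the last n_test points play the role of "the future"
n_test = 3
train, test = series[:-n_test], series[-n_test:]

# Stand-in for model.predict(testX): predict each test value as the
# value observed one step earlier (a persistence baseline, not an LSTM)
baseline_predict = series[-n_test - 1:-1]

# Compare predictions with the known test values
rmse = np.sqrt(np.mean((test - baseline_predict) ** 2))
print("test RMSE:", rmse)
```

The known test values are only used to score the predictions; that is the reason for predicting values you already have.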

Why do I have to predict, since I already know the true values (the test dataset in this problem)? What I want is the predicted value for a future time.

In ML, we give the model test data X and it returns Y. Time series can mislead a beginner a bit, because the input X and the output Y apparently come from the same series. The difference is that X holds old values of the time series and Y is a value of the same series at a later time, so we are predicting the future (the same setup can be applied to the present or even the past), as you have correctly identified.
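A small sketch of how X and Y can both be cut from the same series. The helper name create_dataset and the toy numbers are assumptions for illustration; the reshape at the end matches the input_shape=(1, look_back) used in the question.

```python
import numpy as np

def create_dataset(series, look_back=1):
    """Slice a 1-D series into (X, Y) pairs.

    Each row of X holds look_back consecutive past values,
    and the matching entry of Y is the value that comes next.
    """
    X, Y = [], []
    for i in range(len(series) - look_back):
        X.append(series[i:i + look_back])
        Y.append(series[i + look_back])
    return np.array(X), np.array(Y)

series = np.array([10, 20, 30, 40, 50, 60], dtype=float)
X, Y = create_dataset(series, look_back=3)

# Reshape to (samples, 1, look_back) for an LSTM with input_shape=(1, look_back)
X_lstm = X.reshape((X.shape[0], 1, X.shape[1]))
print(X)
print(Y)
```

Here the first row of X is the three oldest values and the first entry of Y is the value right after them, so the "input" and "output" differ only in time.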

(PS: If all you want is to learn machine learning, I would recommend starting with simple regression and coming to LSTMs later.)

I think the correct term in this context is 'Forecasting'.

A good explanation is: after you train and test your model on the data you already have (as the other answers here explain), you want to predict future data, which is, I think, the truly interesting thing about recurrent networks.

To do this, start predicting from one time step (for example, one day) after the final date in your original dataset, using the model trained on that past data. Once you have predicted this value, repeat the process with an input window that now includes the value you just predicted, and so on.

Because you are using predictions to make further predictions, errors accumulate and it is much harder to get good results, so it is common to forecast only short ranges of time.

The exact code you need could vary, but I think that is the main concept.

The last part of the article linked below performs a forecast; the author shows the code and explains how he did it.
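One possible sketch of that loop, not necessarily how the linked article does it. The function forecast and its parameters are hypothetical names; predict_fn stands for anything that behaves like a trained model's predict method on input of shape (1, 1, look_back), and the mean_predictor below is only a stand-in so the example runs without training a network.

```python
import numpy as np

def forecast(predict_fn, history, look_back, n_steps):
    """Recursively forecast n_steps values, feeding each prediction back in."""
    window = list(history[-look_back:])  # most recent known values
    predictions = []
    for _ in range(n_steps):
        # One sample, one timestep, look_back features
        x = np.array(window, dtype=float).reshape(1, 1, look_back)
        next_value = float(np.asarray(predict_fn(x)).ravel()[0])
        predictions.append(next_value)
        # Slide the window: drop the oldest value, append the prediction
        window = window[1:] + [next_value]
    return predictions

def mean_predictor(x):
    # Stand-in for model.predict (assumption): returns the window mean
    return x.mean(axis=2)

future = forecast(mean_predictor, [1.0, 2.0, 3.0, 4.0], look_back=2, n_steps=3)
print(future)
```

With a trained Keras model you would pass model.predict as predict_fn. If the series was scaled before training, the forecasts come out in the scaled units and need the inverse transform applied.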

https://towardsdatascience.com/time-series-forecasting-with-recurrent-neural-networks-74674e289816

I guess that's it.
