Most of my code is based on this article and the issue I'm asking about is evident there, but also in my own testing. It is a sequential model with LSTM layers.
Here is a plotted prediction over real data from a model that was trained with around 20 small data sets for one epoch.
Here is another plot but this time with a model trained on more data for 10 epochs.
What causes this and how can I fix it? Also that first link I sent shows the same result at the bottom - 1 epoch does great and 3500 epochs is terrible.
Furthermore, when I run a training session for the higher data count but with only 1 epoch, I get identical results to the second plot.
What could be causing this issue?
A few questions:
The early graph seems interesting, indeed, but take a close look at it:
I clearly see huge predicted valleys where the expected data should be a peak.
Is this really better? It looks like a random wave that is completely out of phase, meaning that a straight line would indeed give a better loss than this.
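To make that point concrete, here is a minimal sketch comparing the mean squared error of an out-of-phase wave against a flat line. The sine-wave target and the 100 sample points are made up for illustration; they are not your data.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical target: one full period of a sine wave, sampled at 100 points.
n = 100
target = [math.sin(2 * math.pi * i / n) for i in range(n)]

# Prediction A: the same wave shifted by half a period,
# so it has valleys exactly where the target has peaks.
out_of_phase = [math.sin(2 * math.pi * i / n + math.pi) for i in range(n)]

# Prediction B: a flat line at the mean of the target.
mean_value = sum(target) / n
flat_line = [mean_value] * n

loss_out_of_phase = mse(target, out_of_phase)
loss_flat = mse(target, flat_line)

print(f"out-of-phase wave MSE: {loss_out_of_phase:.3f}")
print(f"flat line MSE:         {loss_flat:.3f}")
```

In this toy case the out-of-phase wave has about four times the loss of the flat line, so an optimizer that moves from the wavy prediction toward a straight line is actually improving the loss.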
Take a look at the "training loss"; this is what can tell you whether your model is getting better or not.
If the training loss is not improving and your model isn't reaching the desired output, then you should probably make a more capable model (more layers, more units, a different method, etc.). But be aware that many datasets are simply too random to be learned, no matter how good the model.
If you actually have a better training loss, then your model is indeed getting better.
If your "validation" loss is getting worse, your model is overfitting. It's memorizing the training data instead of learning generally. You need a less capable model, or a lot of "dropout".
Often, there is an optimal point where the validation loss stops going down, while the training loss keeps going down. This is the point to stop training if you're overfitting. Read about the EarlyStopping callback in the Keras documentation.
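The stopping rule described above can be sketched in plain Python. This is only an illustration of the idea, not the Keras implementation; the function name `early_stopping_epoch` and the loss values are hypothetical. In Keras itself, the `EarlyStopping` callback exposes similar knobs (`monitor`, `patience`, `min_delta`, `restore_best_weights`).

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch) as 0-based indices into val_losses.

    Training stops at the first epoch where the validation loss has failed to
    improve on the best value by more than min_delta for `patience` epochs in
    a row. If that never happens, stop_epoch is the last epoch.
    """
    best_loss = float("inf")
    best_epoch = 0
    epochs_without_improvement = 0

    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # New best validation loss: remember it and reset the counter.
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch

    return len(val_losses) - 1, best_epoch

# Made-up validation losses: they improve until epoch 3, then get worse.
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61, 0.70]
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=3)
print(f"stop at epoch {stop_epoch}, best weights were at epoch {best_epoch}")
```

The gap between the stopping epoch and the best epoch is why restoring the best weights matters: by the time the rule fires, the model has already trained past its best validation point.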
If your training loss is going up, then you've got a real problem: either a bug, a badly prepared calculation somewhere if you're using custom layers, or simply a learning rate that is too big.
Reduce the learning rate (divide it by 10, or 100), create and compile a "new" model and restart training.
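The checks above can be summarized as a small helper that looks at the per-epoch loss histories (in Keras these would typically come from `history.history["loss"]` and `history.history["val_loss"]` after `model.fit` with validation data). The function name `diagnose`, the `tolerance` threshold, and the labels are assumptions made for this sketch; treat it as a rough heuristic, not a definitive test.

```python
def diagnose(train_losses, val_losses, tolerance=1e-3):
    """Classify a training run from its per-epoch loss histories.

    Returns one of: "diverging", "underfitting", "overfitting", "improving".
    Heuristic only: it compares the first and last training loss, and the
    last validation loss against the best validation loss seen.
    """
    train_change = train_losses[-1] - train_losses[0]
    best_val = min(val_losses)
    val_rebound = val_losses[-1] - best_val

    if train_change > tolerance:
        # Training loss went up: suspect a bug or a learning rate that is too big.
        return "diverging"
    if abs(train_change) <= tolerance:
        # Training loss did not move: the model may not be capable enough,
        # or the data may be too random to learn.
        return "underfitting"
    if val_rebound > tolerance:
        # Training loss went down but validation loss rose after its best
        # point: consider dropout, a smaller model, or early stopping.
        return "overfitting"
    return "improving"

# Made-up loss histories for each situation.
print(diagnose([1.0, 1.5, 2.0], [1.0, 1.5, 2.0]))
print(diagnose([1.0, 1.0, 1.0], [1.0, 1.0, 1.0]))
print(diagnose([1.0, 0.5, 0.2], [1.0, 0.6, 0.9]))
print(diagnose([1.0, 0.5, 0.2], [1.0, 0.6, 0.4]))
```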
Then you need to detail your question properly.