
Why does more epochs make my model worse?

Original 2018-07-16 15:47:05 4 1 python/ tensorflow/ machine-learning/ keras/ lstm

Most of my code is based on this article and the issue I'm asking about is evident there, but also in my own testing. It is a sequential model with LSTM layers.

Here is a plotted prediction over real data from a model that was trained with around 20 small data sets for one epoch.

Here is another plot but this time with a model trained on more data for 10 epochs.

What causes this and how can I fix it? Also that first link I sent shows the same result at the bottom - 1 epoch does great and 3500 epochs is terrible.

Furthermore, when I run a training session for the higher data count but with only 1 epoch, I get identical results to the second plot.

What could be causing this issue?

1 answer

A few questions:

Is this graph for training data or validation data?
Do you consider it better because:
- The graph seems cool?
- You actually have a better "loss" value?
  - If so, was it training loss?
  - Or validation loss?

Cool graph

The early graph seems interesting, indeed, but take a close look at it:

I clearly see huge predicted valleys where the expected data should be a peak

Is this really better? It looks like a random wave that is completely out of phase with the real data, which means that a straight line would indeed give a better loss than this.
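The claim that a flat line can beat an out-of-phase wave is easy to check numerically. The sketch below is not the asker's data: it assumes a synthetic sine wave as the "real" series and mean squared error as the loss, and all names in it are made up for illustration.

```python
import numpy as np

# Hypothetical "real" signal: a sine wave sampled over four periods.
t = np.linspace(0.0, 8.0 * np.pi, 400)
real = np.sin(t)

# Prediction 1: a wave of the right shape but half a period out of phase,
# so it shows valleys exactly where the real data peaks.
out_of_phase = np.sin(t + np.pi)

# Prediction 2: a flat line at the mean of the real data.
flat = np.full_like(real, real.mean())


def mse(y_true, y_pred):
    """Mean squared error between two arrays of equal shape."""
    return float(np.mean((y_true - y_pred) ** 2))


mse_wave = mse(real, out_of_phase)
mse_flat = mse(real, flat)
print(f"out-of-phase wave MSE: {mse_wave:.3f}")
print(f"flat line MSE:         {mse_flat:.3f}")
```

On this toy signal the out-of-phase wave scores roughly four times worse than the flat line, even though it "looks" more like the data.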

Take a look at the "training loss"; this is what can reliably tell you whether your model is better or not.

If this is the case and your model isn't reaching the desired output, then you should probably make a more capable model (more layers, more units, a different method, etc.). But be aware that many datasets are simply too random to be learned, no matter how good the model.

Overfitting - Training loss gets better, but validation loss gets worse

In case you actually have a better training loss: OK, so your model is indeed getting better.

Are you plotting training data? - Then this straight line is actually better than a wave out of phase
Are you plotting validation data?
- What is happening with the validation loss? Better or worse?

If your "validation" loss is getting worse, your model is overfitting. It's memorizing the training data instead of learning generally. You need a less capable model, or a lot of "dropout".
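As a rough way to apply this check to recorded loss curves, here is a minimal sketch. The function name `diagnose` and the numbers are hypothetical; the two lists stand in for the per-epoch `"loss"` and `"val_loss"` entries that Keras stores in the history object returned by `fit()`.

```python
def diagnose(train_loss, val_loss):
    """Classify a pair of per-epoch loss curves (lists of floats)."""
    # Epoch (0-based) at which the validation loss was lowest.
    best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)
    # After that epoch, did training loss keep falling while validation rose?
    train_improved = train_loss[-1] < train_loss[best_epoch]
    val_worsened = val_loss[-1] > val_loss[best_epoch]
    return {
        "best_epoch": best_epoch,
        "overfitting": train_improved and val_worsened,
    }


# Made-up numbers for illustration only.
train = [1.00, 0.60, 0.40, 0.30, 0.22, 0.15]
val = [1.10, 0.70, 0.55, 0.58, 0.66, 0.80]
result = diagnose(train, val)
print(result)  # {'best_epoch': 2, 'overfitting': True}
```

This is only a heuristic on noisy curves; a single bad epoch should not be read as overfitting.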

Often, there is an optimal point where the validation loss stops going down, while the training loss keeps going down. This is the point to stop training if you're overfitting. Read about the EarlyStopping callback in the Keras documentation.
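A minimal sketch of how the EarlyStopping callback might be wired in is shown below. The data is synthetic and the layer sizes, `patience` value and epoch limit are arbitrary assumptions, not values from the question; it also assumes TensorFlow's bundled Keras is installed.

```python
import numpy as np
from tensorflow import keras

# Synthetic stand-in data: 200 sequences of 10 time steps with 1 feature.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 10, 1)).astype("float32")
y = x.sum(axis=1)  # target: the sum of each sequence, shape (200, 1)

# A small sequential LSTM model (sizes chosen arbitrarily).
model = keras.Sequential([
    keras.Input(shape=(10, 1)),
    keras.layers.LSTM(16),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# Stop once val_loss has not improved for 3 epochs in a row and
# roll the weights back to the best epoch seen.
early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=3,
    restore_best_weights=True,
)

max_epochs = 30
history = model.fit(
    x, y,
    validation_split=0.2,   # hold out the last 20% of samples for validation
    epochs=max_epochs,
    callbacks=[early_stop],
    verbose=0,
)
epochs_run = len(history.history["val_loss"])
print(f"stopped after {epochs_run} of {max_epochs} epochs")
```

With `restore_best_weights=True` the model ends up with the weights from the epoch that had the lowest validation loss, rather than those from the last epoch run.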

Bad learning rate - Training loss is going up indefinitely

If your training loss is going up, then you've got a real problem there: either a bug, a badly prepared calculation somewhere if you're using custom layers, or simply a learning rate that is too big.

Reduce the learning rate (divide it by 10, or 100), create and compile a "new" model and restart training.
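To see why a learning rate that is too big makes the loss climb, and why dividing it by 10 can fix that, here is a toy sketch. It is plain gradient descent on the made-up loss w**2, not a Keras model; each call starts from a fresh weight, mirroring the advice to restart with a new model. In Keras the rate would typically be set on the optimizer, for example `keras.optimizers.Adam(learning_rate=...)`.

```python
def train(learning_rate, steps=20, w=1.0):
    """Plain gradient descent on loss(w) = w**2; returns the loss at each step."""
    losses = []
    for _ in range(steps):
        losses.append(w * w)
        grad = 2.0 * w                 # derivative of w**2
        w = w - learning_rate * grad   # gradient descent update
    return losses


too_big = train(learning_rate=1.1)    # each update overshoots the minimum
reduced = train(learning_rate=0.11)   # the same rate divided by 10

print(f"lr=1.1 : first loss {too_big[0]:.3g}, last loss {too_big[-1]:.3g}")
print(f"lr=0.11: first loss {reduced[0]:.3g}, last loss {reduced[-1]:.3g}")
```

With the larger rate every step lands further from the minimum than the previous one, so the loss grows without bound; with the reduced rate it shrinks at every step.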

Another problem?

Then you need to detail your question properly.




solution1 3 ACCEPTED 2018-07-16 16:24:43
