
Why do more epochs make my model worse?

Most of my code is based on this article, and the issue I'm asking about is evident there as well as in my own testing. It is a sequential model with LSTM layers.

Here is a plotted prediction over real data from a model that was trained with around 20 small data sets for one epoch.

Here is another plot but this time with a model trained on more data for 10 epochs.


What causes this, and how can I fix it? The article I linked shows the same result at the bottom: 1 epoch does great and 3500 epochs is terrible.

Furthermore, when I run a training session on the larger data set but with only 1 epoch, I get results identical to the second plot.

What could be causing this issue?

A few questions:

  • Is this graph for training data or validation data?
  • Do you consider it better because:
    • The graph seems cool?
    • You actually have a better "loss" value?
      • If so, was it training loss?
      • Or validation loss?

Cool graph

The early graph seems interesting, indeed, but take a close look at it:

I clearly see huge predicted valleys where the expected data should be a peak

Is this really better? It looks like a random wave that is completely out of phase, meaning that a straight line would actually represent a better loss than this.

Take a look at the "training loss"; that is what can actually tell you whether your model is better or not.

If this is the case and your model isn't reaching the desired output, then you should probably make a more capable model (more layers, more units, a different method, etc.). But be aware that many datasets are simply too random to be learned, no matter how good the model.
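As a toy illustration of model capacity (a hypothetical example, not the question's LSTM setup): a model that is too simple cannot reach a low training loss no matter how long it trains, while a higher-capacity one can. Here a straight line versus a cubic polynomial stand in for "less capable" and "more capable" models:

```python
import numpy as np

# Toy data: a cubic signal the model must learn
# (illustrative only, not the question's time-series data).
x = np.linspace(-1, 1, 50)
y = x**3 - 0.5 * x

def training_loss(degree):
    """Mean squared error of a least-squares polynomial fit."""
    coeffs = np.polyfit(x, y, degree)
    pred = np.polyval(coeffs, x)
    return np.mean((pred - y) ** 2)

loss_small = training_loss(1)  # low-capacity model: a straight line
loss_big = training_loss(3)    # higher-capacity model: a cubic

# The bigger model can represent the target, so its training loss
# is far lower; the line's loss stays stuck above zero.
print(loss_big < loss_small)  # True
```

The same logic applies to the LSTM: if the training loss plateaus above what you need, adding layers or units raises capacity, but a data set that is essentially noise will stay unlearnable at any capacity.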

Overfitting - Training loss gets better, but validation loss gets worse

If you actually have a better training loss, then your model is indeed getting better.

  • Are you plotting training data? - Then this straight line is actually better than a wave out of phase
  • Are you plotting validation data?
    • What is happening with the validation loss? Better or worse?

If your "validation" loss is getting worse, your model is overfitting. It's memorizing the training data instead of learning to generalize. You need a less capable model, or a lot of "dropout".

Often, there is an optimal point where the validation loss stops going down while the training loss keeps going down. This is the point to stop training if you're overfitting. Read about the EarlyStopping callback in the Keras documentation.
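The stop-at-the-optimal-point idea can be sketched in plain Python. This is a simplified stand-in for what Keras's `EarlyStopping(patience=...)` callback does, with a made-up validation-loss curve:

```python
def early_stop_epoch(val_losses, patience=2):
    """Return the epoch with the best validation loss, stopping once the
    loss fails to improve for `patience` consecutive epochs.
    A simplified stand-in for Keras's EarlyStopping callback."""
    best = float("inf")
    best_epoch = 0
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                break  # with restore_best_weights, Keras rolls back here
    return best_epoch

# Hypothetical validation-loss history: improves, then overfits.
history = [1.0, 0.7, 0.5, 0.45, 0.48, 0.55, 0.70]
print(early_stop_epoch(history))  # 3: the epoch where val loss bottomed out
```

In real Keras code you would pass `EarlyStopping(monitor="val_loss", patience=..., restore_best_weights=True)` to `model.fit(..., callbacks=[...])` instead of implementing this yourself.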

Bad learning rate - Training loss is going up indefinitely

If your training loss is going up, then you've got a real problem: either a bug, a badly prepared calculation somewhere (if you're using custom layers), or simply a learning rate that is too big.

Reduce the learning rate (divide it by 10, or 100), create and compile a "new" model, and restart training.
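A minimal sketch of why this works, using plain gradient descent on the toy function f(w) = w² rather than the question's model: too large a step overshoots the minimum and the loss explodes, while dividing the rate by 10 makes it converge.

```python
def final_loss(lr, steps=50, w=1.0):
    """Run gradient descent on f(w) = w**2 (gradient is 2*w)
    and return the final loss value."""
    for _ in range(steps):
        w -= lr * 2 * w  # each update scales w by (1 - 2*lr)
    return w * w

diverged = final_loss(lr=1.1)    # step overshoots: |w| grows every update
converged = final_loss(lr=0.11)  # same rate divided by 10: loss -> 0

print(diverged > 1.0)    # True: the loss blew up
print(converged < 1e-6)  # True: the loss vanished
```

The exact threshold depends on the loss surface, but the qualitative behavior is the same for a real network: a rate past the stable range makes each update overshoot more than the last, so the training loss climbs instead of falling.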

Another problem?

Then you need to detail your question properly.
