Why do more epochs make my model worse?
Most of my code is based on this article, and the issue I'm asking about is evident there as well as in my own testing. It is a sequential model with LSTM layers.
Here is a plotted prediction over the real data, from a model trained on around 20 small data sets for one epoch.
Here is another plot, this time from a model trained on more data for 10 epochs.
What causes this, and how can I fix it? The first link I posted shows the same result at the bottom: 1 epoch does great and 3500 epochs is terrible.
Furthermore, when I train on the larger data set but for only 1 epoch, I get results identical to the second plot.
What could be causing this issue?
A few questions:
The early graph does look interesting, but take a close look at it:
I clearly see huge predicted valleys where the expected data shows a peak.
Is this really better? It looks like a random wave that is completely out of phase, meaning that a flat straight line would actually achieve a lower loss than this.
Take a look at the "training loss"; this is what can reliably tell you whether your model is getting better or not.
If this is the case and your model isn't reaching the desired output, then you should probably build a more capable model (more layers, more units, a different method, etc.). But be aware that many datasets are simply too random to be learned, no matter how good the model is.
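As a rough sketch of what "a more capable model" could look like in Keras, here is a stacked LSTM with more units. The input shape and layer sizes below are illustrative assumptions, not values taken from the question:

```python
# Sketch of a more capable sequential model: two stacked LSTM layers.
# All shapes and sizes here are illustrative assumptions.
from tensorflow import keras
from tensorflow.keras import layers

timesteps, features = 50, 1  # assumed: univariate sequences of length 50

model = keras.Sequential([
    keras.Input(shape=(timesteps, features)),
    layers.LSTM(128, return_sequences=True),  # pass full sequence to next LSTM
    layers.LSTM(64),                          # returns only the last output
    layers.Dense(1),                          # single-value prediction
])
model.compile(optimizer="adam", loss="mse")
```

`return_sequences=True` is what allows LSTM layers to be stacked: the first layer must emit an output per timestep for the second layer to consume.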
In case you actually have a better training loss: OK, so your model is indeed getting better.
If your "validation" loss is getting worse, your model is overfitting: it's memorizing the training data instead of learning to generalize. You need a less capable model, or a lot of "dropout".
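A minimal sketch of adding dropout to an LSTM model (the 0.2 rates and layer sizes are common starting points, not tuned values):

```python
# Sketch: fighting overfitting with dropout. Rates are illustrative.
from tensorflow import keras
from tensorflow.keras import layers

timesteps, features = 50, 1  # assumed input shape

model = keras.Sequential([
    keras.Input(shape=(timesteps, features)),
    # LSTM supports dropout on inputs and on the recurrent state directly:
    layers.LSTM(64, dropout=0.2, recurrent_dropout=0.2),
    layers.Dropout(0.2),  # extra dropout before the output layer
    layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
```

Dropout is only active during training; at prediction time the full network is used, so it directly targets the memorization problem described above.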
Often there is an optimal point where the validation loss stops going down while the training loss keeps going down. This is the point to stop training if you're overfitting. Read about the `EarlyStopping` callback in the Keras documentation.
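A short sketch of that callback in use (the `patience` value and the commented `fit` arguments are illustrative assumptions):

```python
# Sketch: stop training automatically when validation loss stops improving.
from tensorflow.keras.callbacks import EarlyStopping

early_stop = EarlyStopping(
    monitor="val_loss",         # watch the validation loss
    patience=5,                 # tolerate 5 epochs without improvement
    restore_best_weights=True,  # roll back to the best epoch's weights
)

# model.fit(x_train, y_train,
#           validation_split=0.2,
#           epochs=3500,          # just an upper bound; training stops early
#           callbacks=[early_stop])
```

With this in place, a large epoch count becomes harmless: training halts near the optimal point instead of running into overfitting territory.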
If your training loss is going up, then you've got a real problem: either a bug, a badly prepared calculation somewhere (if you're using custom layers), or simply a learning rate that is too big. Reduce the learning rate (divide it by 10, or 100), create and compile a "new" model, and restart training.
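A sketch of that restart, assuming the Adam optimizer (whose Keras default learning rate is 1e-3); the architecture shown is just a placeholder for whatever the original model was:

```python
# Sketch: rebuild the model from scratch (discarding old optimizer state)
# and recompile with a 10x smaller learning rate.
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([       # placeholder architecture
    keras.Input(shape=(50, 1)),  # assumed input shape
    layers.LSTM(64),
    layers.Dense(1),
])
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-4),  # default is 1e-3
    loss="mse",
)
```

Recreating the model (rather than just recompiling the old one) matters because it also resets the weights and the optimizer's accumulated state, giving the lower learning rate a clean start.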
If none of the above applies, then you'll need to detail your question more thoroughly.