
Should I use the same epochs for each batch?

I need to understand how the epochs/iterations affect the training of a deep learning model.

I am training an NER model with Spacy 2.1.3. My documents are very long, so I cannot train on more than 200 documents per iteration. So basically I do:

from document 0 to document 200 -> 20 epochs

from document 201 to document 400 -> 20 epochs

and so on (roughly sketched in the snippet below).
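In rough pseudocode, my current scheme looks like this (`load_docs` and `train_on` are just placeholders for my own loading and update steps, not real spaCy functions):

```python
# Rough sketch of the scheme described above; `load_docs` and
# `train_on` are hypothetical placeholders, not spaCy API.
for start in range(0, total_docs, 200):
    chunk = load_docs(start, start + 200)   # next 200 documents
    for epoch in range(20):                 # 20 epochs on this chunk
        train_on(nlp, chunk)
```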

Maybe it is a stupid question, but should the number of epochs for the next batches be the same as for the first 0-200? So if I chose 20 epochs, must I train the next batches with 20 epochs too?

Thanks

"i need to understand how the epochs/iterations affect the training of a deep learning model" - nobody is sure about that one. You may overfit after a certain number of epochs, so you should check your accuracy (or other metrics) on a validation dataset. Techniques like Early Stopping are often employed in order to battle this.
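For illustration, here is a minimal, framework-agnostic early-stopping sketch (`train_one_epoch` and `evaluate` are hypothetical placeholders for your own routines):

```python
# Early-stopping sketch: stop when the validation metric has not
# improved for `patience` epochs in a row.
# `train_one_epoch` and `evaluate` are hypothetical placeholders.
patience, bad_epochs, best_score = 3, 0, float("-inf")
for epoch in range(100):
    train_one_epoch(model, train_data)
    score = evaluate(model, validation_data)   # e.g. NER F-score
    if score > best_score:
        best_score, bad_epochs = score, 0      # improvement: reset counter
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break                              # no improvement: stop early
```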

"so i cannot train more than 200 documents per iteration." - do you mean a batch of examples? If so, it should be smaller (too much information in a single iteration and too costly). 32 is usually used for textual data, up to 64. Batch sizes are often made smaller the more epochs you train, in order to settle into the minimum better (or to escape saddle points).
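For instance, spaCy 2.x ships `minibatch` and `compounding` utilities for exactly this kind of batching; a minimal NER training loop in that style could look like the sketch below (the tiny `TRAIN_DATA` is made up for illustration):

```python
import random
import spacy
from spacy.util import minibatch, compounding

# Toy data for the sketch; replace with your own (text, annotations) pairs.
TRAIN_DATA = [
    ("Apple is buying a U.K. startup", {"entities": [(0, 5, "ORG")]}),
    # ... more examples
]

nlp = spacy.blank("en")            # blank English pipeline for the sketch
ner = nlp.create_pipe("ner")
nlp.add_pipe(ner)
ner.add_label("ORG")

optimizer = nlp.begin_training()
for epoch in range(20):
    random.shuffle(TRAIN_DATA)
    losses = {}
    # batch size grows from 4 to 32, compounding by a factor of 1.001
    for batch in minibatch(TRAIN_DATA, size=compounding(4.0, 32.0, 1.001)):
        texts, annotations = zip(*batch)
        nlp.update(texts, annotations, sgd=optimizer, drop=0.2, losses=losses)
    print(epoch, losses)
```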

Furthermore, you should use Python's generators so you can iterate over data larger than your RAM capacity.
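A minimal sketch of such a generator (`parse_line` is a hypothetical parser for whatever on-disk format you use); `minibatch` accepts any iterable, so you can feed it this generator directly:

```python
def stream_examples(path):
    # Yield one (text, annotations) pair at a time so the whole corpus
    # never has to fit in RAM. `parse_line` is a hypothetical parser
    # for your own file format.
    with open(path, encoding="utf8") as f:
        for line in f:
            yield parse_line(line)
```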

Last but not least, each example is usually trained on once per epoch. Different approaches (say oversampling or undersampling) are sometimes used, but usually when your class distribution is imbalanced (say 10% of examples belong to class 0 and 90% to class 1) or when the neural network has problems with a specific class (though this one requires a more well-thought-out approach).
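If you do need it, a naive random-oversampling sketch might look like this (assuming a list of `(example, label)` pairs and that the minority class really is the smaller one):

```python
import random

def oversample(examples, minority_label):
    # Duplicate minority-class examples at random until both classes
    # are the same size. `examples` is a list of (x, label) pairs.
    minority = [e for e in examples if e[1] == minority_label]
    majority = [e for e in examples if e[1] != minority_label]
    extra = random.choices(minority, k=len(majority) - len(minority))
    balanced = majority + minority + extra
    random.shuffle(balanced)
    return balanced
```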

The common practice is to train each batch with only 1 epoch. Training on the same subset of data for 20 epochs can lead to overfitting, which harms your model's performance.
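Concretely, that means making each epoch a single pass over all chunks, instead of 20 epochs per chunk before moving on. A sketch (`iter_chunks` and `train_on` are hypothetical helpers):

```python
# One pass over every 200-document chunk per epoch, repeated 20 times,
# rather than 20 epochs on one chunk at a time.
# `iter_chunks` and `train_on` are hypothetical helpers.
for epoch in range(20):
    for chunk in iter_chunks(all_docs, chunk_size=200):
        train_on(nlp, chunk)   # single pass over this chunk
```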

To understand better how the number of epochs trained on each batch affects your performance, you can do a grid search and compare the results.
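A hypothetical sketch of such a grid search (`train_model` and `evaluate` are placeholders for your own training and scoring routines):

```python
# Grid search over epochs-per-chunk: retrain from scratch for each
# setting and compare scores on the same held-out validation set.
# `train_model` and `evaluate` are hypothetical placeholders.
results = {}
for n_epochs in (1, 5, 10, 20):
    model = train_model(train_docs, epochs_per_chunk=n_epochs)
    results[n_epochs] = evaluate(model, validation_docs)   # e.g. F-score
best = max(results, key=results.get)
print(results, "-> best epoch count:", best)
```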

