Importance of the number of steps per epoch for LSTM model training in Keras
What is the difference between two LSTM models, A and B, trained on the same data with the batches shuffled randomly each epoch, where A runs 14 steps per epoch and B runs 132 steps per epoch? Which one will perform better on validation?
An epoch consists of going through all of your training samples once, and one step (or iteration) refers to training on a single minibatch. So if you have 1,000,000 training samples and use a batch size of 100, one epoch is equivalent to 10,000 steps, with 100 samples per step.
A high-level neural network framework may let you set either the number of epochs or the total number of training steps, but you can't set both independently, since one directly determines the other.
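The arithmetic linking samples, batch size, steps, and epochs can be sketched as follows (the sample and batch counts are taken from the example above):

```python
import math

num_samples = 1_000_000
batch_size = 100

# One epoch = one pass over all samples, so the number of steps per epoch
# is the number of minibatches needed to cover the dataset.
steps_per_epoch = math.ceil(num_samples / batch_size)

# Fixing the number of epochs determines the total number of steps,
# and vice versa -- you cannot choose both independently.
epochs = 5
total_steps = epochs * steps_per_epoch

print(steps_per_epoch)  # 10000
print(total_steps)      # 50000
```

This is why Keras's `fit` only asks for `epochs` (plus an optional `steps_per_epoch` override); the total step count follows from those values and the batch size.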
Effect of batch size on model behavior: a small batch size generally yields rapid learning, but a volatile training process with higher variance in the gradient estimates. A larger batch size slows learning down, but the final stages converge to a more stable model, reflected in lower variance.
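The variance claim can be checked directly: the minibatch gradient is a noisy estimate of the full-batch gradient, and its variance shrinks as the batch grows. A minimal sketch on synthetic linear-regression data (the data, weights, and batch sizes here are illustrative assumptions, not from the original post):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression problem: 1000 samples, 5 features.
X = rng.normal(size=(1000, 5))
true_w = rng.normal(size=5)
y = X @ true_w + 0.1 * rng.normal(size=1000)

w = np.zeros(5)  # evaluate gradient noise at a fixed parameter vector

def batch_gradient(idx):
    """Mean-squared-error gradient on the rows selected by idx."""
    err = X[idx] @ w - y[idx]
    return X[idx].T @ err / len(idx)

def gradient_variance(batch_size, n_draws=200):
    """Average per-coordinate variance of the minibatch gradient estimate."""
    grads = [
        batch_gradient(rng.choice(len(X), size=batch_size, replace=False))
        for _ in range(n_draws)
    ]
    return np.var(np.stack(grads), axis=0).mean()

print(gradient_variance(8))    # small batch: noisy gradient estimates
print(gradient_variance(256))  # large batch: much lower variance
```

Since each SGD update follows one of these noisy estimates, a small batch size makes the loss trajectory jumpier (which can help escape poor regions early on), while a large batch size gives smoother, more stable convergence at the end of training.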