What is the difference between step size and learning rate in machine learning?
I am using TensorFlow to implement some basic ML code. I was wondering if anyone could give me a short explanation of the meaning of, and the difference between, *step size* and *learning rate* in the following functions.

I used `tf.train.GradientDescentOptimizer()` to set the learning rate parameter and `linear_regressor.train()` to set the number of steps. I've been looking through the documentation on tensorflow.org for these functions, but I still don't have a complete grasp of what these parameters mean.

Thank you, and let me know if there is any more info I can provide.
In SGD, you compute the gradient for a batch and move the parameters in the direction of that gradient by an amount determined by the learning rate `lr`:

    params = old_params - lr * grad

where `grad` is the gradient of the loss with respect to the params.
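To make that concrete, here is a minimal NumPy sketch of that update rule (the quadratic loss and the variable names are made up purely for illustration):

```python
import numpy as np

def sgd_step(params, grad, lr):
    # One SGD update: move against the gradient, scaled by the learning rate.
    return params - lr * grad

# Made-up example: minimize loss(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w = np.array([0.0])
lr = 0.1
for _ in range(100):
    grad = 2.0 * (w - 3.0)
    w = sgd_step(w, grad, lr)

print(w)  # close to 3.0, the minimizer of the loss
```

A larger `lr` takes bigger jumps per update (and can overshoot or diverge); a smaller one converges more slowly but more stably.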
The `steps` argument in TensorFlow and similar libraries usually just denotes the number of such updates to run. So if you have `steps=1000` and `lr=0.5`, you will be executing the pseudocode above 1000 times, each time with `lr=0.5`.
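Here is a minimal sketch of how the two parameters from the question fit together, assuming the TF 1.x estimator API that `tf.train.GradientDescentOptimizer()` comes from (the toy data, feature name, batch size, and learning rate are made-up values):

```python
import numpy as np
import tensorflow as tf  # TF 1.x API

# Made-up toy data: y = 2x + 1 plus a little noise.
x_data = np.random.rand(100).astype(np.float32)
y_data = (2.0 * x_data + 1.0 +
          np.random.normal(scale=0.01, size=100).astype(np.float32))

feature_columns = [tf.feature_column.numeric_column('x')]

# learning_rate: how far each gradient update moves the parameters.
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)
linear_regressor = tf.estimator.LinearRegressor(
    feature_columns=feature_columns,
    optimizer=optimizer)

input_fn = tf.estimator.inputs.numpy_input_fn(
    x={'x': x_data}, y=y_data,
    batch_size=10, num_epochs=None, shuffle=True)

# steps: how many gradient updates (one per batch) to run in total.
linear_regressor.train(input_fn=input_fn, steps=1000)
```

With these settings, 1000 batches of 10 examples are drawn and the SGD update above is applied once per batch, each time scaled by the learning rate.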