

Effect of learning rate on the neural network

I have a dataset of about 100 numeric values. I have set the learning rate of my neural network to 0.0001, and I have successfully trained it on the dataset for over 1 million iterations. But my question is: what is the effect of a very low learning rate on a neural network?

A low learning rate mainly implies slow convergence: you are moving down the loss surface in smaller steps (the step size is the learning rate). If your loss function is convex this is not a problem; you will wait longer, but you will still reach a good solution.
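To make this concrete, here is a minimal, made-up 1-D example of plain gradient descent on the convex loss f(w) = (w - 3)^2 (the function and learning rates are chosen purely for illustration): a tiny learning rate still heads toward the optimum, just much more slowly.

```python
# Gradient descent on the convex loss f(w) = (w - 3)^2,
# whose gradient is 2 * (w - 3). The update rule is
#   w <- w - lr * grad
# so the learning rate is literally the step size along the negative gradient.

def gradient_descent(lr, steps, w=0.0):
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)   # derivative of (w - 3)^2
        w -= lr * grad           # step size = learning rate
    return w

slow = gradient_descent(lr=0.0001, steps=1000)   # tiny steps: still far from 3
fast = gradient_descent(lr=0.1,    steps=1000)   # larger steps: essentially at 3
print(slow, fast)
```

Both runs take the same number of steps, but the low-learning-rate run covers only a fraction of the distance to the minimum; it would eventually get there, just after many more iterations.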

If, as in the case of deep neural networks, your loss function is not convex, then a low learning rate can leave you at a "good" optimum that is not the best one: you get stuck in a local minimum because the steps are not big enough to jump out of it.
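A hypothetical 1-D non-convex loss can illustrate this trap. The function below, f(x) = (x^2 - 1)^2 + 0.3x, is an arbitrary example chosen because it has a shallow local minimum near x ≈ 0.96 and a deeper global minimum near x ≈ -1.03; the starting point and learning rates are likewise made up for illustration.

```python
# f(x) = (x^2 - 1)^2 + 0.3*x has two basins:
#   a shallow local minimum near x ≈ 0.96 (higher loss)
#   a deeper global minimum near x ≈ -1.03 (lower loss)

def descend(lr, x=1.5, steps=2000):
    for _ in range(steps):
        grad = 4.0 * x**3 - 4.0 * x + 0.3   # f'(x)
        x -= lr * grad
    return x

x_small = descend(lr=0.01)  # tiny steps: trapped in the shallow basin near 0.96
x_large = descend(lr=0.2)   # the first large step overshoots into the deeper basin
print(x_small, x_large)
```

With the small learning rate the iterate monotonically slides into the nearby shallow minimum and stays there; the larger step size happens to carry it over the barrier into the better basin. (Of course, a large learning rate is no cure-all: too large and the iterates oscillate or diverge instead of converging.)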

That's why there are adaptive optimization algorithms: algorithms such as Adam and RMSProp maintain a different learning rate for each weight in the network (every per-weight learning rate starts from the same initial value). This way, the optimizer can work on every single parameter independently, with the aim of finding a better solution and making the choice of the initial learning rate less critical.
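As a sketch of what "a different learning rate per weight" means, here is a minimal from-scratch Adam update (the hyperparameter names and defaults follow common usage; the 2-parameter quadratic loss is a made-up example whose two coordinates have very different curvature, so a single global step size fits one of them badly):

```python
import math

# Minimal Adam sketch on the toy loss f(w) = 100*w[0]^2 + w[1]^2.
# Each coordinate's step is rescaled by its own running gradient statistics,
# so the steep coordinate (w[0]) and the flat one (w[1]) get different
# effective step sizes even though lr is shared.

def adam(steps=2000, lr=0.05, b1=0.9, b2=0.999, eps=1e-8):
    w = [1.0, 1.0]
    m = [0.0, 0.0]   # first-moment (mean) estimate per parameter
    v = [0.0, 0.0]   # second-moment (uncentered variance) estimate per parameter
    for t in range(1, steps + 1):
        g = [200.0 * w[0], 2.0 * w[1]]              # gradient of the toy loss
        for i in range(len(w)):
            m[i] = b1 * m[i] + (1 - b1) * g[i]
            v[i] = b2 * v[i] + (1 - b2) * g[i] ** 2
            m_hat = m[i] / (1 - b1 ** t)            # bias correction
            v_hat = v[i] / (1 - b2 ** t)
            w[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)  # per-weight step
    return w

print(adam())   # both parameters driven close to 0 despite curvatures 100x apart
```

The division by the per-parameter root-mean-square gradient is what makes the effective step size adaptive: a single shared `lr` would either crawl along the flat direction or overshoot along the steep one.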

