
Keras: how does the learning rate change when the Adadelta optimizer is used?

For example, I use Adadelta as the optimizer when compiling the network model, so the learning rate will change over time according to this rule (but what is iterations?), and how can I log the learning rate value to the console?

model.compile(loss=keras.losses.mean_squared_error,
              optimizer=keras.optimizers.Adadelta())

In the documentation, is lr just the starting learning rate?

The rule is related to updates with decay. Adadelta is an adaptive learning rate method which uses an exponentially decaying average of gradients.

Looking at the Keras source code, the learning rate is recalculated based on decay like this:

lr = self.lr
if self.initial_decay > 0:
    # `iterations` is the optimizer's internal counter of batch updates performed so far
    lr *= (1. / (1. + self.decay * K.cast(self.iterations, K.dtype(self.decay))))

So yes, lr is just the starting learning rate.
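To make the rule concrete, here is a minimal sketch (plain Python, with illustrative values that are not Keras defaults) of how the decayed learning rate would evolve if a nonzero decay were set; iterations is simply the count of batch updates:

# Hypothetical illustration: compute the decayed learning rate by hand.
# `initial_lr` and `decay` are example values chosen purely for demonstration.
initial_lr = 1.0
decay = 1e-4

for iterations in (0, 100, 1000, 10000):
    lr = initial_lr * (1. / (1. + decay * iterations))
    print(iterations, lr)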

To print it after every epoch, as @orabis mentioned, you can make a callback class:

from keras import backend as K
from keras.callbacks import Callback

class YourLearningRateTracker(Callback):
    def on_epoch_end(self, epoch, logs=None):
        lr = self.model.optimizer.lr
        decay = self.model.optimizer.decay
        iterations = self.model.optimizer.iterations
        # Recompute the decayed learning rate the same way the optimizer does
        lr_with_decay = lr / (1. + decay * K.cast(iterations, K.dtype(decay)))
        print(K.eval(lr_with_decay))

and then add an instance of it to the callbacks when calling model.fit(), like:

model.fit(..., callbacks=[YourLearningRateTracker()])

However, note that, by default, the decay parameter for Adadelta is zero and is not part of the "standard" arguments, so your learning rate would not change its value when using the default arguments. I suspect that decay is not intended to be used with Adadelta.
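If you still want the base learning rate to decay, here is a minimal sketch, assuming the legacy Keras 2.x optimizer API where decay is accepted as a keyword argument; the decay value itself is an arbitrary example:

import keras

# decay=1e-6 is an illustrative value, not a recommendation
model.compile(loss=keras.losses.mean_squared_error,
              optimizer=keras.optimizers.Adadelta(lr=1.0, rho=0.95, decay=1e-6))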

On the other hand, the rho parameter, which is nonzero by default, does not describe the decay of the learning rate; it corresponds to the fraction of the gradient to keep at each time step (according to the Keras documentation).
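As a rough illustration of what rho does (a conceptual sketch of Adadelta's running average of squared gradients, not the actual Keras implementation), each step keeps a fraction rho of the accumulator and adds a fraction (1 - rho) of the new squared gradient:

import numpy as np

rho = 0.95  # Keras default for Adadelta

# Exponentially decaying average of squared gradients
accumulated_sq_grad = 0.0

for grad in np.random.randn(5):  # pretend these are per-step gradients
    accumulated_sq_grad = rho * accumulated_sq_grad + (1. - rho) * grad ** 2
    print(accumulated_sq_grad)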

I found some relevant information on this GitHub issue, and by asking a similar question.

