Custom learning rate scheduler TF2 and Keras
I am trying to write a custom learning rate scheduler: cosine annealing with warm-up. But I can't use it in either Keras or TensorFlow. Below is the code:
import tensorflow as tf
import numpy as np

def make_linear_lr(min_lr, max_lr, number_of_steps):
    def gen_lr(step):
        return (max_lr - min_lr) / number_of_steps * step + min_lr
    return gen_lr

def make_cosine_anneal_lr(learning_rate, alpha, decay_steps):
    def gen_lr(global_step):
        global_step = min(global_step, decay_steps)
        cosine_decay = 0.5 * (1 + np.cos(np.pi * global_step / decay_steps))
        decayed = (1 - alpha) * cosine_decay + alpha
        decayed_learning_rate = learning_rate * decayed
        return decayed_learning_rate
    return gen_lr

def make_cosine_annealing_with_warmup(min_lr, max_lr, number_of_steps, alpha, decay_steps):
    gen_lr_1 = make_linear_lr(min_lr, max_lr, number_of_steps)
    gen_lr_2 = make_cosine_anneal_lr(max_lr, alpha, decay_steps)
    def gen_lr(global_step):
        if global_step < number_of_steps:
            return gen_lr_1(global_step)
        else:
            return gen_lr_2(global_step - number_of_steps)
    return gen_lr

class CosineAnnealingWithWarmUP(tf.keras.optimizers.schedules.LearningRateSchedule):
    def __init__(self, min_lr, max_lr, number_of_steps, alpha, decay_steps):
        super(CosineAnnealingWithWarmUP, self).__init__()
        self.gen_lr_ca = make_cosine_annealing_with_warmup(min_lr, max_lr, number_of_steps, alpha, decay_steps)

    def __call__(self, step):
        return tf.cast(self.gen_lr_ca(step), tf.float32)

learning_rate_fn = CosineAnnealingWithWarmUP(.0000001, 0.01, 10_000, 0, 150_000)
optimizer = tf.keras.optimizers.SGD(
    learning_rate=learning_rate_fn,
    momentum=0.95)
I use this function in TensorFlow to train my model:
def get_model_train_step_function(model, optimizer, vars_to_fine_tune, batch_size):
    @tf.function
    def train_step_fn(image_tensors,
                      groundtruth_boxes_list,
                      groundtruth_classes_list):
        shapes = tf.constant(batch_size * [[640, 640, 3]], dtype=tf.int32)
        model.provide_groundtruth(
            groundtruth_boxes_list=groundtruth_boxes_list,
            groundtruth_classes_list=groundtruth_classes_list)
        with tf.GradientTape() as tape:
            preprocessed_images = tf.concat(
                [model.preprocess(image_tensor)[0]
                 for image_tensor in image_tensors], axis=0)
            prediction_dict = model.predict(preprocessed_images, shapes)
            losses_dict = model.loss(prediction_dict, shapes)
            total_loss = losses_dict['Loss/localization_loss'] + losses_dict['Loss/classification_loss']
        gradients = tape.gradient(total_loss, vars_to_fine_tune)
        optimizer.apply_gradients(zip(gradients, vars_to_fine_tune))
        return total_loss
    return train_step_fn
When I use it with TensorFlow, passing the optimizer to get_model_train_step_function, it works if I remove the @tf.function decorator. But it doesn't work with @tf.function; the error says:
OperatorNotAllowedInGraphError: using a tf.Tensor as a Python bool is not allowed: AutoGraph did convert this function. This might indicate you are trying to use an unsupported feature.
How should I write my custom learning rate scheduler? Also, I would like to use this schedule with Keras, but it doesn't work there at all.
You need to remove the numpy calls and replace the Python conditionals ("if", "min") with TensorFlow operators:
def make_cosine_anneal_lr(learning_rate, alpha, decay_steps):
    def gen_lr(global_step):
        #global_step = min(global_step, decay_steps)
        global_step = tf.minimum(global_step, decay_steps)
        cosine_decay = 0.5 * (1 + tf.cos(3.1415926 * global_step / decay_steps))  # np.pi replaced with a literal
        decayed = (1 - alpha) * cosine_decay + alpha
        decayed_learning_rate = learning_rate * decayed
        return decayed_learning_rate
    return gen_lr

def make_cosine_annealing_with_warmup(min_lr, max_lr, number_of_steps, alpha, decay_steps):
    gen_lr_1 = make_linear_lr(min_lr, max_lr, number_of_steps)
    gen_lr_2 = make_cosine_anneal_lr(max_lr, alpha, decay_steps)
    def gen_lr(global_step):
        # Cast so the arithmetic below is float math even if an integer step is passed in
        global_step = tf.cast(global_step, tf.float32)
        #if global_step < number_of_steps:
        #    return gen_lr_1(global_step)
        #else:
        #    return gen_lr_2(global_step - number_of_steps)
        a = global_step < number_of_steps  # True during warm-up
        a = tf.cast(a, tf.float32)         # 1.0 during warm-up, 0.0 afterwards
        b = 1. - a
        return a * gen_lr_1(global_step) + b * gen_lr_2(global_step - number_of_steps)
    return gen_lr
This schedule also works with Keras.
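For reference, the same idea can be folded into a single schedule class. The following is only a minimal sketch, not the code from the question: the class name WarmupCosineSchedule, the parameter name warmup_steps, and the helper lr_at are hypothetical names chosen for illustration, and it selects the branch with tf.where instead of the multiply-by-mask trick above. It assumes TensorFlow 2.x.

```python
import math
import tensorflow as tf


class WarmupCosineSchedule(tf.keras.optimizers.schedules.LearningRateSchedule):
    """Linear warm-up from min_lr to max_lr, then cosine decay towards alpha * max_lr.

    Hypothetical illustration class; only tensor ops are used in __call__,
    so it can be traced inside tf.function.
    """

    def __init__(self, min_lr, max_lr, warmup_steps, alpha, decay_steps):
        super().__init__()
        self.min_lr = min_lr
        self.max_lr = max_lr
        self.warmup_steps = warmup_steps
        self.alpha = alpha
        self.decay_steps = decay_steps

    def __call__(self, step):
        # The step may arrive as an integer tensor, so cast before doing float math.
        step = tf.cast(step, tf.float32)
        warmup_lr = self.min_lr + (self.max_lr - self.min_lr) * step / self.warmup_steps
        # Steps elapsed since warm-up ended, clamped to [0, decay_steps].
        decay_step = tf.clip_by_value(step - self.warmup_steps, 0.0, float(self.decay_steps))
        cosine = 0.5 * (1.0 + tf.cos(math.pi * decay_step / self.decay_steps))
        decay_lr = self.max_lr * ((1.0 - self.alpha) * cosine + self.alpha)
        # tf.where replaces the Python "if", which cannot branch on a tensor in graph mode.
        return tf.where(step < self.warmup_steps, warmup_lr, decay_lr)

    def get_config(self):
        # Lets Keras serialize the schedule together with the optimizer.
        return {
            "min_lr": self.min_lr,
            "max_lr": self.max_lr,
            "warmup_steps": self.warmup_steps,
            "alpha": self.alpha,
            "decay_steps": self.decay_steps,
        }


schedule = WarmupCosineSchedule(1e-7, 0.01, 10_000, 0.0, 150_000)
optimizer = tf.keras.optimizers.SGD(learning_rate=schedule, momentum=0.95)


@tf.function
def lr_at(step):
    # Evaluating the schedule in graph mode, as a @tf.function train step would.
    return schedule(step)
```

Because both branches are computed and then selected element-wise, there is no Python-level branching on a tensor. The optimizer built this way should be usable both in a custom @tf.function training step and in model.compile(optimizer=optimizer, ...).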