I am writing a custom optimizer with Eager Execution in TensorFlow 1.15 but can't figure out how to update the weights. Taking gradient descent as an example, I have the weights, the gradients, and a scalar learning rate, but I can't figure out how to combine them.
This is my implementation of gradient descent, where model is a keras.Model (e.g. a multilayer CNN):
lr = tf.constant(0.01)

def minimize(model, inputs, targets):
    with tf.GradientTape() as tape:
        logits = model(inputs)
        loss_value = tf.nn.softmax_cross_entropy_with_logits_v2(logits=logits, labels=targets)
    grad = tape.gradient(loss_value, model.trainable_variables)
    step = tf.multiply(lr, grad)
    model.trainable_variables.assign_sub(step)
but it fails on the tf.multiply call with:

tensorflow.python.framework.errors_impl.InvalidArgumentError: Shapes of all inputs must match: values[0].shape = [5,5,1,6] != values[1].shape = [6] [Op:Pack] name: packed

I also know the last line will fail, since trainable_variables is a list and has no assign_sub method.
How can I rewrite the last two lines of my code to do the equivalent of:

model.trainable_variables -= lr * grad
Figured it out. Since both are lists, we need to iterate over the paired gradients and variables for each layer together and update each variable separately.
lr = tf.constant(0.01)

def minimize(model, inputs, targets):
    with tf.GradientTape() as tape:
        logits = model(inputs)
        loss_value = tf.nn.softmax_cross_entropy_with_logits_v2(logits=logits, labels=targets)
    grad = tape.gradient(loss_value, model.trainable_variables)
    for v, g in zip(model.trainable_variables, grad):
        v.assign_sub(lr * g)
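To see why the pairwise loop is needed, here is a plain NumPy sketch of the same update (the two toy shapes are assumed from the error message above; they stand in for a conv kernel and its bias, which cannot be stacked into one tensor):

```python
import numpy as np

# Toy stand-ins for model.trainable_variables and the gradients:
# two "layers" with different shapes, mirroring why a single
# tf.multiply over the whole list fails with an Op:Pack shape error.
lr = 0.01
variables = [np.ones((5, 5, 1, 6)), np.ones(6)]
grads = [np.full((5, 5, 1, 6), 2.0), np.full(6, 2.0)]

# Pairwise update, one (variable, gradient) pair per layer --
# the NumPy equivalent of v.assign_sub(lr * g) in the answer above.
for v, g in zip(variables, grads):
    v -= lr * g

print(variables[0][0, 0, 0, 0])  # 1 - 0.01 * 2 = 0.98
print(variables[1][0])           # 0.98
```

In practice the same pairwise update is what tf.keras.optimizers.SGD does internally via optimizer.apply_gradients(zip(grad, model.trainable_variables)), so the explicit loop is only needed when writing a custom update rule.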