
Eagerly update a Keras model's weights directly using the gradient

I am writing a custom optimizer with eager execution in TensorFlow 1.15 but can't figure out how to update the model's weights. Taking gradient descent as an example, I have the weights, the gradients, and a scalar learning rate, but don't see how to combine them.

This is an implementation of gradient descent, where model is a keras.Model, e.g. a multilayer CNN:

lr = tf.constant(0.01)

def minimize(model, inputs, targets):
    with tf.GradientTape() as tape:
        logits = model(inputs)
        loss_value = tf.nn.softmax_cross_entropy_with_logits_v2(logits=logits, labels=targets)

    grad = tape.gradient(loss_value, model.trainable_variables)
    step = tf.multiply(lr, grad)  # fails here (see error below)
    model.trainable_variables.assign_sub(step)

but it fails on the tf.multiply, saying:

tensorflow.python.framework.errors_impl.InvalidArgumentError: Shapes of all inputs must match: values[0].shape = [5,5,1,6] != values[1].shape = [6] [Op:Pack] name: packed

The multiplication fails because tf.multiply tries to pack the list of per-layer gradients, which have different shapes, into a single tensor. I also know the last line will fail, as trainable_variables is a list and doesn't have an assign_sub method.
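
For illustration, a minimal sketch of the shape clash, using the two shapes from the error message as stand-ins for a conv kernel's gradient and its bias's gradient:

import tensorflow as tf
tf.enable_eager_execution()  # TF 1.x

grads = [tf.ones([5, 5, 1, 6]), tf.ones([6])]
# tf.multiply first converts the Python list to a single tensor via
# Pack (tf.stack), which requires every element to have the same
# shape -> InvalidArgumentError [Op:Pack]
step = tf.multiply(0.01, grads)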


How can I rewrite the last two lines of my code to do the equivalent of:

model.trainable_variables -= lr * grad

Figured it out: since both are lists, we need to iterate over the paired gradients and variables for each layer and update each variable separately.

import tensorflow as tf

lr = tf.constant(0.01)

def minimize(model, inputs, targets):
    with tf.GradientTape() as tape:
        logits = model(inputs)
        loss_value = tf.nn.softmax_cross_entropy_with_logits_v2(logits=logits, labels=targets)

    # tape.gradient returns one gradient tensor per variable, in matching order
    grad = tape.gradient(loss_value, model.trainable_variables)
    for v, g in zip(model.trainable_variables, grad):
        v.assign_sub(lr * g)  # in-place update: v -= lr * g
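
For comparison, a built-in optimizer performs the same per-variable update through apply_gradients. A minimal sketch assuming TensorFlow 1.15 with eager execution enabled (the function name minimize_with_optimizer is just illustrative):

import tensorflow as tf

tf.enable_eager_execution()  # required in TF 1.15 before building the model

optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)

def minimize_with_optimizer(model, inputs, targets):
    with tf.GradientTape() as tape:
        logits = model(inputs)
        loss_value = tf.nn.softmax_cross_entropy_with_logits_v2(logits=logits, labels=targets)
    grad = tape.gradient(loss_value, model.trainable_variables)
    # apply_gradients consumes (gradient, variable) pairs and, for plain
    # SGD, performs the same v.assign_sub(lr * g) update internally
    optimizer.apply_gradients(zip(grad, model.trainable_variables))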
