I would like to use TensorFlow's eager execution functionality to optimize the components of a vector. In all the documented examples, each trainable variable is just a scalar, with collections represented by lists of these. However, the loss function I have in mind involves performing vector manipulations on those components, and so this is inconvenient.
For example, let us use the Adam optimizer to normalize a 3-component vector:
import tensorflow as tf
import tensorflow.contrib.eager as tfe
import numpy as np

tf.enable_eager_execution()

def normalize(din=[2.0, 1.0, 0.0], lr=0.001, nsteps=100):
    d = tfe.Variable(din)

    def loss(dvec):
        return tf.sqrt((1.0 - tf.tensordot(dvec, dvec, 1))**2)

    def grad(dvec):
        with tf.GradientTape() as tape:
            loss_val = loss(dvec)
        return tape.gradient(loss_val, dvec)

    optimizer = tf.train.AdamOptimizer(learning_rate=lr)
    for i in range(nsteps):
        grads = grad(d)
        optimizer.apply_gradients(zip(grads, d))  # Throws an error
    return d
This code correctly computes the required gradients. However, the optimizer.apply_gradients line throws an error no matter what I try, essentially because a tfe.Variable is not iterable.
In this specific example the error is "AttributeError: Tensor.name is meaningless when eager execution is enabled". We could also try, for example,
zip(grads, [d[i] for i in range(3)])
instead of zip(grads, d), but then the interpreter complains that d is not iterable.
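To see what is going on, here is a minimal check (same TF 1.x eager setup as above, with the loss inlined): the tape returns a single gradient tensor with the same shape as d, not a list of per-component gradients.

import tensorflow as tf
import tensorflow.contrib.eager as tfe
tf.enable_eager_execution()

d = tfe.Variable([2.0, 1.0, 0.0])
with tf.GradientTape() as tape:
    # Same loss as in normalize(): zero when d has unit norm.
    loss_val = tf.sqrt((1.0 - tf.tensordot(d, d, 1))**2)
g = tape.gradient(loss_val, d)
print(type(g))   # a single EagerTensor, not a list
print(g.shape)   # (3,) -- one gradient for the whole vector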
What is the correct way to pair grads with d?
Optimizer.apply_gradients requires its first argument to be a list of (gradient, variable) pairs. In the code above, neither grads nor d is a list (try print(type(grads)), for example), so the error comes from the call to zip. I think what you want instead is:
optimizer.apply_gradients(zip([grads], [d]))
Or, more simply:
optimizer.apply_gradients([(grads, d)])
Also, FYI, as eager execution stabilizes, more things are moving out of the experimental "contrib" namespace, so you don't need the tfe module for your example (tf.Variable will work just fine) if you're using a recent version of TensorFlow (1.11, 1.12, etc.). That makes your whole program look like this (with a TensorFlow 2.x sketch after the listing):
import tensorflow as tf
import numpy as np

tf.enable_eager_execution()

def normalize(din=[2.0, 1.0, 0.0], lr=0.001, nsteps=100):
    d = tf.Variable(din)

    def loss(dvec):
        # Zero exactly when dvec has unit norm.
        return tf.sqrt((1.0 - tf.tensordot(dvec, dvec, 1))**2)

    def grad(dvec):
        with tf.GradientTape() as tape:
            loss_val = loss(dvec)
        return tape.gradient(loss_val, dvec)

    optimizer = tf.train.AdamOptimizer(learning_rate=lr)
    for i in range(nsteps):
        dd = grad(d)
        # One (gradient, variable) pair, wrapped in a list.
        optimizer.apply_gradients([(dd, d)])
    return d
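As a quick sanity check, you can call it and inspect the norm (the exact values will depend on lr and nsteps; with the defaults the norm only creeps toward 1.0):

d = normalize()
print(d.numpy())                  # the optimized components
print(np.linalg.norm(d.numpy()))  # moves toward 1.0 as nsteps grows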
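And in TensorFlow 2.x, where eager execution is the default, the same idea would look roughly like this (a sketch assuming tf.keras.optimizers.Adam, not part of the original 1.x answer):

import tensorflow as tf

def normalize(din=[2.0, 1.0, 0.0], lr=0.001, nsteps=100):
    d = tf.Variable(din)

    def loss(dvec):
        return tf.sqrt((1.0 - tf.tensordot(dvec, dvec, 1))**2)

    optimizer = tf.keras.optimizers.Adam(learning_rate=lr)
    for _ in range(nsteps):
        with tf.GradientTape() as tape:
            loss_val = loss(d)
        dd = tape.gradient(loss_val, d)
        optimizer.apply_gradients([(dd, d)])  # same (gradient, variable) pairing
    return d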
Hope that helps!