
Optimizing a vector using eager execution

I would like to use TensorFlow's eager execution functionality to optimize the components of a vector. In all the documented examples I have found, each trainable variable is just a scalar, and collections of variables are represented as lists of scalars. However, the loss function I have in mind involves vector manipulations of those components, so this representation is inconvenient.

For example, let us use the Adam optimizer to normalize a 3-component vector:

import tensorflow as tf
import tensorflow.contrib.eager as tfe
import numpy as np

tf.enable_eager_execution()

def normalize(din=[2.0, 1.0, 0.0], lr=0.001, nsteps=100):

    d = tfe.Variable(din)

    def loss(dvec):
        return tf.sqrt((1.0 - tf.tensordot(dvec, dvec, 1))**2)

    def grad(dvec):
        with tf.GradientTape() as tape:
            loss_val = loss(dvec)
        return tape.gradient(loss_val, dvec)

    optimizer = tf.train.AdamOptimizer(learning_rate=lr)
    for i in range(nsteps):
        grads = grad(d)
        optimizer.apply_gradients(zip(grads, d))  # Throws an error
    return d

This code correctly computes the required gradients. However, the "optimizer.apply_gradients" line throws an error no matter what I do, essentially because a tfe.Variable is not iterable.

In this specific example the error is "AttributeError: Tensor.name is meaningless when eager execution is enabled". We could also try, for example,

  zip(grads, [d[i] for i in range(3)])

instead of d, but then the interpreter complains that d is not iterable.

What is the correct way to pair grads with d?

Optimizer.apply_gradients requires its first argument to be a list of (gradient, variable) pairs.

In the code above, neither grads nor d is a list (try print(type(grads)), for example), so the error is from the call to zip. I think what you want instead is:

optimizer.apply_gradients(zip([grads], [d]))

Or, more simply:

optimizer.apply_gradients([(grads, d)])
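
As a quick sanity check of the types involved, you can pull the gradient computation out of normalize (a minimal sketch that assumes the same imports and the tf.enable_eager_execution() call from the question):

d = tfe.Variable([2.0, 1.0, 0.0])
with tf.GradientTape() as tape:
    loss_val = tf.sqrt((1.0 - tf.tensordot(d, d, 1))**2)
grads = tape.gradient(loss_val, d)

print(type(grads))  # a single tensor, not a list of gradients
print(type(d))      # a single variable, not a list of variables

Because both objects are single tensors rather than lists, wrapping them in a one-pair list (as above) is all that is needed.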

Also, FYI, as eager execution stabilizes, more things are moving out of the experimental "contrib" namespace, so you don't need the tfe module for your example (tf.Variable will work just fine) if you're using a recent version of TensorFlow (1.11, 1.12, etc.). That makes your whole program look like:

import tensorflow as tf
import numpy as np

tf.enable_eager_execution()

def normalize(din=[2.0, 1.0, 0.0], lr=0.001, nsteps=100):

    d = tf.Variable(din)

    def loss(dvec):
        return tf.sqrt((1.0 - tf.tensordot(dvec, dvec, 1))**2)

    def grad(dvec):
        with tf.GradientTape() as tape:
            loss_val = loss(dvec)
        return tape.gradient(loss_val, dvec)

    optimizer = tf.train.AdamOptimizer(learning_rate=lr)
    for i in range(nsteps):
        dd = grad(d)
        optimizer.apply_gradients([(dd, d)])
    return d
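
You could then check the result like this (just a sketch; the learning rate and step count below are illustrative, not tuned values):

d = normalize(din=[2.0, 1.0, 0.0], lr=0.01, nsteps=2000)
print(d.numpy())                     # the optimized components
print(np.dot(d.numpy(), d.numpy()))  # should approach 1.0 as the loss goes to zero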

Hope that helps!
