
Using a custom step activation function in Keras results in an "An operation has `None` for gradient." error. How can this be resolved?

I am building an auto-encoder and I want to encode my values into a logical matrix. However, when I use my custom step activation function in one of the intermediate layers (all other layers use 'relu'), Keras raises this error:

An operation has `None` for gradient.

I've tried the hard-sigmoid function, but it doesn't fit my problem because it still produces intermediate values, whereas I need strictly binary outputs. I am aware that my function has no gradient at most points, but is it possible to use some other function for the gradient calculation while still using the step function for the accuracy and loss calculations?

My activation function:

import tensorflow as tf
from tensorflow import keras

def binary_activation(x):
    # Hard step: 1 where x > 0.5, 0 otherwise
    ones = tf.ones(tf.shape(x), dtype=x.dtype.base_dtype)
    zeros = tf.zeros(tf.shape(x), dtype=x.dtype.base_dtype)
    return keras.backend.switch(x > 0.5, ones, zeros)
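
For context, the activation is attached to an intermediate layer in the usual way; the layer sizes below are only illustrative, not my actual architecture:

from tensorflow import keras

inputs = keras.layers.Input(shape=(64,))
encoded = keras.layers.Dense(32, activation='relu')(inputs)
code = keras.layers.Dense(16, activation=binary_activation)(encoded)  # custom step activation
decoded = keras.layers.Dense(64, activation='relu')(code)
autoencoder = keras.models.Model(inputs, decoded)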

I expect to be able to use the binary step activation function to train the network and then use it as a typical auto-encoder, something similar to the binary feature map used in this paper.

As mentioned here, you could use tf.custom_gradient to define a "back-propagatable" gradient for your activation function.

Perhaps something like:

@tf.custom_gradient
def binary_activation(x):
    ones = tf.ones(tf.shape(x), dtype=x.dtype.base_dtype)
    zeros = tf.zeros(tf.shape(x), dtype=x.dtype.base_dtype)

    def grad(dy):
        return ...  # TODO define gradient

    return keras.backend.switch(x > 0.5, ones, zeros), grad
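
As a minimal sketch of what the gradient could be, assuming a straight-through estimator is acceptable for your problem (the backward pass treats the step as the identity and simply passes the upstream gradient through), it might look like this:

import tensorflow as tf
from tensorflow import keras

@tf.custom_gradient
def binary_activation(x):
    ones = tf.ones(tf.shape(x), dtype=x.dtype.base_dtype)
    zeros = tf.zeros(tf.shape(x), dtype=x.dtype.base_dtype)

    def grad(dy):
        # Straight-through estimator: pass the upstream gradient through
        # unchanged, as if the step were the identity on the backward pass.
        return dy

    # Forward pass: hard 0/1 step at the 0.5 threshold
    return keras.backend.switch(x > 0.5, ones, zeros), grad

With this, the forward pass still produces strictly binary codes, while training uses the surrogate gradient; other surrogates (e.g. the derivative of a sigmoid centered at 0.5) could be plugged into grad instead.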
