
Use Hamming Distance Loss Function with Tensorflow GradientTape: no gradients. Is it not differentiable?

I'm using Tensorflow 2.1 and Python 3, creating my custom training model following the tutorial "Tensorflow - Custom training: walkthrough".

I'm trying to use Hamming Distance in my loss function:

import tensorflow as tf
import tensorflow_addons as tfa

def my_loss_hamming(model, x, y):
  global output
  output = model(x)

  return tfa.metrics.hamming.hamming_loss_fn(y, output, threshold=0.5, mode='multilabel')


def grad(model, inputs, targets):
  with tf.GradientTape() as tape:
      tape.watch(model.trainable_variables)
      loss_value = my_loss_hamming(model, inputs, targets)

  return loss_value, tape.gradient(loss_value, model.trainable_variables)

When I call it:

loss_value, grads = grad(model, feature, label)
optimizer.apply_gradients(zip(grads, model.trainable_variables))

The grads variable is a list of 38 None values.

And I get the error:

No gradients provided for any variable: ['conv1_1/kernel:0', ...]

Is there any way to use Hamming Distance without "interrupting the gradient chain registered by the gradient tape"?

Apologies if I'm saying something obvious, but the way backpropagation works as a fitting algorithm for neural networks is through gradients - e.g. for each batch of training data you compute how much the loss function will improve/degrade if you move a particular trainable weight by a very small amount delta.
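As a tiny illustration of that idea (a sketch, not part of the original question): the gradient reported by the tape is just the sensitivity of the loss to a small change of the weight, which you can sanity-check with a finite difference on a simple differentiable loss.

import tensorflow as tf

w = tf.Variable(2.0)
delta = 1e-4

def loss_fn(weight):
    return (weight - 1.0) ** 2  # a simple differentiable loss

with tf.GradientTape() as tape:
    loss = loss_fn(w)

# the gradient tells us how the loss reacts to a tiny change of the weight
analytic = tape.gradient(loss, w)
finite_diff = (loss_fn(w + delta) - loss_fn(w - delta)) / (2 * delta)
print(float(analytic), float(finite_diff))  # both close to 2.0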

Hamming loss is by definition not differentiable, so for small movements of trainable weights you will never see any change in the loss. I imagine it was only added to be used for final measurement of a trained model's performance rather than for training.
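A minimal sketch of why the tape returns nothing (assuming tfa.metrics.hamming.hamming_loss_fn behaves as in the question): the thresholding step inside the Hamming loss cuts the gradient chain, so there is nothing to propagate back to the weights.

import tensorflow as tf
import tensorflow_addons as tfa

y_true = tf.constant([[1.0, 0.0, 1.0, 0.0]])
y_pred = tf.Variable([[0.8, 0.3, 0.6, 0.1]])

with tf.GradientTape() as tape:
    # hamming_loss_fn thresholds y_pred (y_pred > 0.5) before comparing with y_true,
    # and that comparison/cast has no defined gradient
    loss = tfa.metrics.hamming.hamming_loss_fn(y_true, y_pred, threshold=0.5, mode='multilabel')

print(tape.gradient(loss, y_pred))  # expected: None - no gradient flows through the threshold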

If you want to train a neural net through backpropagation you need to use some differentiable loss - one that can help the model move its weights in the right direction. Sometimes people use different techniques to smooth such losses as the Hamming loss and create approximations - e.g. here it could be something which penalizes less those predictions which are closer to the target answer, rather than just giving out 1 for everything above the threshold and 0 for everything else.
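For illustration, here is one possible smooth surrogate (the soft_hamming_loss helper below is hypothetical, not part of tensorflow_addons): instead of thresholding the predictions, it measures how far each predicted probability is from the 0/1 target, which is differentiable and approaches the true Hamming loss as the predictions saturate toward 0 or 1. Plain binary cross-entropy (tf.keras.losses.BinaryCrossentropy) would work just as well for multilabel targets.

import tensorflow as tf

def soft_hamming_loss(y_true, y_pred):
    # Mean absolute difference between predicted probabilities and 0/1 targets.
    # No thresholding, so gradients flow back to the model's weights.
    return tf.reduce_mean(tf.abs(tf.cast(y_true, y_pred.dtype) - y_pred))

def my_loss_smooth(model, x, y):
    output = model(x)  # assumes the model outputs probabilities in [0, 1], e.g. via sigmoid
    return soft_hamming_loss(y, output)

def grad(model, inputs, targets):
    with tf.GradientTape() as tape:
        loss_value = my_loss_smooth(model, inputs, targets)
    # grads are now real tensors instead of a list of None
    return loss_value, tape.gradient(loss_value, model.trainable_variables)

The exact Hamming loss can still be tracked as a metric during training; only the loss used for the gradient step needs to be differentiable.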
