
How to use a loss function that is not differentiable?

I am trying to find a codebook at the output of a fully connected neural network: the network should choose points such that the minimum pairwise distance (Euclidean norm) between the codewords it produces is maximized. The input to the neural network is the set of points that need to be mapped into the higher-dimensional output space.

For instance, if the input dimension is 2 and the output dimension is 3, the following mapping (and any of its permutations) works best: 00 - 000, 01 - 011, 10 - 101, 11 - 110
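
To make the objective concrete, here is an illustrative NumPy sketch (not part of the training graph) that computes the minimum pairwise Euclidean distance of that 3-bit codebook; the array below is just the example mapping written out:

import itertools
import numpy as np

# Example codebook from the mapping above: 00 - 000, 01 - 011, 10 - 101, 11 - 110
codebook = np.array([[0, 0, 0],
                     [0, 1, 1],
                     [1, 0, 1],
                     [1, 1, 0]], dtype=float)

# Minimum Euclidean distance over all pairs of distinct codewords
min_dist = min(np.linalg.norm(a - b)
               for a, b in itertools.combinations(codebook, 2))
print(min_dist)  # sqrt(2) ~= 1.414 for this codebook

My current TensorFlow attempt is below.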

import tensorflow as tf
import numpy as np
import itertools


input_bits = tf.placeholder(dtype=tf.float32, shape=[None, 2], name='input_bits')
code_out = tf.placeholder(dtype=tf.float32, shape=[None, 3], name='code_out')
np.random.seed(1331)


def find_code(message):
    weight1 = np.random.normal(loc=0.0, scale=0.01, size=[2, 3])
    init1 = tf.constant_initializer(weight1)
    out = tf.layers.dense(inputs=message, units=3, activation=tf.nn.sigmoid, kernel_initializer=init1)
    return out


code = find_code(input_bits)

distances = []
for i in range(0, 3):
    for j in range(i+1, 3):
        distances.append(tf.linalg.norm(code_out[i]-code_out[j]))
min_dist = tf.reduce_min(distances)
# avg_dist = tf.reduce_mean(distances)

loss = -min_dist

opt = tf.train.AdamOptimizer().minimize(loss)

init_variables = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init_variables)

saver = tf.train.Saver()

count = int(1e4)

for i in range(count):
    input_bit = [list(k) for k in itertools.product([0, 1], repeat=2)]
    code_preview = sess.run(code, feed_dict={input_bits: input_bit})
    sess.run(opt, feed_dict={input_bits: input_bit, code_out: code_preview})

Since the loss function itself is not differentiable, I am getting the error

ValueError: No gradients provided for any variable, check your graph for ops that do not support gradients, between variables 

Am I doing something silly or is there a way to circumvent this? Any help in this regard is appreciated. Thanks in advance.

Your loss function has to be differentiable with respect to some parameters. In your case, there are no parameters involved, so you would be computing the derivative of a constant function, which is 0. Moreover, in your code you have the following line:

code = find_code(input_bits)

which is not used any further. Based on the code, I assume that you want to change this line:

distances.append(tf.linalg.norm(code_out[i]-code_out[j]))

to:

distances.append(tf.linalg.norm(code[i]-code_out[j]))

Therefore, you would actually be using the tf.layers.dense layer that you have, and the loss would include parameters with respect to which its gradient can be computed.
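
A minimal sketch of that change, keeping the rest of the question's graph as-is (the only difference is that code, the output of tf.layers.dense, now appears in the distance computation, so the loss depends on the layer's trainable variables):

# Distances now involve `code`, the dense layer's output, so gradients
# can flow back to the layer's kernel and bias.
distances = []
for i in range(0, 3):
    for j in range(i + 1, 3):
        distances.append(tf.linalg.norm(code[i] - code_out[j]))
min_dist = tf.reduce_min(distances)
loss = -min_dist
opt = tf.train.AdamOptimizer().minimize(loss)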


Moreover, you don't need to worry about whether a TF operation is differentiable or not. In fact, all TF ops are differentiable. When it comes to tf.reduce_min(), please check this link.
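
As a quick check of that last point, here is a small sketch (assuming the same TF 1.x API as in the question) showing that tf.reduce_min does provide a gradient; the gradient is simply routed to the element that attains the minimum:

x = tf.constant([3.0, 1.0, 2.0])
grad = tf.gradients(tf.reduce_min(x), x)[0]
with tf.Session() as s:
    print(s.run(grad))  # [0. 1. 0.] -- only the minimum element receives gradient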
