
Tensorflow Custom Cost Function

I have 2 concrete classes, say A and C. I want to use an NN to classify samples into classes A, B, and C, such that samples that are too close to call confidently are simply classed as B. The cost function should be as follows: a misclassification (an A classified as C, or vice versa) will have a very large cost. A correct classification will have zero cost. Classifying an item as B will have a very low cost. The result is that we only distinguish samples that we are VERY SURE fit into their respective classes.

I have only worked through the simple tutorials in TensorFlow, and they didn't cover how to define more specific cost functions such as this one. Can anyone explain how this can be accomplished in TensorFlow?

Here is my relevant code, where I currently classify using only 2 classes. It is straight from the TensorFlow tutorial:

# y holds the network's raw logits, y_ the target labels
cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=y_, logits=y))
train_step = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

y is the output of the NN (it will look like [[1,0,0],[0,1,0]] for a two-sample set with 3 classes), and y_ is the correct classes for the samples, which might be [[1,0,0],[0,0,1]]. In this example, we would have classified the second sample as B because we were uncertain, but the true class was C.
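For concreteness, here is a rough sketch of the kind of cost I have in mind, written in the same TF1 style as above. The penalty values, the 2-feature toy input, and the single linear layer are placeholders rather than my real model; the idea is just to take the expected penalty of the predicted class distribution under the true class's row of a penalty matrix.

import tensorflow as tf

# Hypothetical penalty matrix: rows = true class (A, B, C), columns = predicted class.
# The values are placeholders for "huge cost for A<->C mistakes, zero cost when
# correct, small cost for falling back to B".
penalty = tf.constant([[0.0,  0.1, 10.0],
                       [0.1,  0.0,  0.1],
                       [10.0, 0.1,  0.0]], dtype=tf.float32)

x  = tf.placeholder(tf.float32, [None, 2])   # toy input with 2 features
y_ = tf.placeholder(tf.float32, [None, 3])   # one-hot true labels

W = tf.Variable(tf.zeros([2, 3]))
b = tf.Variable(tf.zeros([3]))
y = tf.matmul(x, W) + b                      # logits, one per class

probs = tf.nn.softmax(y)                     # predicted class probabilities
row_penalties = tf.matmul(y_, penalty)       # penalty row of each sample's true class
cost = tf.reduce_mean(tf.reduce_sum(probs * row_penalties, axis=1))

train_step = tf.train.AdamOptimizer(learning_rate=0.001).minimize(cost)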

I think you have some fundamental misunderstanding of how NN classifiers work. You should probably read up on that a bit if you are going to go deeper into coding them. I highly recommend the online book by Michael Nielsen, Neural Networks and Deep Learning.

That said, the solution you are looking for is not in creating a special cost function, but in how you interpret the results you get from the NN. You do not have 3 classes, you have 2. The "I have no idea what this is" case is not a class by itself, but rather a measure of the NN's confidence in its answer. So your network should have 2 outputs, one for each class, just like in the TensorFlow guides, and you should train it just like in the guides.

Once your network is trained and you feed it a sample to classify, you get 2 numbers; let's call them A' and C'. These numbers indicate the NN's confidence in which class the sample belongs to. For example, if you get A' == 0.999 and C' == 0.00001, the network is pretty damn sure that your sample is class A. If you get A' == 0.6 and C' == 0.59, your network has no idea whether the sample is A or C, but slightly favors the theory that it is class A. It is now up to you to decide what your confidence intervals are. To make this easier, you should probably use softmax for the output-layer non-linearity (the way the TensorFlow MNIST guides do). One of the useful features of softmax is that the outputs for all your classes always sum to 1, so you can easily make decisions based on the difference between A' and C'.
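For example, here is a minimal sketch of that decision step in plain NumPy, applied to whatever softmax probabilities your trained 2-output network produces. The 0.9 threshold is an arbitrary choice; you would pick whatever confidence level suits your problem.

import numpy as np

def decide(probs, threshold=0.9):
    """Map 2-class softmax outputs (columns A', C') to labels A, B or C."""
    labels = []
    for a_conf, c_conf in probs:
        if a_conf >= threshold:
            labels.append('A')      # very sure it is A
        elif c_conf >= threshold:
            labels.append('C')      # very sure it is C
        else:
            labels.append('B')      # not confident enough either way
    return labels

# The first sample is clearly A; the second is too close to call.
print(decide(np.array([[0.999, 0.001],
                       [0.60,  0.40]])))     # -> ['A', 'B']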
