
Implementing Binary Cross Entropy loss gives different answer than Tensorflow's

I am implementing the binary cross-entropy loss function in raw Python, but it gives me a very different answer than TensorFlow's. This is the answer I got from TensorFlow:

import numpy as np
from tensorflow.keras.losses import BinaryCrossentropy

y_true = np.array([1., 1., 1.])
y_pred = np.array([1., 1., 0.])
bce = BinaryCrossentropy()
loss = bce(y_true, y_pred)
print(loss.numpy())

Output:

>>> 5.1416497230529785

From my knowledge, the formula for binary cross-entropy is this:

Loss = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log(\hat{y}_i) + (1 - y_i) \log(1 - \hat{y}_i) \right]

I implemented the same thing in raw Python as follows:

def BinaryCrossEntropy(y_true, y_pred):
    m = y_true.shape[1]
    y_pred = np.clip(y_pred, 1e-7, 1 - 1e-7)
    # Calculating loss
    loss = -1/m * (np.dot(y_true.T, np.log(y_pred)) + np.dot((1 - y_true).T, np.log(1 - y_pred)))

    return loss

print(BinaryCrossEntropy(np.array([1, 1, 1]).reshape(-1, 1), np.array([1, 1, 0]).reshape(-1, 1)))

But from this function, I get the loss value:

>>> [[16.11809585]]

How can I get the right answer?

In the constructor of tf.keras.losses.BinaryCrossentropy(), you'll notice:

tf.keras.losses.BinaryCrossentropy(
    from_logits=False, label_smoothing=0, reduction=losses_utils.ReductionV2.AUTO,
    name='binary_crossentropy'
)

The default argument reduction will most probably have the value Reduction.SUM_OVER_BATCH_SIZE, as mentioned here. Assume that the shape of our model's output is [1, 3]. That means our batch size is 1 and the output dimension is 3 (this does not imply that there are 3 classes). We need to compute the mean over the 0th axis, i.e. the batch dimension.

I'll make it clear with the code:

import tensorflow as tf
import numpy as np

y_true = np.array( [1., 1., 1.] ).reshape( 1 , 3 )
y_pred = np.array( [1., 1., 0.] ).reshape( 1 , 3 )

bce = tf.keras.losses.BinaryCrossentropy( from_logits=False , reduction=tf.keras.losses.Reduction.SUM_OVER_BATCH_SIZE )
loss = bce( y_true, y_pred )

print(loss.numpy())

The output is:

5.1416497230529785
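To make the role of reduction concrete, here is a hedged sketch (the values in the comments are approximate) that reshapes the same data into a batch of 3 single-output samples and compares the reduction modes:

import tensorflow as tf
import numpy as np

y_true = np.array([1., 1., 1.]).reshape(3, 1)   # batch of 3 samples, 1 output each
y_pred = np.array([1., 1., 0.]).reshape(3, 1)

# No reduction: one loss value per sample, roughly [0, 0, 15.42]
bce_none = tf.keras.losses.BinaryCrossentropy(reduction=tf.keras.losses.Reduction.NONE)
print(bce_none(y_true, y_pred).numpy())

# Sum over the batch: roughly 15.42
bce_sum = tf.keras.losses.BinaryCrossentropy(reduction=tf.keras.losses.Reduction.SUM)
print(bce_sum(y_true, y_pred).numpy())

# Default SUM_OVER_BATCH_SIZE: the sum divided by the batch size, roughly 5.14
bce_mean = tf.keras.losses.BinaryCrossentropy()
print(bce_mean(y_true, y_pred).numpy())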

The expression for binary cross-entropy is the same as the one mentioned in the question. N refers to the batch size.

We now implement BCE on our own. First, we clip the outputs of our model, setting the minimum to tf.keras.backend.epsilon() and the maximum to 1 - tf.keras.backend.epsilon(). The value of tf.keras.backend.epsilon() is 1e-7.

y_pred = np.clip( y_pred , tf.keras.backend.epsilon() , 1 - tf.keras.backend.epsilon() )

Using the expression for BCE:

p1 = y_true * np.log( y_pred + tf.keras.backend.epsilon() )
p2 = ( 1 - y_true ) * np.log( 1 - y_pred + tf.keras.backend.epsilon() )

print( p1 )
print( p2 )

The output:

[[  0.           0.         -15.42494847]]
[[-0. -0.  0.]]

Notice that the shapes are still preserved. An np.dot would instead collapse these per-element terms into a single summed value of shape [1, 1] (as in your implementation).
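A minimal sketch of that shape difference, using the same (3, 1) column vectors as in the question:

import numpy as np

y_true = np.array([1., 1., 1.]).reshape(-1, 1)                              # shape (3, 1)
y_pred = np.clip(np.array([1., 1., 0.]).reshape(-1, 1), 1e-7, 1 - 1e-7)     # shape (3, 1)

elementwise = y_true * np.log(y_pred)        # shape (3, 1): one term per sample
summed = np.dot(y_true.T, np.log(y_pred))    # shape (1, 1): the terms are already summed up

print(elementwise.shape, summed.shape)       # (3, 1) (1, 1)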

Finally, we add them and compute their mean over the batch dimension using np.mean():

o  = -np.mean( p1 + p2 )
print( o )

The output is:

5.141649490132791

You can check the problem in your implementation by printing the shape of each of the terms.
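For example, a quick diagnostic sketch of the implementation from the question shows where it goes wrong: with inputs of shape (3, 1), m = y_true.shape[1] is 1, so the division never averages over the 3 samples and you get the sum instead of the mean.

import numpy as np

y_true = np.array([1, 1, 1]).reshape(-1, 1)
y_pred = np.clip(np.array([1, 1, 0]).reshape(-1, 1), 1e-7, 1 - 1e-7)

m = y_true.shape[1]
term_1 = np.dot(y_true.T, np.log(y_pred))            # shape (1, 1)
term_0 = np.dot((1 - y_true).T, np.log(1 - y_pred))  # shape (1, 1)

print(m)                                # 1, although there are 3 samples
print(term_1.shape, term_0.shape)       # (1, 1) (1, 1)
print(-1 / m * (term_1 + term_0))       # the un-averaged sum, about 16.12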

There's an issue with your implementation. Here is a correct one with numpy:

import numpy as np

def BinaryCrossEntropy(y_true, y_pred):
    y_pred = np.clip(y_pred, 1e-7, 1 - 1e-7)
    term_0 = (1-y_true) * np.log(1-y_pred + 1e-7)
    term_1 = y_true * np.log(y_pred + 1e-7)
    return -np.mean(term_0+term_1, axis=0)

print(BinaryCrossEntropy(np.array([1, 1, 1]).reshape(-1, 1), 
                         np.array([1, 1, 0]).reshape(-1, 1)))
[5.14164949]

Note: during tf.keras model training, it's better to use the keras backend functionality. You can implement it in the same way using the keras backend utilities.

import numpy as np
from tensorflow.keras import backend as K

def BinaryCrossEntropy(y_true, y_pred): 
    y_pred = K.clip(y_pred, K.epsilon(), 1 - K.epsilon())
    term_0 = (1 - y_true) * K.log(1 - y_pred + K.epsilon())  
    term_1 = y_true * K.log(y_pred + K.epsilon())
    return -K.mean(term_0 + term_1, axis=0)

print(BinaryCrossEntropy(
    np.array([1., 1., 1.]).reshape(-1, 1), 
    np.array([1., 1., 0.]).reshape(-1, 1)
    ).numpy())
[5.14164949]
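Since the backend version returns a tensor, it can also be passed directly to model.compile as a custom loss. Below is a minimal sketch, assuming a hypothetical single-output model (Keras applies its own reduction on top of the returned values):

from tensorflow import keras
from tensorflow.keras import backend as K

def BinaryCrossEntropy(y_true, y_pred):
    # same backend implementation as above
    y_pred = K.clip(y_pred, K.epsilon(), 1 - K.epsilon())
    term_0 = (1 - y_true) * K.log(1 - y_pred + K.epsilon())
    term_1 = y_true * K.log(y_pred + K.epsilon())
    return -K.mean(term_0 + term_1, axis=0)

# Hypothetical model with a single sigmoid output
model = keras.Sequential([
    keras.layers.Dense(16, activation='relu', input_shape=(10,)),
    keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss=BinaryCrossEntropy, metrics=['accuracy'])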
