
Why does training a neural network with the binary cross-entropy loss function get stuck when we use real-valued training targets?

Assume that we have a binary classification problem in which the training targets are not in {0,1} but in [0,1]. We use the following code to train a simple classifier in Keras:

from keras.models import Sequential
from keras.layers import Dense

# One hidden layer, sigmoid output in (0, 1); X and y are the training data
model = Sequential()
model.add(Dense(100, input_shape=(X.shape[1],), activation='relu'))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy', optimizer='rmsprop')
model.fit(X, y)

If we pass the real-valued training targets (in [0,1]), the training hardly proceeds and gets stuck around its initial loss value; but if we quantize the targets into {0,1}, it performs much better, with the training loss decreasing rapidly.

Is this a normal phenomenon? What is the reason for it?

Edit: Here is the reproducible experiment. And this is the obtained plot:

[Plot: training loss per epoch for real-valued vs. binarized targets]
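The linked experiment is not reproduced here; below is a minimal hypothetical sketch along the same lines. The synthetic data, the helper make_model, the names y_soft / y_hard, and the epoch count are illustrative assumptions, not taken from the original post: the same model is trained once on soft targets in [0,1] and once on the same targets binarized at 0.5.

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))                       # synthetic features
logits = X @ rng.normal(size=20) / np.sqrt(X.shape[1])  # ~unit-variance scores
y_soft = 1.0 / (1.0 + np.exp(-logits))                # soft targets spread over (0, 1)
y_hard = (y_soft > 0.5).astype('float32')             # binarized targets in {0, 1}

def make_model():
    # Same architecture as in the question
    model = Sequential()
    model.add(Dense(100, input_shape=(X.shape[1],), activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='rmsprop')
    return model

hist_soft = make_model().fit(X, y_soft, epochs=20, verbose=0)
hist_hard = make_model().fit(X, y_hard, epochs=20, verbose=0)

# With soft targets the loss cannot fall below the average entropy of the
# targets, so the curve barely moves; with hard targets it can keep
# decreasing toward zero.
print('final loss, soft targets:', hist_soft.history['loss'][-1])
print('final loss, hard targets:', hist_hard.history['loss'][-1])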

You state that you want to solve a binary classification task, for which the targets should be binary-valued, i.e. in {0,1}.

However, if your target is instead some float value in [0,1], you are actually trying to perform regression.

This, among other things, changes the requirements for your loss function. See Tensorflow Cross Entropy for Regression?, where the usage of cross-entropy loss for regression is discussed in more detail.
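One way to make this concrete (a short derivation, not part of the original answer): for a fixed soft target y, the binary cross-entropy of a prediction p is minimized at p = y, but the minimum value is the binary entropy of y, which is strictly positive whenever y is strictly between 0 and 1:

\[
L(y, p) = -y \log p - (1 - y)\log(1 - p),
\qquad
\frac{\partial L}{\partial p} = -\frac{y}{p} + \frac{1 - y}{1 - p} = 0
\;\Rightarrow\; p = y,
\]
\[
\min_p L(y, p) = -y \log y - (1 - y)\log(1 - y) = H(y) > 0
\quad \text{for } y \in (0, 1).
\]

So with soft targets the loss converges to the average entropy of the targets rather than to 0; for targets uniform on [0,1] this floor is \(\int_0^1 H(y)\,dy = 0.5\) nats, not far below the initial loss of log 2 ≈ 0.693, which is why the curve looks flat even though the model may still be learning.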
