
Why does training a neural network with the binary cross-entropy loss function get stuck when we use real-valued training targets?

Assume that we have a binary classification problem in which the training targets are not in {0,1} but in [0,1]. We use the following code to train a simple classifier in Keras:

from keras.models import Sequential
from keras.layers import Dense

# One hidden layer, sigmoid output in (0, 1); X and y are the training data
model = Sequential()
model.add(Dense(100, input_shape=(X.shape[1],), activation='relu'))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy', optimizer='rmsprop')
model.fit(X, y)

If we pass the real-valued training targets (in [0,1]), the training hardly proceeds and gets stuck around its initial loss value; but if we quantize the targets into {0,1}, it performs much better, with the training loss decreasing rapidly.

Is this a normal phenomenon? What is the reason for it?

Edit: Here is the reproducible experiment. And this is the obtained plot:

[Plot: training loss per epoch for real-valued vs. binarized targets]
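The linked experiment is not reproduced here; below is a minimal hypothetical sketch along the same lines. The synthetic data, the helper make_model, the names y_soft / y_hard, and the epoch count are illustrative assumptions, not taken from the original post: the same model is trained once on soft targets in [0,1] and once on the same targets binarized at 0.5.

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))                       # synthetic features
logits = X @ rng.normal(size=20) / np.sqrt(X.shape[1])  # ~unit-variance scores
y_soft = 1.0 / (1.0 + np.exp(-logits))                # soft targets spread over (0, 1)
y_hard = (y_soft > 0.5).astype('float32')             # binarized targets in {0, 1}

def make_model():
    # Same architecture as in the question
    model = Sequential()
    model.add(Dense(100, input_shape=(X.shape[1],), activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='rmsprop')
    return model

hist_soft = make_model().fit(X, y_soft, epochs=20, verbose=0)
hist_hard = make_model().fit(X, y_hard, epochs=20, verbose=0)

# With soft targets the loss cannot fall below the average entropy of the
# targets, so the curve barely moves; with hard targets it can keep
# decreasing toward zero.
print('final loss, soft targets:', hist_soft.history['loss'][-1])
print('final loss, hard targets:', hist_hard.history['loss'][-1])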

You state that you want to solve a binary classification task, for which the targets should be binary-valued, i.e. in {0,1}.

However, if your target is instead some float value in [0,1], you are actually trying to perform regression.

This, among other things, changes the requirements for your loss function. See Tensorflow Cross Entropy for Regression?, where the usage of cross-entropy loss for regression is discussed in more detail.
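One way to make this concrete (a short derivation, not part of the original answer): for a fixed soft target y, the binary cross-entropy of a prediction p is minimized at p = y, but the minimum value is the binary entropy of y, which is strictly positive whenever y is strictly between 0 and 1:

\[
L(y, p) = -y \log p - (1 - y)\log(1 - p),
\qquad
\frac{\partial L}{\partial p} = -\frac{y}{p} + \frac{1 - y}{1 - p} = 0
\;\Rightarrow\; p = y,
\]
\[
\min_p L(y, p) = -y \log y - (1 - y)\log(1 - y) = H(y) > 0
\quad \text{for } y \in (0, 1).
\]

So with soft targets the loss converges to the average entropy of the targets rather than to 0; for targets uniform on [0,1] this floor is \(\int_0^1 H(y)\,dy = 0.5\) nats, not far below the initial loss of log 2 ≈ 0.693, which is why the curve looks flat even though the model may still be learning.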
