
Why does training a neural network with the binary cross-entropy loss function get stuck when we use real-valued training targets?

Assume that we have a binary classification problem, in which the training targets are not in {0,1} but in [0,1]. We use the following code to train a simple classifier in Keras:

from keras.models import Sequential
from keras.layers import Dense

# Simple binary classifier: one hidden ReLU layer, sigmoid output
model = Sequential()
model.add(Dense(100, input_shape=(X.shape[1],), activation='relu'))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy', optimizer='rmsprop')
model.fit(X, y)

If we pass the real-valued training targets (in [0,1]), the training hardly proceeds and gets stuck around its initial loss value; but if we quantize the targets to {0,1}, the model performs much better and the training loss decreases rapidly.

Is this a normal phenomenon? What is the reason for it?

Edit: Here is the reproducible experiment, and this is the resulting plot:

[image: plot of the training loss]
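For readers who cannot follow the link, below is a minimal sketch of such a comparison. This is not the linked experiment: the synthetic data from make_classification, the way the soft targets are produced, and the epoch count are all assumptions made for illustration.

# Hypothetical reproduction sketch, not the linked experiment:
# train the same model on soft targets in [0,1] and on their {0,1} quantization.
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from sklearn.datasets import make_classification

X, y_hard = make_classification(n_samples=5000, n_features=20, random_state=0)
# Assumed soft targets: hard labels blurred with Gaussian noise, clipped to [0,1]
y_soft = np.clip(y_hard + np.random.normal(0, 0.3, size=y_hard.shape), 0, 1)

def build():
    m = Sequential()
    m.add(Dense(100, input_shape=(X.shape[1],), activation='relu'))
    m.add(Dense(1, activation='sigmoid'))
    m.compile(loss='binary_crossentropy', optimizer='rmsprop')
    return m

# Same architecture, soft targets vs. quantized targets
hist_soft = build().fit(X, y_soft, epochs=20, verbose=0)
hist_hard = build().fit(X, (y_soft >= 0.5).astype('float32'), epochs=20, verbose=0)
print(hist_soft.history['loss'][-1], hist_hard.history['loss'][-1])

If the behaviour described in the question holds, the second run's final loss drops well below the first, which settles near a nonzero plateau.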

You state that you want to solve a binary classification task, for which the target should be binary-valued, i.e. in {0,1}.

However, if your target is instead some float value in [0,1], you are actually trying to perform regression.

This, amongst other things, changes the requirements for your loss function. See Tensorflow Cross Entropy for Regression?, where the use of cross-entropy loss for regression is discussed in more detail.
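A concrete way to see why the loss curve can look stuck with soft targets (an illustration added here, not part of the original answer): for a target y strictly between 0 and 1, the binary cross-entropy -y*log(p) - (1-y)*log(1-p) is minimized at p = y, and its minimum value is the entropy of y, which is strictly positive. The loss therefore has a floor well above zero, unlike the hard-target case.

# Illustration (assumption, not the asker's code): binary cross-entropy with a
# soft target has a nonzero floor equal to the target's entropy.
import numpy as np

def bce(y, p):
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

y = 0.7                             # soft target
p = np.linspace(0.01, 0.99, 99)     # grid of candidate predictions
print(p[np.argmin(bce(y, p))])      # ~0.7: the best possible prediction is p = y
print(bce(y, y))                    # ~0.611: the floor, i.e. the entropy of 0.7
print(bce(1.0, 0.999999))           # ~1e-6: with a hard target the loss can approach 0

So even a model that predicts the soft targets perfectly reports a loss around the average entropy of the targets, which can easily be mistaken for training being stuck.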
