
Significance of loss in classification with Keras

I am training a three-layer neural network with Keras:

    from tensorflow.keras import models, layers
    from tensorflow.keras.layers import (Activation, BatchNormalization, Conv2D,
                                         Dense, Dropout)
    from tensorflow.keras.regularizers import l2

    model = models.Sequential()
    model.add(Conv2D(32, (3, 3), padding="same",
                     input_shape=input_shape, strides=2, kernel_regularizer=l2(reg)))

    model.add(BatchNormalization(axis=channels))
    model.add(Activation("relu"))
    model.add(Conv2D(64, (3, 3), padding="same",
                     input_shape=input_shape, strides=2, kernel_regularizer=l2(reg)))

    model.add(BatchNormalization(axis=channels))
    model.add(Activation("relu"))
    model.add(Conv2D(128, (3, 3), padding="same",
                     input_shape=input_shape, strides=2, kernel_regularizer=l2(reg)))

    model.add(BatchNormalization(axis=channels))
    model.add(Activation("relu"))
    model.add(layers.Flatten())
    model.add(layers.Dense(neurons, activation='relu', kernel_regularizer=l2(reg)))
    model.add(Dropout(0.50))
    model.add(Dense(2))
    model.add(Activation("softmax"))

My data has two classes, and I am using sparse categorical cross entropy:

    model.compile(loss='sparse_categorical_crossentropy', optimizer=opt, metrics=['accuracy'])
    history = model.fit(x=X, y=y, batch_size=batch_size, epochs=epochs,
                        validation_data=(X_val, y_val),
                        shuffle=True,
                        callbacks=callbacks,
                        verbose=1)

My data has the following shape:

X: (232, 100, 150, 3)
y: (232,)

where X contains the images and each y is either 0 or 1, as an integer label, since I am using the sparse loss function.

The loss is very high for both training and validation. Even when the training accuracy reaches 1, I get loss values over 20, which I understand are not reasonable.

If I train the model for a few epochs, output the predicted and true labels, and compute the categorical cross entropy from them myself, the value I get is < 1, as expected. This holds even when I do the calculation with Keras' own function (I switch to the categorical variant because the sparse one raises an error):

21/21 [==============================] - 7s 313ms/step - loss: 44.1764 - acc: 1.0000 - val_loss: 44.7084 - val_acc: 0.7857 

    cce = tf.keras.losses.CategoricalCrossentropy()
    pred = model.predict(x=X_val, batch_size=len(X_val))
    loss = cce(true_categorical, pred)
    # prints: Categorical loss 0.6077293753623962
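The hand calculation can be reproduced without Keras at all. Here is a small numpy sketch (with made-up probabilities, not the real predictions) showing that the sparse and one-hot forms of the crossentropy are the same quantity, so switching variants for the manual check is harmless:

```python
import numpy as np

# Made-up integer labels and predicted class probabilities for 4 samples.
y_true = np.array([0, 1, 1, 0])
probs = np.array([[0.9, 0.1],
                  [0.2, 0.8],
                  [0.4, 0.6],
                  [0.7, 0.3]])

# Sparse form: index the probability of the true class directly
# with the integer labels, then average -log(p).
sparse_ce = -np.mean(np.log(probs[np.arange(len(y_true)), y_true]))

# Categorical form: one-hot encode the labels first; the dot product
# with log-probabilities picks out the same entries.
one_hot = np.eye(2)[y_true]
categorical_ce = -np.mean(np.sum(one_hot * np.log(probs), axis=1))

# Both give the identical value (~0.299 for these numbers).
```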

Is there a way to know exactly how this loss is calculated, and why the values are so high? The batch size is 8.

The loss printed by Keras is the total loss. Regularization is an additional loss term, computed from the values of the weights and added to the model's loss.

Since you have a lot of weights, you also have a lot of contributions to the total loss.

That is why it's big. If you remove the regularization, you will see the final loss equal to the categorical crossentropy loss.
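A rough back-of-the-envelope sketch of this effect, using hypothetical weight values and shapes loosely matching the model above (not the asker's trained weights):

```python
import numpy as np

rng = np.random.default_rng(0)
reg = 0.01  # assumed value of `reg` in the question

# Hypothetical kernels: three conv layers plus the dense layer.
# The dense shape (flattened features x neurons) is made up.
kernels = [
    rng.normal(0.0, 0.1, size=(3, 3, 3, 32)),
    rng.normal(0.0, 0.1, size=(3, 3, 32, 64)),
    rng.normal(0.0, 0.1, size=(3, 3, 64, 128)),
    rng.normal(0.0, 0.1, size=(2400, 256)),
]

# Keras' l2(reg) contributes reg * sum(w**2) for each regularized layer.
l2_penalty = sum(reg * np.sum(w ** 2) for w in kernels)

data_loss = 0.61  # roughly the crossentropy the asker computed by hand
total_loss = data_loss + l2_penalty
# With hundreds of thousands of parameters, even modest weights make the
# penalty dwarf the crossentropy term, so the printed loss reaches tens.
```

In Keras you can inspect this directly: the per-layer regularization terms are collected in `model.losses`, so summing them shows how much of the printed loss is the penalty rather than the crossentropy.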
