
SparseCategoricalCrossentropy Shape Mismatch

I want to do a simple test of the SparseCategoricalCrossentropy function, to see what exactly it does to an output. For that I use the output of the last layer of a MobileNetV2.

    import keras.backend as K
    import numpy as np
    import tensorflow as tf

    full_model = tf.keras.applications.MobileNetV2(
    input_shape=(224,224,3),
    alpha=1.0,
    include_top=True,
    weights="imagenet",
    input_tensor=None,
    pooling=None,
    classes=1000,
    classifier_activation="softmax",)

    func = K.function(full_model.layers[1].input, full_model.layers[155].output)
    conv_output = func([processed_image])
    y_pred = np.single(conv_output)
    
    y_true = np.zeros(1000).reshape(1,1000)
    y_true[0][282] = 1
    
    scce = tf.keras.losses.SparseCategoricalCrossentropy()
    scce(y_true, y_pred).numpy()

processed_image is a 1x224x224x3 array created previously.

I'm getting the error ValueError: Shape mismatch: The shape of labels (received (1000,)) should equal the shape of logits except for the last dimension (received (1, 1000)).

I tried reshaping the arrays to match the dimensions the error mentioned, but it doesn't seem to work. What shapes does it accept?

SparseCategoricalCrossentropy expects y_true to have shape [batch_size] and y_pred to have shape [batch_size, num_classes]. Furthermore, y_true must contain integer class indices, not one-hot vectors, which is why your one-hot array of shape (1, 1000) triggers the shape mismatch. See the documentation. In your concrete example, you could try something like this:

    import keras.backend as K
    import numpy as np
    import tensorflow as tf

    full_model = tf.keras.applications.MobileNetV2(
        input_shape=(224, 224, 3),
        alpha=1.0,
        include_top=True,
        weights="imagenet",
        input_tensor=None,
        pooling=None,
        classes=1000,
        classifier_activation="softmax",
    )

    batch_size = 1
    # Random stand-in for a preprocessed image batch.
    processed_image = tf.random.uniform(shape=[batch_size, 224, 224, 3])
    func = K.function(full_model.layers[1].input, full_model.layers[155].output)
    conv_output = func([processed_image])
    # np.single casts the layer output to a float32 array.
    y_pred = np.single(conv_output)

    # Generates an integer between 0 and 999 (inclusive) representing a class index.
    # `high` is exclusive in np.random.randint, hence 1000.
    y_true = np.random.randint(low=0, high=1000, size=batch_size)
    # e.g. [984]
    scce = tf.keras.losses.SparseCategoricalCrossentropy()
    scce(y_true, y_pred).numpy()
    # y_pred encodes a probability distribution here; in one run the calculated loss was 10.69202

You can experiment with the batch_size to see how everything works. In the example above, I just used a batch_size of 1.
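
To see what the loss does to an output without loading MobileNetV2, here is a minimal NumPy sketch of the same idea: for each sample, take the predicted probability of the true class and average the negative logs over the batch. The helper name sparse_cce, the eps clipping value, and the uniform prediction are assumptions made for illustration, not the Keras implementation. It also shows how the one-hot y_true from the question can be turned into integer labels with argmax.

```python
import numpy as np

def sparse_cce(y_true, y_pred, eps=1e-7):
    """Mean of -log(p[true class]) over the batch; y_pred holds probabilities.

    Hypothetical helper for illustration only, not the Keras implementation.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    # Shape contract: labels are (batch_size,), predictions (batch_size, num_classes).
    if y_true.shape != y_pred.shape[:-1]:
        raise ValueError(f"labels {y_true.shape} do not match predictions {y_pred.shape}")
    # Pick the predicted probability of the true class for every sample.
    picked = y_pred[np.arange(y_true.shape[0]), y_true]
    # Clip to avoid log(0); eps is an assumed value.
    return float(np.mean(-np.log(np.clip(picked, eps, 1.0))))

# One-hot labels as in the question: shape (1, 1000), class 282 set to 1.
one_hot = np.zeros((1, 1000))
one_hot[0, 282] = 1

# Convert to integer class indices of shape (1,).
sparse_labels = np.argmax(one_hot, axis=-1)

# Made-up uniform prediction over 1000 classes, shape (1, 1000).
uniform_pred = np.full((1, 1000), 1.0 / 1000)

# For a uniform prediction the loss equals log(1000).
loss = sparse_cce(sparse_labels, uniform_pred)
```

Passing one_hot directly to sparse_cce raises a ValueError, mirroring the shape mismatch from the question. If you would rather keep the one-hot labels, tf.keras.losses.CategoricalCrossentropy is the variant that accepts them.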


 