TensorFlow2 - Model subclassing ValueError

Question

I am experimenting with creating a LeNet-300-100 dense neural network using TensorFlow 2's model subclassing. The code I have is as follows:

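import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import Dense, Flatten

batch_size = 32
num_epochs = 20
num_classes = 10  # MNIST has 10 digit classes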


# Load MNIST dataset-
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.mnist.load_data()

X_train = X_train.astype('float32') / 255.0
X_test = X_test.astype('float32') / 255.0

# Convert class vectors/target to binary class matrices or one-hot encoded values-
y_train = tf.keras.utils.to_categorical(y_train, num_classes)
y_test = tf.keras.utils.to_categorical(y_test, num_classes)

X_train.shape, y_train.shape
# ((60000, 28, 28), (60000, 10))

X_test.shape, y_test.shape
# ((10000, 28, 28), (10000, 10)) 




class LeNet300(Model):
    def __init__(self, **kwargs):
        super(LeNet300, self).__init__(**kwargs)
        
        self.flatten = Flatten()
        self.dense1 = Dense(units = 300, activation = 'relu')
        self.dense2 = Dense(units = 100, activation = 'relu')
        self.op = Dense(units = 10, activation = 'softmax')

    def call(self, inputs):
        x = self.flatten(inputs)
        x = self.dense1(x)
        x = self.dense2(x)
        return self.op(x)




# Instantiate an object using LeNet-300-100 dense model-
model = LeNet300()

# Compile the defined model-
model.compile(
        optimizer=tf.keras.optimizers.Adam(),
        loss=tf.keras.losses.SparseCategoricalCrossentropy(),
        metrics=['accuracy']
        )


# Define early stopping callback-
early_stopping_callback = tf.keras.callbacks.EarlyStopping(
        monitor = 'val_loss', min_delta = 0.001,
        patience = 3)

# Train defined and compiled model-
history = model.fit(
    x = X_train, y = y_train,
    batch_size = batch_size, shuffle = True,
    epochs = num_epochs,
    callbacks = [early_stopping_callback],
    validation_data = (X_test, y_test)
    )

Calling model.fit() gives the following error:

ValueError: Shape mismatch: The shape of labels (received (320,)) should equal the shape of logits except for the last dimension (received (32, 10)).

What's going wrong?

Thanks

Answer 1

The SparseCategoricalCrossentropy loss does not accept one-hot encoded labels. The documentation says:

Use this crossentropy loss function when there are two or more label classes. We expect labels to be provided as integers. If you want to provide labels using one-hot representation, please use CategoricalCrossentropy loss. There should be # classes floating point values per feature for y_pred and a single floating point value per feature for y_true.

That is why you are getting the error. If you look at the stack trace, the error arises in the loss function:

    /home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/keras/losses.py:1569 sparse_categorical_crossentropy
        y_true, y_pred, from_logits=from_logits, axis=axis)
    /home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py:201 wrapper
        return target(*args, **kwargs)
    /home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/keras/backend.py:4941 sparse_categorical_crossentropy
        labels=target, logits=output)
    /home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py:201 wrapper
        return target(*args, **kwargs)
    /home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py:4241 sparse_softmax_cross_entropy_with_logits_v2
        labels=labels, logits=logits, name=name)
    /home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py:201 wrapper
        return target(*args, **kwargs)
    /home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py:4156 sparse_softmax_cross_entropy_with_logits
        logits.get_shape()))

    ValueError: Shape mismatch: The shape of labels (received (320,)) should equal the shape of logits except for the last dimension (received (32, 10)).
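
One plausible reading of the "(320,)" in the message is that the one-hot label batch of shape (32, 10) gets flattened before it is compared with the logits, since 32 * 10 = 320. The answer does not state this, so treat the following NumPy sketch as an illustration of the shape arithmetic only, not of TensorFlow's internals. The label values are made up for the example.

```python
import numpy as np

batch_size, num_classes = 32, 10

# Integer class labels: the format SparseCategoricalCrossentropy expects.
y_int = np.arange(batch_size) % num_classes              # shape (32,)

# One-hot labels, like the output of to_categorical in the question.
y_onehot = np.eye(num_classes, dtype="float32")[y_int]   # shape (32, 10)

# Flattening the one-hot batch gives 32 * 10 = 320 entries,
# which matches the label shape reported in the error.
flattened = y_onehot.reshape(-1)

# argmax along the class axis recovers the integer labels.
recovered = y_onehot.argmax(axis=1)
```

If the labels are already one-hot encoded, `argmax(axis=1)` is one way to get back to the integer form.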

I would suggest using CategoricalCrossentropy instead, since your labels are one-hot encoded.
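
A minimal sketch of the two consistent pairings of labels and loss, assuming the model from the question. To keep it fast and offline it uses a small random array as a stand-in for MNIST (an assumption made for this example, not part of the original code), so the resulting accuracy is meaningless.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import Dense, Flatten


class LeNet300(Model):
    """LeNet-300-100 dense network, as defined in the question."""

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.flatten = Flatten()
        self.dense1 = Dense(units=300, activation="relu")
        self.dense2 = Dense(units=100, activation="relu")
        self.op = Dense(units=10, activation="softmax")

    def call(self, inputs):
        x = self.flatten(inputs)
        x = self.dense1(x)
        x = self.dense2(x)
        return self.op(x)


# Random stand-in data with MNIST-like shapes (assumption for this sketch).
rng = np.random.default_rng(0)
X = rng.random((64, 28, 28), dtype=np.float32)
y_int = rng.integers(0, 10, size=64)                              # integer labels
y_onehot = tf.keras.utils.to_categorical(y_int, num_classes=10)  # one-hot labels

# Option 1: keep the one-hot labels and switch to CategoricalCrossentropy.
model_a = LeNet300()
model_a.compile(
    optimizer="adam",
    loss=tf.keras.losses.CategoricalCrossentropy(),
    metrics=["accuracy"],
)
hist_a = model_a.fit(X, y_onehot, batch_size=32, epochs=1, verbose=0)

# Option 2: keep SparseCategoricalCrossentropy and pass integer labels,
# i.e. skip the to_categorical step.
model_b = LeNet300()
model_b.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
    metrics=["accuracy"],
)
hist_b = model_b.fit(X, y_int, batch_size=32, epochs=1, verbose=0)
```

Either option should avoid the shape mismatch. Which one to pick mostly depends on whether the one-hot labels are needed elsewhere in the pipeline.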

Answer 2

This is because the input to the first Dense layer should be flattened. MNIST data has a 28x28 grid/image for every digit. This 28x28 data should be flattened to 784 input numbers.

So just before the first Dense(...) layer, insert a Keras Flatten() layer, i.e. do Flatten()(inputs).

See the Flatten layer documentation for reference.
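
As a small sketch of what Flatten does to a batch of MNIST-shaped images (the all-zeros array is a placeholder for this example). Note that the model in the question already applies self.flatten as the first step of call().

```python
import numpy as np
import tensorflow as tf

# Placeholder batch of 32 grayscale images, 28x28 each.
images = np.zeros((32, 28, 28), dtype="float32")

# Flatten keeps the batch axis and collapses the rest: 28 * 28 = 784.
flat = tf.keras.layers.Flatten()(images)
flat_shape = tuple(flat.shape)
```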

2 answers

Answer 1: 1, accepted, 2020-12-18 15:38:18

Answer 2: 0, 2020-12-18 12:42:24
