Why does model.fit() raise ValueError with tf.train.AdamOptimizer using categorical_crossentropy loss function?
I'm following the TensorFlow basic classification example with the Keras API provided in the "Getting Started" docs. I get through the tutorial as-is just fine, but if I change the loss function from sparse_categorical_crossentropy to categorical_crossentropy, the code below:
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation=tf.nn.relu),
    keras.layers.Dense(10, activation=tf.nn.softmax)
])

model.compile(optimizer=tf.train.AdamOptimizer(),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

model.fit(train_images, train_labels, epochs=5)
fails during the training/fitting step with the following error:
ValueError: Error when checking target: expected dense_1 to have shape (10,) but got array with shape (1,)
The documentation on the loss functions doesn't delve much into expected input and output. Obviously there is a dimensionality issue here, but if any experts can give a detailed explanation: what is it about this loss function, or any other loss function, that raises this ValueError?
sparse_categorical_crossentropy loss expects the provided labels to be integers like 0, 1, 2 and so on, where each integer indicates a particular class. For example, class 0 might be dogs, class 1 might be cats, and class 2 might be lions. On the other hand, categorical_crossentropy loss takes one-hot encoded labels such as [1,0,0], [0,1,0], [0,0,1], which are interpreted such that the index of the 1 indicates the class of the sample. For example, [0,0,1] means this sample belongs to class 2 (i.e. lions). Further, in the context of classification models, since the output is usually a probability distribution produced by a softmax layer, this form of labels also corresponds to a probability distribution and matches the output of the model. Again, [0,0,1] means that with probability one we know this sample belongs to class 2.
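To make the two label formats concrete, here is a minimal NumPy sketch (the class names and label values are illustrative, not from the tutorial) showing how integer labels map to their one-hot equivalents; Keras provides the same conversion as keras.utils.to_categorical:

```python
import numpy as np

# Integer ("sparse") labels: one class index per sample.
# Class 0 = dogs, class 1 = cats, class 2 = lions.
sparse_labels = np.array([0, 2, 1])

# One-hot encode: row i is all zeros except a 1 at index
# sparse_labels[i], giving shape (num_samples, num_classes).
num_classes = 3
one_hot_labels = np.eye(num_classes)[sparse_labels]

print(one_hot_labels)
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]
```

Note that the one-hot form has one column per class, which is exactly why the model's Dense(10) output expects targets of shape (10,) under categorical_crossentropy.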
sparse_categorical_crossentropy is essentially a convenient way to use categorical_crossentropy as the loss function: Keras (or its backend) handles the integer labels internally, so you don't need to manually convert your labels to one-hot encoded form. However, if the labels you provide are already one-hot encoded, then you must use categorical_crossentropy as the loss function.
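To see that the two losses compute the same quantity once the label format matches, here is a hand-rolled NumPy sketch (illustrative numbers, not the actual Keras implementation):

```python
import numpy as np

# Softmax outputs for two samples over 3 classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]])

sparse_labels = np.array([0, 2])        # integer label form
one_hot = np.eye(3)[sparse_labels]      # one-hot label form

# categorical_crossentropy: -sum(y_true * log(y_pred)) per sample.
cat_ce = -np.sum(one_hot * np.log(probs), axis=1)

# sparse_categorical_crossentropy: pick out the log-probability
# of the true class directly via the integer label.
sparse_ce = -np.log(probs[np.arange(len(sparse_labels)), sparse_labels])

print(np.allclose(cat_ce, sparse_ce))  # True
```

So in the asker's situation the fix is either to keep sparse_categorical_crossentropy with the integer train_labels, or to convert the labels first (e.g. with keras.utils.to_categorical(train_labels, 10)) before using categorical_crossentropy.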
Also, you might be interested to look at this answer as well, where I have briefly explained the activation and loss functions and the format of labels used in the context of different kinds of classification tasks.