Why does model.fit() raise ValueError with tf.train.AdamOptimizer using categorical_crossentropy loss function?
I'm following the TensorFlow basic classification example with the Keras API provided in the "Getting Started" docs. I get through the tutorial as-is just fine, but if I change the loss function from sparse_categorical_crossentropy to categorical_crossentropy, the code below:
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation=tf.nn.relu),
    keras.layers.Dense(10, activation=tf.nn.softmax)
])

model.compile(optimizer=tf.train.AdamOptimizer(),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

model.fit(train_images, train_labels, epochs=5)
fails during the training/fitting step with the following error:
ValueError: Error when checking target: expected dense_1 to have shape (10,) but got array with shape (1,)
The documentation on the loss functions doesn't delve much into expected input and output. Obviously there is a dimensionality issue here, but if any experts can give a detailed explanation: what is it about this loss function, or any other loss function, that raises this ValueError?
sparse_categorical_crossentropy loss expects the provided labels to be integers like 0, 1, 2 and so on, where each integer indicates a particular class. For example, class 0 might be dogs, class 1 might be cats, and class 2 might be lions. On the other hand, categorical_crossentropy loss takes one-hot encoded labels such as [1,0,0], [0,1,0], [0,0,1], which are interpreted such that the index of the 1 indicates the class of the sample. For example, [0,0,1] means this sample belongs to class 2 (i.e. lions). Further, in the context of classification models, since the output is usually a probability distribution produced by a softmax layer, this form of labels also corresponds to a probability distribution and matches the output of the model. Again, [0,0,1] means that with probability one we know this sample belongs to class 2.
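To make the two label formats concrete, here is a minimal NumPy sketch (the class names and label values are illustrative, not from the tutorial) showing how integer labels map to their one-hot equivalents; Keras provides the same conversion as keras.utils.to_categorical:

```python
import numpy as np

# Integer ("sparse") labels: one class index per sample.
# Class 0 = dogs, class 1 = cats, class 2 = lions.
sparse_labels = np.array([0, 2, 1])

# One-hot encode: row i is all zeros except a 1 at index
# sparse_labels[i], giving shape (num_samples, num_classes).
num_classes = 3
one_hot_labels = np.eye(num_classes)[sparse_labels]

print(one_hot_labels)
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]
```

Note that the one-hot form has one column per class, which is exactly why the model's Dense(10) output expects targets of shape (10,) under categorical_crossentropy.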
sparse_categorical_crossentropy is essentially a convenient way to use categorical_crossentropy as the loss function: Keras (or its backend) handles the integer labels internally, so you don't need to manually convert your labels to one-hot encoded form. However, if the labels you provide are already one-hot encoded, then you must use categorical_crossentropy as the loss function.
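To see that the two losses compute the same quantity once the label format matches, here is a hand-rolled NumPy sketch (illustrative numbers, not the actual Keras implementation):

```python
import numpy as np

# Softmax outputs for two samples over 3 classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]])

sparse_labels = np.array([0, 2])        # integer label form
one_hot = np.eye(3)[sparse_labels]      # one-hot label form

# categorical_crossentropy: -sum(y_true * log(y_pred)) per sample.
cat_ce = -np.sum(one_hot * np.log(probs), axis=1)

# sparse_categorical_crossentropy: pick out the log-probability
# of the true class directly via the integer label.
sparse_ce = -np.log(probs[np.arange(len(sparse_labels)), sparse_labels])

print(np.allclose(cat_ce, sparse_ce))  # True
```

So in the asker's situation the fix is either to keep sparse_categorical_crossentropy with the integer train_labels, or to convert the labels first (e.g. with keras.utils.to_categorical(train_labels, 10)) before using categorical_crossentropy.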
Also, you might be interested to look at this answer as well, where I have briefly explained the activation and loss functions and the format of labels used in the context of different kinds of classification tasks.