
Why does model.fit() raise ValueError with tf.train.AdamOptimizer using categorical_crossentropy loss function?

I'm following the TensorFlow basic classification example with the Keras API provided in the "Getting Started" docs. I get through the tutorial as-is just fine, but if I change the loss function from sparse_categorical_crossentropy to categorical_crossentropy, the code below:

model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation=tf.nn.relu),
    keras.layers.Dense(10, activation=tf.nn.softmax)
])

model.compile(optimizer=tf.train.AdamOptimizer(),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

model.fit(train_images, train_labels, epochs=5)

fails during the training/fitting step with the following error:

ValueError: Error when checking target: expected dense_1 to have shape (10,) but got array with shape (1,)

The documentation on the loss functions doesn't delve much into expected input and output. Obviously there is a dimensionality issue here, but can any experts explain in detail what it is about this loss function, or any other loss function, that raises this ValueError?

sparse_categorical_crossentropy loss expects the provided labels to be integers like 0, 1, 2 and so on, where each integer indicates a particular class. For example, class 0 might be dogs, class 1 might be cats and class 2 might be lions. On the other hand, categorical_crossentropy loss takes one-hot encoded labels such as [1,0,0], [0,1,0], [0,0,1], which are interpreted such that the index of the 1 indicates the class of the sample. For example, [0,0,1] means this sample belongs to class 2 (i.e. lions). Further, in the context of classification models, since the output is usually a probability distribution produced by a softmax layer, this form of label also corresponds to a probability distribution and matches the shape of the model's output. Again, [0,0,1] means that with probability one we know that this sample belongs to class 2. That is why the error appears here: the tutorial's labels are integers, so each target has shape (1,), while categorical_crossentropy expects each target to match the Dense(10) output, i.e. shape (10,).
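To make the two label formats concrete, here is a minimal NumPy sketch of the conversion from integer labels to one-hot rows. The `one_hot` helper and the three sample labels are made up for illustration and are not part of the tutorial; `NUM_CLASSES = 10` is assumed from the `Dense(10)` output layer in the question.

```python
import numpy as np

NUM_CLASSES = 10  # assumed from the Dense(10) output layer in the question


def one_hot(labels, num_classes):
    """Turn integer labels of shape (N,) into one-hot rows of shape (N, num_classes)."""
    labels = np.asarray(labels, dtype=int)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=np.float32)[labels]


# Hypothetical stand-in for the tutorial's train_labels.
sparse_labels = np.array([9, 0, 3])
dense_labels = one_hot(sparse_labels, NUM_CLASSES)

print(sparse_labels.shape)  # one integer per sample
print(dense_labels.shape)   # one row of NUM_CLASSES entries per sample
print(dense_labels[0])      # the single 1 sits at index 9
```

In Keras itself the same conversion should be available as `keras.utils.to_categorical(train_labels, 10)`, and passing the converted labels to `model.fit` is one way to keep categorical_crossentropy; the other is simply to keep the integer labels and use sparse_categorical_crossentropy as in the tutorial.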

sparse_categorical_crossentropy is essentially a convenient way to use categorical_crossentropy as the loss function: Keras (or its backend) handles the integer labels internally, so you don't need to manually convert the labels to one-hot encoded form. However, if the labels you provide are one-hot encoded, then you must use categorical_crossentropy as the loss function.
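The equivalence of the two losses can be checked by hand. The sketch below is a simplified NumPy re-implementation for illustration only, not the Keras source (which, among other things, clips the probabilities for numerical stability); the function names and the sample probabilities are assumptions made up for this example.

```python
import numpy as np


def categorical_crossentropy(one_hot_targets, probs):
    """Mean over the batch of -sum_k y_k * log(p_k), with one-hot targets."""
    return float(np.mean(-np.sum(one_hot_targets * np.log(probs), axis=1)))


def sparse_categorical_crossentropy(int_targets, probs):
    """Same quantity, but indexing the true class directly with integer targets."""
    rows = np.arange(len(int_targets))
    return float(np.mean(-np.log(probs[rows, int_targets])))


# Hypothetical softmax outputs for two samples and three classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]])
int_targets = np.array([0, 2])            # integer (sparse) labels
one_hot_targets = np.eye(3)[int_targets]  # the same labels, one-hot encoded

loss_sparse = sparse_categorical_crossentropy(int_targets, probs)
loss_dense = categorical_crossentropy(one_hot_targets, probs)

print(loss_sparse, loss_dense)  # the two values agree
```

Because the one-hot vector zeroes out every term except the true class, both functions reduce to the mean negative log-probability of the correct class; only the expected label format differs.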

You might also be interested in this answer, where I have briefly explained the activation and loss functions and the label formats used in different kinds of classification tasks.


