Why does a Keras model predict all ones when used with one-hot labels, categorical_crossentropy, and a softmax output?

Question

I have a simple tf.keras model:

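from tensorflow import keras
from tensorflow.keras import layers

# init is not defined in this snippet; the classifier constructor below passes init=GlorotUniform()
inputs = keras.Input(shape=(9824,))
dense = layers.Dense(512, activation=keras.activations.relu, kernel_initializer=init)
x = dense(inputs)
x = layers.Dense(512, activation=keras.activations.relu)(x)
outputs = layers.Dense(3, activation=keras.activations.softmax)(x)
model = keras.Model(inputs=inputs, outputs=outputs)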

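When I compile it with sparse categorical crossentropy and the actual (integer) labels, it works as expected. However, when I one-hot encode the labels (using tf.keras.utils.to_categorical) and use categorical_crossentropy (so that I can use recall and precision as metrics during training), the model predicts everything as ones: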

>>> print(predictions)
[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 ...
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
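
For reference, the two label encodings mentioned above can be compared in plain NumPy. This is a minimal sketch, not the Keras implementation: `one_hot`, `sparse_cce`, and `cce` are hypothetical helpers written for illustration, and the labels and probabilities are made up. It suggests that switching from integer labels to one-hot labels should not change the loss value by itself.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Plain-NumPy stand-in for what tf.keras.utils.to_categorical produces."""
    encoded = np.zeros((len(labels), num_classes), dtype=np.float32)
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

def sparse_cce(labels, probs):
    """Cross-entropy from integer labels: -log p[true class] per sample."""
    return -np.log(probs[np.arange(len(labels)), labels])

def cce(one_hot_labels, probs):
    """Cross-entropy from one-hot labels: -sum(y * log p) per sample."""
    return -np.sum(one_hot_labels * np.log(probs), axis=1)

labels = np.array([0, 2, 1])            # made-up integer labels
probs = np.array([[0.7, 0.2, 0.1],      # made-up softmax outputs
                  [0.1, 0.3, 0.6],
                  [0.2, 0.5, 0.3]])

y_one_hot = one_hot(labels, num_classes=3)
print(y_one_hot)
# Both formulations should give the same per-sample loss.
print(np.allclose(sparse_cce(labels, probs), cce(y_one_hot, probs)))
```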

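If I understand correctly, the softmax activation in the output layer should force the outputs into the range (0, 1) and make them sum to 1. So how can every prediction be all ones? I have been searching for an answer for hours, to no avail.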
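
The expectation about softmax can be checked with a small NumPy sketch. The `softmax` helper and the logits below are assumptions made up for illustration, not taken from the model in the question. Each row sums to 1, so a row of `[1. 1. 1.]` cannot be a raw softmax output, which hints that something after the model is altering the values.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Made-up logits for three samples and three classes.
logits = np.array([[2.0, -1.0, 0.5],
                   [100.0, 100.0, 100.0],
                   [-3.0, 0.0, 3.0]])

probs = softmax(logits)
print(probs)
print(probs.sum(axis=1))  # every row sums to 1 (up to floating-point error)
```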

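Edit

Here is a minimal example.

I forgot to mention that I am using the scikeras package. Based on the examples in the documentation, I assume the model is compiled implicitly. Here is the classifier constructor: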

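from scikeras.wrappers import KerasClassifier
from tensorflow.keras.initializers import GlorotUniform
from tensorflow.keras.losses import SparseCategoricalCrossentropy
from tensorflow.keras.metrics import SparseCategoricalAccuracy
from tensorflow.keras.optimizers import Adam

clf = KerasClassifier(
    model=keras_model_target,
    loss=SparseCategoricalCrossentropy(),
    name="model_target",
    optimizer=Adam(),
    init=GlorotUniform(),
    metrics=[SparseCategoricalAccuracy()],
    epochs=5,
    batch_size=128
)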

I fit the model:

result = clf.fit(x_train, y_train)

and predict:

predictions = clf.predict(x)

Answer 1

This is a bug in SciKeras that was fixed in the v0.3.1 release. Updating to the latest version should resolve the problem.
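
To see whether an environment already has the fix, the installed version can be compared against 0.3.1. This is a rough sketch: `version_tuple` and `has_fix` are hypothetical helpers, and the parsing only looks at the first three numeric components of the version string.

```python
import re
from importlib.metadata import PackageNotFoundError, version

def version_tuple(text):
    """Turn '0.3.1' into (0, 3, 1); only the first three numbers are used."""
    return tuple(int(part) for part in re.findall(r"\d+", text)[:3])

def has_fix(installed, fixed_in="0.3.1"):
    """True if the installed version is at least the release named in the answer."""
    return version_tuple(installed) >= version_tuple(fixed_in)

try:
    installed = version("scikeras")
    status = "includes the fix" if has_fix(installed) else "predates the fix"
    print(f"scikeras {installed} {status}")
except PackageNotFoundError:
    print("scikeras is not installed in this environment")
```

If the reported version predates 0.3.1, upgrading with `pip install --upgrade scikeras` is the usual route.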

As for the bug itself, it was caused by how we index numpy arrays; see this diff for details.
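
The diff is not reproduced here, so the following is only a hypothetical illustration of how a NumPy indexing slip can turn per-row class probabilities into all-ones rows. It is not claimed to be the SciKeras code; the arrays and variable names are made up.

```python
import numpy as np

# Made-up softmax outputs for three samples and three classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6],
                  [0.2, 0.5, 0.3]])
winners = probs.argmax(axis=1)  # winning class index for each row

# Slip: ":" selects every row, so each column listed in winners
# is set to 1 in all rows at once.
wrong = np.zeros_like(probs)
wrong[:, winners] = 1

# Pairing each row index with its own winning column yields one-hot rows.
right = np.zeros_like(probs)
right[np.arange(len(probs)), winners] = 1

print(wrong)
print(right)
```

Once every class wins for at least one sample, the first form fills the whole array with ones, which matches the symptom in the question.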

Answer 1 · 0 · accepted · 2021-05-16 16:10:02
