Why am I getting drastically different results when using softmax instead of sigmoid in the output layer of a CNN?
I have a simple model that classifies images of triangles and circles.
Code:
model = Sequential([
    Conv2D(16, 3, padding='same', activation='relu', input_shape=(150, 150, 3)),
    MaxPooling2D(),
    Conv2D(32, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(64, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Flatten(),
    Dense(512, activation='relu'),
    Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.fit(Xtr, ytr, epochs=3, batch_size=10)
The performance on the test set is:
But when I change the activation function in the output layer to softmax, i.e. the last layer becomes Dense(1, activation='softmax'), the model's performance becomes:
I tried different dataset splits and the results remained roughly the same (the model with softmax activation performed equally badly). What is the issue?
Using softmax with your current configuration actually forces the model to always choose only one class. That is likely why, in your experiment with softmax, recall is always zero for one class and one for the other.
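To see why, here is a minimal sketch (using NumPy, not Keras): softmax normalizes its inputs over the class axis, so with a single output unit it normalizes a one-element vector and always emits exactly 1, regardless of the logit.

```python
import numpy as np

def softmax(z):
    # numerically stable softmax over the last axis
    e = np.exp(z - np.max(z, axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# With one output unit, the "probability" is always exactly 1,
# no matter what logit the network produces.
for logit in (-5.0, 0.0, 3.7):
    print(softmax(np.array([logit])))  # -> [1.]
```

So `Dense(1, activation='softmax')` predicts the same class with probability 1 for every input, which matches the degenerate recall you observed.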
First, you need to change the loss. binary_crossentropy is not meant to be used with softmax. If you change the loss to categorical_crossentropy and make the last Dense layer of size 2 (since you want to choose between two classes using softmax), you should get almost the same performance. That is, replace this part of the code:
    Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
with this one:
    Dense(2, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
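One caveat (this is an assumption about your data, since the question doesn't show ytr): categorical_crossentropy expects one-hot targets of shape (N, 2). If ytr holds integer labels 0/1, either one-hot encode it (keras.utils.to_categorical does this) or keep the integer labels and use loss='sparse_categorical_crossentropy' instead. A plain-NumPy sketch of the one-hot step:

```python
import numpy as np

# Hypothetical integer labels (0 = circle, 1 = triangle)
ytr = np.array([0, 1, 1, 0])

# One-hot encode for categorical_crossentropy;
# keras.utils.to_categorical(ytr, num_classes=2) is equivalent.
ytr_onehot = np.eye(2)[ytr]
print(ytr_onehot.shape)  # -> (4, 2)
```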
Putting it together, the full corrected model:
model = Sequential([
    Conv2D(16, 3, padding='same', activation='relu', input_shape=(150, 150, 3)),
    MaxPooling2D(),
    Conv2D(32, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(64, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Flatten(),
    Dense(512, activation='relu'),
    # Dense(1, activation='sigmoid'),  # 1 node in the last layer
    # is why softmax collapsed to a single class
    Dense(2, activation='softmax'),  # 2 nodes, one per class
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(Xtr, ytr, epochs=3, batch_size=10)
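As a side note on why the two setups give "almost the same performance": a 2-way softmax over logits (z0, z1) yields the same class-1 probability as a sigmoid applied to the logit difference z1 - z0, so the two parameterizations are mathematically equivalent up to optimization details. A quick numerical check (NumPy, illustrative values):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

# softmax([z0, z1])[1] = 1 / (1 + exp(z0 - z1)) = sigmoid(z1 - z0)
z0, z1 = 0.4, 1.9
p_softmax = softmax(np.array([z0, z1]))[1]
p_sigmoid = sigmoid(z1 - z0)
print(p_softmax, p_sigmoid)  # the two probabilities agree
```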