
Why am I getting drastically different results when using softmax instead of sigmoid in the output layer in CNN?

I have a simple model which classifies images of triangles and circles.

Code:

    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

    model = Sequential([
        Conv2D(16, 3, padding='same', activation='relu', input_shape=(150, 150, 3)),
        MaxPooling2D(),
        Conv2D(32, 3, padding='same', activation='relu'),
        MaxPooling2D(),
        Conv2D(64, 3, padding='same', activation='relu'),
        MaxPooling2D(),
        Flatten(),
        Dense(512, activation='relu'),
        Dense(1, activation='sigmoid'),])
    model.compile(optimizer='adam',
                  loss='binary_crossentropy',
                  metrics=['accuracy'])

    model.fit(Xtr, ytr, epochs=3, batch_size=10)

The performance on the test set is:

[image: test-set performance metrics with sigmoid]

But when I change the activation function in the output layer to softmax, i.e. the last layer becomes Dense(1, activation='softmax'), the model's performance becomes:

[image: test-set performance metrics with softmax]

I tried different dataset splits and the results remained roughly the same (the model with softmax activation performed equally badly). What is the issue?

With your current configuration, using softmax actually forces the model to always choose only one class. That is likely why, in your softmax experiment, recall is always zero for one class and one for the other.
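To see why, recall that softmax normalizes the exponentiated logits across the output units. With Dense(1, ...) there is only one unit, so the output is exp(x)/exp(x) = 1 no matter what the logit is: the model reports 100% confidence in the same class for every input. A minimal sketch with plain NumPy (the softmax helper is hand-rolled here just for illustration):

    import numpy as np

    def softmax(z):
        e = np.exp(z - np.max(z))  # subtract the max for numerical stability
        return e / e.sum()

    # With a single output unit there is nothing to normalize against,
    # so the "probability" is exp(x) / exp(x) = 1 for every input:
    for logit in [-5.0, 0.0, 3.2]:
        print(softmax(np.array([logit])))  # -> [1.] every time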

First, you need to change the loss: binary_crossentropy is not meant to be used with softmax. If you change the loss to categorical_crossentropy and make the last Dense layer of size 2 (since you want to choose between two classes using softmax), you should get almost the same performance as with sigmoid. That is, replace this part of the code:

Dense(1, activation='sigmoid'),])
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

with this one:

Dense(2, activation='softmax'),])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
The full corrected code then looks like this:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential([
    Conv2D(16, 3, padding='same', activation='relu', input_shape=(150, 150, 3)),
    MaxPooling2D(),
    Conv2D(32, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(64, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Flatten(),
    Dense(512, activation='relu'),
    # Dense(1, activation='sigmoid'),])  # one node in the last layer, so the
                                         # model can only score a single class
    Dense(2, activation='softmax'),])    # two nodes, one per class, which is
                                         # what softmax needs to work here
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

model.fit(Xtr, ytr, epochs=3, batch_size=10)
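One caveat worth adding (the question never shows how ytr is encoded, so this is an assumption): since the original model trained with binary_crossentropy and a single sigmoid unit, ytr is presumably a flat vector of 0/1 labels. categorical_crossentropy instead expects one-hot targets of shape (N, 2), so the labels need converting, or you can switch to the sparse variant of the loss. A sketch under that assumption:

    from tensorflow.keras.utils import to_categorical

    # Assumed: ytr is a 1-D array of integer labels
    # (hypothetically 0 = triangle, 1 = circle).
    ytr_onehot = to_categorical(ytr, num_classes=2)  # shape (N, 2)
    model.fit(Xtr, ytr_onehot, epochs=3, batch_size=10)

    # Alternative: keep the integer labels and use the sparse loss.
    # model.compile(optimizer='adam',
    #               loss='sparse_categorical_crossentropy',
    #               metrics=['accuracy'])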
