
Keras multi-class classifier probabilities are 0 and 1 valued

I am implementing a classifier to identify 3 different types of images; my last layer has 3 neurons with sigmoid activation:

import keras
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, BatchNormalization, Dropout, Flatten, Dense

model = Sequential()

model.add(Conv2D(16, kernel_size=(3, 3), activation='relu', 
                 input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Dropout(0.2))

# more conv layers

model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dense(3, activation='sigmoid'))

model.compile(loss='categorical_crossentropy', optimizer=keras.optimizers.Adam(), metrics=['accuracy'])

The training set label uses one hot encoding, and there are abundant training examples for each of the 3 categories.

But when I run model.predict(X) on the test set, the first 10 outputs are

[[0. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [0. 1. 1.]
 [1. 1. 1.]
 [0. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]

model.predict() should output probabilities, and each row should sum to 1, but in the actual results some rows assign a probability of 1 to every category. Does anyone know why the probabilities come out this way?

Use softmax in the last layer instead of sigmoid.

model.add(Dense(3, activation='softmax'))

For multi-class classification, softmax is needed to get a proper probability distribution: sigmoid treats each output neuron independently, so the three values need not sum to 1 (and with a confident model they saturate at 0 or 1, which is exactly what you are seeing).
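To see the difference concretely, here is a minimal NumPy sketch (not using Keras) comparing the two activations on the same logits: softmax rows sum to 1, while independent sigmoids can sum to well over 1.

```python
import numpy as np

def softmax(z):
    # subtract the row max for numerical stability
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

logits = np.array([[2.0, 1.0, 0.1]])
print(softmax(logits))        # a proper distribution: values sum to 1
print(sigmoid(logits))        # independent per-class values; row sum > 1 here
```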

The problem is in your output layer. Your problem is multi-class classification, so you have to handle it the correct way. These are the possibilities:

If you have a 1D integer-encoded target, you can use sparse_categorical_crossentropy as the loss function and softmax as the final activation:

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

X = np.random.randint(0, 10, (1000, 100))
y = np.random.randint(0, 3, 1000)

model = Sequential([
    Dense(128, input_dim = 100),
    Dense(3, activation='softmax'),
])
model.summary()
model.compile(loss='sparse_categorical_crossentropy',optimizer='adam',metrics=['accuracy'])
history = model.fit(X, y, epochs=3)

Otherwise, if you have one-hot encoded your target so that it has 2D shape (n_samples, n_classes), you can use categorical_crossentropy and softmax as the final activation:

import numpy as np
import pandas as pd

X = np.random.randint(0, 10, (1000, 100))
y = pd.get_dummies(np.random.randint(0, 3, 1000)).values

model = Sequential([
    Dense(128, input_dim = 100),
    Dense(3, activation='softmax'),
])
model.summary()
model.compile(loss='categorical_crossentropy',optimizer='adam',metrics=['accuracy'])
history = model.fit(X, y, epochs=3)
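In either case, once the model outputs softmax probabilities, the predicted class label for each sample is the argmax over the last axis. A small sketch with made-up probability rows (standing in for the output of model.predict()):

```python
import numpy as np

# example softmax outputs, as model.predict() would return (made-up values)
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8],
                  [0.3, 0.5, 0.2]])

pred_classes = probs.argmax(axis=-1)  # integer class labels 0..2
print(pred_classes)  # [0 2 1]
```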
