I am implementing a classifier to identify 3 different types of images; my last layer has 3 neurons with sigmoid activation:
import keras
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, BatchNormalization, Dropout, Flatten, Dense
model = Sequential()
model.add(Conv2D(16, kernel_size=(3, 3), activation='relu',
                 input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Dropout(0.2))
# more conv layers
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dense(3, activation='sigmoid'))
model.compile(loss='categorical_crossentropy', optimizer=keras.optimizers.Adam(), metrics=['accuracy'])
The training set labels are one-hot encoded, and there are abundant training examples for each of the 3 categories. But when I run model.predict(X) on the test set, the first 10 outputs are:
[[0. 1. 1.]
[1. 1. 1.]
[1. 1. 1.]
[1. 1. 1.]
[0. 1. 1.]
[1. 1. 1.]
[0. 1. 1.]
[1. 1. 1.]
[1. 1. 1.]
[1. 1. 1.]]
model.predict() should output probabilities, and each row should sum to 1, but in the actual results several categories sometimes have a probability of 1 at once. Does anyone know why the probabilities come out this way?
Use softmax in the last layer instead of sigmoid:
model.add(Dense(3, activation='softmax'))
For multi-class classification, softmax is needed to get a probability distribution over the classes: sigmoid scores each class independently (which suits multi-label problems), so the outputs do not sum to 1, whereas softmax normalizes across all classes.
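To see the difference numerically, here is a minimal sketch (the logit values are made up for illustration) comparing what sigmoid and softmax produce for the same raw outputs of a final Dense layer:

```python
import numpy as np

# Hypothetical raw logits from a 3-unit final Dense layer for one image
logits = np.array([2.0, 1.0, 0.1])

# Sigmoid squashes each logit independently; the outputs need not sum to 1
sigmoid = 1 / (1 + np.exp(-logits))

# Softmax normalizes across all classes, giving a proper distribution
softmax = np.exp(logits) / np.exp(logits).sum()

print(sigmoid)        # each value in (0, 1), but the row sums to > 1 here
print(softmax)        # the row sums to exactly 1
```

With sigmoid, all three values can simultaneously be close to 1, which is exactly the behavior in the question; softmax forces the competition between classes that multi-class classification needs.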
The problem is in your output layer. Your problem is MULTICLASS, so you have to handle it the correct way. These are the possibilities:
If you have a 1D integer-encoded target, you can use sparse_categorical_crossentropy as the loss function and softmax as the final activation:
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

X = np.random.randint(0, 10, (1000, 100))
y = np.random.randint(0, 3, 1000)
model = Sequential([
Dense(128, input_dim = 100),
Dense(3, activation='softmax'),
])
model.summary()
model.compile(loss='sparse_categorical_crossentropy',optimizer='adam',metrics=['accuracy'])
history = model.fit(X, y, epochs=3)
Otherwise, if you have one-hot encoded your target so it has 2D shape (n_samples, n_classes), you can use categorical_crossentropy and softmax as the final activation:
import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense

X = np.random.randint(0, 10, (1000, 100))
y = pd.get_dummies(np.random.randint(0, 3, 1000)).values
model = Sequential([
Dense(128, input_dim = 100),
Dense(3, activation='softmax'),
])
model.summary()
model.compile(loss='categorical_crossentropy',optimizer='adam',metrics=['accuracy'])
history = model.fit(X, y, epochs=3)
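In both cases, once the final activation is softmax, each row of model.predict(X) sums to 1, and the predicted class is simply the index of the largest probability. A small sketch with made-up probabilities:

```python
import numpy as np

# Hypothetical softmax output for 3 test images; each row sums to 1
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.2, 0.3, 0.5]])

# Predicted class for each image = index of the highest probability
pred_classes = np.argmax(probs, axis=1)
print(pred_classes)  # [0 1 2]
```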