I am confused about the appropriate loss function to use, as I am generating my dataset using image_dataset_from_directory.
Data Generator
Train
train_ds = tf.keras.utils.image_dataset_from_directory(
'/content/dataset/train',
validation_split=0.05,
subset="training",
seed=123,
image_size=(IMAGE_SIZE, IMAGE_SIZE),
batch_size=BATCH_SIZE)
Validation
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
'/content/dataset/val', image_size=(IMAGE_SIZE, IMAGE_SIZE), batch_size=BATCH_SIZE
)
Model
rn50v2_model = Sequential()
pretrained_model = tf.keras.applications.ResNet50V2(
include_top=False,
weights="imagenet",
input_shape=(IMAGE_SIZE, IMAGE_SIZE, 3),
pooling='avg',
classes = 2
)
print(pretrained_model.summary())
rn50v2_model.add(pretrained_model)
rn50v2_model.add(Flatten())
rn50v2_model.add(Dense(512, activation='relu'))
rn50v2_model.add(Dense(2, activation='softmax'))
#print(rn50v2_model.summary())
rn50v2_model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
When I tested my model, I got a result that looks one-hot encoded, like below:
array([[0.24823777, 0.7517622 ]], dtype=float32)
I would prefer to use categorical_crossentropy. Please explain this behaviour; I can't seem to find anything about it in the official documentation.
Your result is not one-hot encoded. It is the output of your final layer squashed by the softmax function, which transforms every value to lie between 0 and 1 so that the values sum to 1. You can apply np.argmax(predictions, axis=-1) to these "probabilities" to get the corresponding class. To use categorical_crossentropy, change your label_mode to 'categorical', which will automatically generate one-hot-encoded labels:
train_ds = tf.keras.utils.image_dataset_from_directory(
'/content/dataset/train',
validation_split=0.05,
subset="training",
seed=123,
label_mode = 'categorical',
image_size=(IMAGE_SIZE, IMAGE_SIZE),
batch_size=BATCH_SIZE)
val_ds = tf.keras.utils.image_dataset_from_directory(
'/content/dataset/val',
label_mode='categorical',
image_size=(IMAGE_SIZE, IMAGE_SIZE),
batch_size=BATCH_SIZE)
Note that since your validation data lives in its own directory, you should not pass subset="validation" here; subset only makes sense together with validation_split, and would raise an error without it.
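To make the relationship concrete, here is a quick numpy sketch (no TensorFlow needed) using the prediction from the question. It shows that np.argmax recovers the predicted class, and that sparse_categorical_crossentropy with an integer label computes the same number as categorical_crossentropy with the corresponding one-hot label:

```python
import numpy as np

# Softmax output for one image, as in the question.
probs = np.array([[0.24823777, 0.7517622]])

# The values behave like probabilities: they sum to 1.
assert np.isclose(probs.sum(), 1.0)

# argmax recovers the predicted class index.
pred_class = np.argmax(probs, axis=-1)  # -> array([1])

# sparse_categorical_crossentropy with the integer label 1 ...
sparse_loss = -np.log(probs[0, 1])

# ... equals categorical_crossentropy with the one-hot label [0, 1].
one_hot = np.array([0.0, 1.0])
categorical_loss = -np.sum(one_hot * np.log(probs[0]))

print(pred_class, np.isclose(sparse_loss, categorical_loss))
```

So the choice between the two losses is purely about how your labels are encoded (integer indices vs. one-hot vectors); the computed loss is identical.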
You could also consider using binary_crossentropy, since you only have two classes. In that case, set label_mode='binary' in image_dataset_from_directory, change your loss to binary_crossentropy, and replace the output layer with a single sigmoid unit:
rn50v2_model.add(Dense(1, activation='sigmoid'))
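The sigmoid variant is mathematically equivalent to the two-class softmax: sigmoid(z) equals the softmax over the logit pair [0, z] at index 1. A small numpy sketch (z is an arbitrary example logit) illustrating that the two formulations give the same probability and the same loss:

```python
import numpy as np

# A single logit z from Dense(1): sigmoid(z) is the probability of class 1.
z = 1.1086
p1 = 1.0 / (1.0 + np.exp(-z))  # sigmoid

# The same decision expressed as a 2-class softmax over the logits [0, z].
logits = np.array([0.0, z])
softmax = np.exp(logits) / np.exp(logits).sum()

# binary_crossentropy for the true label y=1 ...
bce = -np.log(p1)
# ... matches categorical_crossentropy with the one-hot label [0, 1].
cce = -np.log(softmax[1])

print(np.isclose(p1, softmax[1]), np.isclose(bce, cce))
```

In practice the sigmoid head is slightly cheaper (one output unit instead of two) and pairs naturally with label_mode='binary', which yields float 0/1 labels.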