
Loss function to use with image_dataset_from_directory

I am confused about the appropriate loss function to use, since I am generating my dataset with image_dataset_from_directory.

Data Generator

Train

train_ds = tf.keras.utils.image_dataset_from_directory(
  '/content/dataset/train',
  validation_split=0.05,
  subset="training",
  seed=123,
  image_size=(IMAGE_SIZE, IMAGE_SIZE),
  batch_size=BATCH_SIZE)

Validation

val_ds =  tf.keras.preprocessing.image_dataset_from_directory(
    '/content/dataset/val', image_size=(IMAGE_SIZE, IMAGE_SIZE), batch_size=BATCH_SIZE
)

Model

rn50v2_model = Sequential()

pretrained_model = tf.keras.applications.ResNet50V2(
    include_top=False,
    weights="imagenet",
    input_shape=(IMAGE_SIZE, IMAGE_SIZE, 3),
    pooling='avg',
    classes = 2
)

print(pretrained_model.summary())

rn50v2_model.add(pretrained_model)

rn50v2_model.add(Flatten())

rn50v2_model.add(Dense(512, activation='relu'))

rn50v2_model.add(Dense(2, activation='softmax'))

#print(rn50v2_model.summary())

rn50v2_model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

When I tested my model, the result looked one-hot encoded, like below:

array([[0.24823777, 0.7517622 ]], dtype=float32)

I would prefer to use categorical_crossentropy, but please explain this behaviour; I can't seem to find information on it in the official documentation.

Your result is not one-hot encoded. It is the output of your final layer squashed by the softmax function, which maps each value into the range (0, 1) so that the values sum to 1. You can apply np.argmax(predictions, axis=-1) to these "probabilities" to get the corresponding class index. To use categorical_crossentropy, change your label_mode to 'categorical', which will automatically generate one-hot-encoded labels:

train_ds = tf.keras.utils.image_dataset_from_directory(
  '/content/dataset/train',
  validation_split=0.05,
  subset="training",
  seed=123,
  label_mode = 'categorical',
  image_size=(IMAGE_SIZE, IMAGE_SIZE),
  batch_size=BATCH_SIZE)

# No subset/validation_split here: the validation data lives in its own directory.
# (Passing subset without validation_split raises a ValueError.)
val_ds = tf.keras.utils.image_dataset_from_directory(
    '/content/dataset/val',
    label_mode = 'categorical',
    image_size=(IMAGE_SIZE, IMAGE_SIZE),
    batch_size=BATCH_SIZE)
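
For completeness, here is a minimal sketch of the matching compile and prediction steps, assuming the rn50v2_model, train_ds and val_ds defined above (the epoch count is arbitrary):

import numpy as np

# With label_mode='categorical' the labels are one-hot encoded,
# so the non-sparse categorical crossentropy applies.
rn50v2_model.compile(loss='categorical_crossentropy',
                     optimizer='adam',
                     metrics=['accuracy'])

rn50v2_model.fit(train_ds, validation_data=val_ds, epochs=5)

# The model still outputs softmax probabilities; argmax turns them
# into class indices, e.g. [[0.248, 0.752]] -> [1].
predictions = rn50v2_model.predict(val_ds)
predicted_classes = np.argmax(predictions, axis=-1)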

You could also consider using binary_crossentropy if you only have two classes. You would have to change your loss function and output layer:

rn50v2_model.add(Dense(1, activation='sigmoid'))
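
A minimal sketch of that binary variant, assuming the pretrained_model from the question and datasets created with the default label_mode='int' (so labels are 0/1 integers):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

rn50v2_model = Sequential()
rn50v2_model.add(pretrained_model)                   # ResNet50V2 base from the question
rn50v2_model.add(Dense(512, activation='relu'))
rn50v2_model.add(Dense(1, activation='sigmoid'))     # single probability for class 1

rn50v2_model.compile(loss='binary_crossentropy',
                     optimizer='adam',
                     metrics=['accuracy'])

# The sigmoid output is the probability of class 1; threshold at 0.5
# to get hard class labels.
predictions = rn50v2_model.predict(val_ds)
predicted_classes = (predictions > 0.5).astype(int).ravel()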
