
How to access all outputs from a single custom loss function in Keras

I'm trying to reproduce the architecture of the network proposed in this publication in TensorFlow. Being a total beginner to this, I've been using this tutorial as a base to work from, with tensorflow==2.3.2.

To train this network, they use a loss that involves the outputs of two branches of the network at the same time, which made me look into custom loss functions in Keras. I understand that you can define your own, as long as the function has the following signature:

def custom_loss(y_true, y_pred):
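
For instance, a minimal sketch of a loss with this signature (a plain mean squared error, just to illustrate the contract; not the loss from the paper):

from tensorflow.keras import backend as K

def custom_loss(y_true, y_pred):
    # y_true: ground-truth tensor, y_pred: model output tensor
    # Keras expects a per-sample loss, so reduce over the last axis
    return K.mean(K.square(y_pred - y_true), axis=-1)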

I also understood that you could give other arguments like so:

def loss_function(margin=0.3):
    def custom_loss(y_true, y_pred):
        # And now you can use margin in here
        ...
    return custom_loss

You then just have to pass these when compiling your model. When it comes to using multiple outputs, the most common approach seems to be the one proposed here, where you give several loss functions, one being called for each of your outputs. However, I could not find a way to give several outputs to a single loss function, which is what I need here.
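
For reference, here is a hedged sketch of both compile patterns (loss_for_output_1 and loss_for_output_2 are hypothetical placeholder names, one loss per output):

# Parameterized loss: call the factory and pass the returned function
model.compile(optimizer='adam', loss=loss_function(margin=0.3))

# Multiple outputs: one loss per output, each called on its own output
model.compile(optimizer='adam', loss=[loss_for_output_1, loss_for_output_2])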

To further explain, here is a minimal working example showing what I've tried, which you can try for yourself in this Colab.

import os
import tensorflow as tf
from tensorflow.keras import backend as K  # use the tf.keras backend; mixing standalone keras with tf.keras can break
from tensorflow.keras import datasets, layers, models, applications, losses
from tensorflow.keras.preprocessing import image_dataset_from_directory

_URL = 'https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip'
path_to_zip = tf.keras.utils.get_file('cats_and_dogs.zip', origin=_URL, extract=True)
PATH = os.path.join(os.path.dirname(path_to_zip), 'cats_and_dogs_filtered')

train_dir = os.path.join(PATH, 'train')
validation_dir = os.path.join(PATH, 'validation')

BATCH_SIZE = 32
IMG_SIZE = (160, 160)
IMG_SHAPE = IMG_SIZE + (3,)

train_dataset = image_dataset_from_directory(train_dir,
                                             shuffle=True,
                                             batch_size=BATCH_SIZE,
                                             image_size=IMG_SIZE)

validation_dataset = image_dataset_from_directory(validation_dir,
                                                  shuffle=True,
                                                  batch_size=BATCH_SIZE,
                                                  image_size=IMG_SIZE)

data_augmentation = tf.keras.Sequential([
  layers.experimental.preprocessing.RandomFlip('horizontal'),
  layers.experimental.preprocessing.RandomRotation(0.2),
])
preprocess_input = applications.resnet50.preprocess_input
base_model = applications.ResNet50(input_shape=IMG_SHAPE,
                                   include_top=False,
                                   weights='imagenet')
base_model.trainable = True
conv = layers.Conv2D(filters=128, kernel_size=(1,1))
global_pooling = layers.GlobalAveragePooling2D()
horizontal_pooling = layers.AveragePooling2D(pool_size=(1, 5))
reshape = layers.Reshape((-1, 128))

def custom_loss(y_true, y_pred):
    print(y_pred.shape)
    # Do some stuff involving both outputs
    # Returning something trivial here for correct behavior
    return K.mean(y_pred)

inputs = tf.keras.Input(shape=IMG_SHAPE)
x = data_augmentation(inputs)
x = preprocess_input(x)
x = base_model(x, training=True)

first_branch = global_pooling(x)

second_branch = conv(x)
second_branch = horizontal_pooling(second_branch)
second_branch = reshape(second_branch)

model = tf.keras.Model(inputs, [first_branch, second_branch])
base_learning_rate = 0.0001
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=base_learning_rate),
              loss=custom_loss,
              metrics=['accuracy'])
model.summary()

initial_epochs = 10
history = model.fit(train_dataset,
                    epochs=initial_epochs,
                    validation_data=validation_dataset)

When doing so, I thought that the y_pred given to the loss function would be a list containing both outputs. However, while running it, what I got on stdout was this:

Epoch 1/10
(None, 2048)
(None, 5, 128)

What I understand from this is that the loss function is called once per output, one by one, instead of being called once with all the outputs together, which means I can't define a loss that uses both outputs at the same time. Is there any way to achieve this?

Please let me know if I'm unclear, or if you need further details.

OK, here is an easy way to achieve this: use the loss_weights parameter. We can weight the losses of the multiple outputs equally so that we get a combined loss. So, for two outputs we can do

combined_loss = 1. * loss_of_output_1 + 1. * loss_of_output_2

In your case, your network has two outputs; by name, they are reshape and global_average_pooling2d (Keras derives the output names from the layers that produce them). You can now do as follows:

# calculation of loss for one output, i.e. reshape
def reshape_loss(y_true, y_pred):
    # do some math with these two 
    return K.mean(y_pred)

# calculation of loss for another output, i.e. global_average_pooling2d
def gap_loss(y_true, y_pred):
    # do some math with these two 
    return K.mean(y_pred)

And when compiling, you now need to do this:

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=base_learning_rate),
    loss={
        'reshape': reshape_loss,
        'global_average_pooling2d': gap_loss
    },
    loss_weights={
        'reshape': 1.,
        'global_average_pooling2d': 1.
    }
)

Now, the total loss is the result of 1. * reshape_loss + 1. * gap_loss.
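
As a side note, here is a minimal sketch (assuming the same two-branch model as in the question): giving the two output layers explicit name arguments makes the dict keys predictable, instead of relying on Keras' auto-generated layer names. The names gap_branch and reshape_branch are hypothetical choices:

global_pooling = layers.GlobalAveragePooling2D(name='gap_branch')  # hypothetical names
reshape = layers.Reshape((-1, 128), name='reshape_branch')

# ... build the model as before, then compile with the chosen names as keys
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=base_learning_rate),
    loss={'gap_branch': gap_loss, 'reshape_branch': reshape_loss},
    loss_weights={'gap_branch': 1., 'reshape_branch': 1.}
)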
