
Custom loss function with weights in Keras

I'm new to neural networks. I wanted to write a custom loss function in TensorFlow that takes a vector of weights, so I did it this way:

def my_loss(weights):
    def custom_loss(y, y_pred):
        return weights * (y - y_pred)
    return custom_loss

model.compile(optimizer='adam', loss=my_loss(weights), metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=None, validation_data=(x_test, y_test), epochs=100)

When I run it, I get this error:

InvalidArgumentError:  Incompatible shapes: [50000,10] vs. [32,10]

The shapes are:

print(weights.shape)
print(y_train.shape)
(50000, 10)
(50000, 10)

So I thought it was a problem with the batches. I don't have a strong background in TensorFlow, so I tried to solve it in a naive way, using a global variable

batch_index = 0

and then updating it from a custom callback in the "on_batch_begin" hook. But it didn't work, and it was a horrible solution. So, how can I get the slice of the weights that corresponds to the current y? Is there a way to get the current batch index inside the custom loss? Thank you in advance for your help.

Keras allows you to use any tensor from the global scope inside a loss. In fact, y_true and y_pred may not even be used.

Your model can have multiple inputs (you can feed this extra input dummy values at inference time, or load the weights into a model with a single input). Note that you still need it for validation.

import keras
from keras.layers import *
from keras import backend as K

import numpy as np

inputs_x = Input(shape=(10,))
inputs_w = Input(shape=(10,))

y = Dense(10, kernel_initializer='glorot_uniform')(inputs_x)

model = keras.Model(inputs=[inputs_x, inputs_w], outputs=[y])

def my_loss(y_true, y_pred):
    return K.abs((y_true-y_pred)*inputs_w)

def my_metrics(y_true, y_pred):
    # just to output something
    return K.mean(inputs_w)



model.compile(optimizer='adam', loss=[my_loss], metrics=[my_metrics])

data = np.random.normal(size=(50000, 10))
labels = np.random.normal(size=(50000, 10))
weights = np.random.normal(size=(50000, 10))


model.fit([data, weights], labels, batch_size=256, validation_data=([data[:100], weights[:100]], labels[:100]), epochs=100)

To run validation without weights, compile another version of the model with a different loss that does not use them.
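
For illustration, a minimal sketch of that idea (the names eval_model and plain_loss are mine, not part of the original answer). The second model reuses the y tensor built from inputs_x above, so it shares the trained weights:

# eval_model takes only the data input; no weights input needed
eval_model = keras.Model(inputs=inputs_x, outputs=y)

def plain_loss(y_true, y_pred):
    # same loss shape as above, but without the weight tensor
    return K.mean(K.abs(y_true - y_pred))

eval_model.compile(optimizer='adam', loss=plain_loss)
eval_model.evaluate(data[:100], labels[:100])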

UPD: Also note that Keras will sum up all the elements of your loss if it returns an array instead of a scalar.
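
If you want a different reduction, return a scalar yourself; here is a sketch of the same loss with an explicit mean instead of the implicit sum:

def my_loss(y_true, y_pred):
    # reduce to a scalar explicitly instead of relying on the implicit sum
    return K.mean(K.abs((y_true - y_pred) * inputs_w))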


UPD: For TensorFlow 2.1.0 things become more complicated, it seems. The way to go is in the direction @marco-cerliani pointed out (labels, weights and data are all fed to the model, and the custom loss tensor is added via .add_loss()), but his solution didn't work for me out of the box. The first problem is that the model refuses to work with a None loss, rejecting both inputs and outputs, so I introduced an additional dummy loss function. The second problem appeared when the dataset size was not divisible by the batch size. In Keras and TF 1.x the last-batch problem was usually solved with the steps_per_epoch and validation_steps parameters, but here training starts to fail on the first batch of epoch 2. So I needed to write a simple custom data generator.

import tensorflow.keras as keras
from tensorflow.keras.layers import *
from tensorflow.keras import backend as K

import numpy as np

inputs_x = Input(shape=(10,))
inputs_w = Input(shape=(10,))
inputs_l = Input(shape=(10,))


y = Dense(10, kernel_initializer='glorot_uniform')(inputs_x)

model = keras.Model(inputs=[inputs_x, inputs_w, inputs_l], outputs=[y])

def my_loss(y_true, y_pred):
    return K.abs((y_true-y_pred)*inputs_w)

def my_metrics():
    # just to output something
    return K.mean(inputs_w)

def dummy_loss(y_true, y_pred):
    return 0.


loss = my_loss(y, inputs_l)
metric = my_metrics()

model.add_loss(loss)
model.add_metric(metric, name='my_metric', aggregation='mean')


model.compile(optimizer='adam', loss=dummy_loss)

data = np.random.normal(size=(50000, 10))
labels = np.random.normal(size=(50000, 10))
weights = np.random.normal(size=(50000, 10))

dummy = np.zeros(shape=(50000, 10))  # or it can be the labels, it doesn't matter here


# looks like it does not like it when len(data) % batch_size != 0
# If I set steps_per_epoch, it fails on the second epoch.

# So, I proceeded with a data generator

class DataGenerator(keras.utils.Sequence):
    'Generates data for Keras'
    def __init__(self, x, w, y, y2, batch_size, shuffle=True):
        'Initialization'
        self.x = x
        self.w = w
        self.y = y
        self.y2 = y2
        self.indices = list(range(len(self.x)))
        self.shuffle = shuffle
        self.batch_size = batch_size
        self.on_epoch_end()

    def __len__(self):
        'Denotes the number of batches per epoch'
        return len(self.indices) // self.batch_size

    def __getitem__(self, index):
        'Generate one batch of data'
        # Generate indexes of the batch

        ids = self.indices[index*self.batch_size:(index+1)*self.batch_size]

        # the last None to remove weird warning
        # https://stackoverflow.com/questions/59317919
        return [self.x[ids], self.w[ids], self.y[ids]], self.y2[ids], [None]

    def on_epoch_end(self):
        'Updates indexes after each epoch'
        if self.shuffle:
            np.random.shuffle(self.indices)

batch_size = 256

train_generator = DataGenerator(data, weights, labels, dummy, batch_size=batch_size, shuffle=True)

val_generator = DataGenerator(data[:2*batch_size], weights[:2*batch_size], labels[:2*batch_size], dummy[:2*batch_size], batch_size=batch_size, shuffle=True)

model.fit(x=train_generator, validation_data=val_generator, epochs=100)

This is a workaround to pass additional arguments to a custom loss function, in your case an array of weights. The trick consists in using fake inputs, which make it possible to build and apply the loss correctly. Don't forget that Keras handles a fixed batch dimension.

I provide a dummy example for a regression problem:

def mse(y_true, y_pred, weights):
    error = y_true - y_pred
    return K.mean(K.square(error) + K.sqrt(weights))

X = np.random.uniform(0,1, (1000,10))
y = np.random.uniform(0,1, 1000)
w = np.random.uniform(0,1, 1000)

inp = Input((10,))
true = Input((1,))
weights = Input((1,))
x = Dense(32, activation='relu')(inp)
out = Dense(1)(x)

m = Model([inp, true, weights], out)
m.add_loss(mse(true, out, weights))
m.compile(loss=None, optimizer='adam')
m.fit(x=[X, y, w], y=None, epochs=3)

## final fitted model to compute predictions (remove W if not needed)
final_m = Model(inp, out)
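
To illustrate, inference with the stripped-down model then looks like this (a sketch, not from the original answer; final_m shares its layers with m, so it is already fitted):

preds = final_m.predict(X)  # no labels or weights needed at inference time
print(preds.shape)          # (1000, 1)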

Like @Michael Moretti, I too am new to all this (deep learning, Python, TensorFlow, Keras, ...). This question was asked about 19 months ago, and things move fast in “TF years.”

Apparently at some point you could just write a Python function with arguments (y_true, y_pred), pass it to your call to model.compile(), and all was well. That still seems to work in some simple cases, but not in general. While trying to understand why it was not working for me, I found this SO question and other related ones. It was @M.Innat's answer to this question that got me on the right track. But in fact his relevant final example, CustomMSE, is cribbed from the Keras Guide section on Custom Losses. This example shows both how to write a custom loss fully compatible with TensorFlow version 2.7.0, and how to pass additional parameters to it via the constructor of a class based on keras.losses.Loss in the call to model.compile():

import tensorflow as tf
from tensorflow import keras

class CustomMSE(keras.losses.Loss):
    def __init__(self, regularization_factor=0.1, name="custom_mse"):
        super().__init__(name=name)
        self.regularization_factor = regularization_factor

    def call(self, y_true, y_pred):
        mse = tf.math.reduce_mean(tf.square(y_true - y_pred))
        reg = tf.math.reduce_mean(tf.square(0.5 - y_pred))
        return mse + reg * self.regularization_factor

model.compile(optimizer=keras.optimizers.Adam(), loss=CustomMSE())

For best results, make sure that all computation inside your custom loss function (that is, the call() method of your custom Loss class) is done with TensorFlow operators, and that all input and output data is represented as TF tensors.
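
As a quick sanity check, here is a minimal end-to-end sketch of using such a class (the toy model and data are my own illustration, not from the Keras guide):

import numpy as np

# hypothetical toy setup just to show the pieces fit together
model = keras.Sequential([keras.layers.Dense(10, input_shape=(10,))])
model.compile(optimizer=keras.optimizers.Adam(), loss=CustomMSE(regularization_factor=0.05))

x = np.random.normal(size=(256, 10)).astype('float32')
y = np.random.normal(size=(256, 10)).astype('float32')
model.fit(x, y, batch_size=32, epochs=2)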
