Custom loss with weight arrays of batch size in TensorFlow/Keras

I am creating a custom loss function: the MAE of y_true and y_pred, weighted by two arrays, a and b. All four arrays have the same size (10000 samples/timesteps).

def custom_loss(y_true, y_pred, a, b):
    mae = K.abs(y_true - y_pred)
    loss = mae * a * b
    return loss

Question: How can I feed a and b into the function? Both should be split and shuffled just like y_true and y_pred.
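
To make the requirement concrete, here is a minimal NumPy sketch (not Keras code) of what the loss needs for each mini-batch: one shared index permutation is applied to y, a, and b, so every weight stays attached to its sample. The helper name weighted_mae and the toy values are assumptions made for illustration.

```python
import numpy as np

def weighted_mae(y_true, y_pred, a, b):
    """Element-wise |y_true - y_pred| scaled by the per-sample weights a and b."""
    return np.abs(y_true - y_pred) * a * b

# Toy data: 4 samples, one value each (hypothetical numbers).
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.5, 2.0, 2.0, 6.0])
a = np.array([1.0, 2.0, 0.5, 1.0])
b = np.array([2.0, 1.0, 2.0, 0.5])

# Loss on the unshuffled data, for reference.
full = weighted_mae(y_true, y_pred, a, b)

# Shuffle and split into batches of 2 using ONE shared permutation.
rng = np.random.default_rng(0)
perm = rng.permutation(len(y_true))
batches = [perm[i:i + 2] for i in range(0, len(perm), 2)]
per_batch = [
    weighted_mae(y_true[idx], y_pred[idx], a[idx], b[idx]) for idx in batches
]
```

Because the same indices select from all four arrays, the per-batch losses are just a reordering of the unshuffled ones. This is the behaviour a and b need inside model.fit.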

So far, I am using an LSTM trained on data X of shape (samples x time steps x variables). I tried TensorFlow's add_loss function, passing a and b as additional input layers, but this resulted in errors due to mismatched data shapes.

input_layer = Input(shape=input_shape)
x = LSTM(20, activation='relu', return_sequences=True)(input_layer)  # "in" is a reserved word in Python
out = LSTM(1, activation='linear', return_sequences=False)(x)

layer_a = Input(shape=(10000))
layer_b = Input(shape=(10000))

model = Model(inputs = [input_layer, layer_a, layer_b], outputs = out)  
model.add_loss(custom_loss(input_layer, out, layer_a, layer_b))
model.compile(loss=None, optimizer=Adam(0.01))

# X=data of shape 20 variables x 10000 timesteps, y, a, b = data of shape 10000 timesteps
model.fit(x=[X, a, b], y=y, batch_size=1, shuffle=True)

How do I do this correctly?

As you suggested, use add_loss. Remember to pass the loss every tensor it needs (targets, predictions, and the extra weight tensors), each in the correct shape.

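import numpy as np
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Input, LSTM
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

n_sample = 100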
timesteps = 30
features = 5

X = np.random.uniform(0,1, (n_sample,timesteps,features))
y = np.random.uniform(0,1, n_sample)
a = np.random.uniform(0,1, n_sample)
b = np.random.uniform(0,1, n_sample)

def custom_loss(y_true, y_pred, a, b):
    mae = K.abs(y_true - y_pred)
    loss = mae * a * b
    return loss

input_layer = Input(shape=(timesteps, features))
x = LSTM(20, activation='relu', return_sequences=True)(input_layer)
out = LSTM(1, activation='linear')(x)

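# One scalar per sample for each weight, so the per-sample shape is (1,)
layer_a = Input(shape=(1,))
layer_b = Input(shape=(1,))
# The target is fed as an input too, because the add_loss tensor is built from the model's own tensors
target = Input(shape=(1,))

model = Model(inputs=[target, input_layer, layer_a, layer_b], outputs=out)
model.add_loss(custom_loss(target, out, layer_a, layer_b))
# loss=None: the only loss is the one registered with add_loss
model.compile(loss=None, optimizer=Adam(0.01))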

model.fit(x=[y, X, a, b], y=None, shuffle=True, epochs=3)
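
A note on the shape=(1,) inputs above: Input(shape=...) describes a single sample, without the batch axis, so a scalar weight per sample reaches the loss as (batch, 1), the same shape as the model output. The NumPy sketch below illustrates the broadcasting involved (it is not Keras code, and the batch size is arbitrary) by showing what happens if a weight is shaped (batch,) instead.

```python
import numpy as np

batch = 4
y_true = np.ones((batch, 1))
y_pred = np.zeros((batch, 1))      # model output: one value per sample
a_col = np.full((batch, 1), 2.0)   # weight shaped (batch, 1)
a_flat = np.full((batch,), 2.0)    # weight shaped (batch,)

mae = np.abs(y_true - y_pred)
good = mae * a_col    # stays (batch, 1): one loss value per sample
bad = mae * a_flat    # broadcasts to (batch, batch): every sample paired with every weight

print(good.shape, bad.shape)
```

The second product raises no error; it silently pairs each sample with every weight, which is why it is worth checking that all tensors entering the loss share the (batch, 1) shape.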

To use the model for inference, build a second model that takes only X as input (dropping y, and also a and b if they are not needed):

final_model = Model(model.inputs[1], model.output)

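If you only need a and b to compute the loss, I would write a wrapper around your custom loss function and pass y, a, and b together as the labels, stacked column-wise into a single array of shape (n_sample, 3). Keras then splits and shuffles them together with the targets.

Something like this:

import numpy as np
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Input, LSTM
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

n_sample = 100
timesteps = 30
features = 5

X = np.random.uniform(0,1, (n_sample,timesteps,features))
y = np.random.uniform(0,1, n_sample)
a = np.random.uniform(0,1, n_sample)
b = np.random.uniform(0,1, n_sample)

# Pack y, a, b into one label array; y_true then arrives in the loss as (batch, 3)
labels = np.stack([y, a, b], axis=1)

def custom_loss_wrapper(y_true, y_pred):
    def custom_loss(y_true, y_pred, a, b):
        mae = K.abs(y_true - y_pred)
        loss = mae * a * b
        return loss
    # Slice columns with ranges so each piece keeps shape (batch, 1), matching y_pred
    return custom_loss(y_true[:, 0:1], y_pred, y_true[:, 1:2], y_true[:, 2:3])

input_layer = Input(shape=(timesteps, features))
x = LSTM(20, activation='relu', return_sequences=True)(input_layer)
out = LSTM(1, activation='linear')(x)

model = Model(inputs=input_layer, outputs=out)
model.compile(loss=custom_loss_wrapper, optimizer=Adam(0.01))

model.fit(x=X, y=labels, shuffle=True, epochs=3)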

It simplifies the network architecture and removes the unnecessary layer_a and layer_b at inference time.
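
One detail worth checking with this approach is how the packed labels are indexed inside the loss. The NumPy sketch below assumes the three arrays are stacked column-wise into one (n_sample, 3) array; the toy values are made up for illustration. Indexing with [0] selects the first sample, whereas column slices recover y, a, and b.

```python
import numpy as np

y = np.array([1.0, 2.0, 3.0])
a = np.array([0.5, 1.0, 2.0])
b = np.array([2.0, 2.0, 1.0])

labels = np.stack([y, a, b], axis=1)   # shape (3, 3): one row per sample

first_row = labels[0]      # [y[0], a[0], b[0]]: a single sample, not the y column
y_col = labels[:, 0:1]     # shape (3, 1), lines up with a (batch, 1) prediction
a_col = labels[:, 1:2]
b_col = labels[:, 2:3]

# Hypothetical predictions with the (batch, 1) shape an LSTM(1) head would produce.
y_pred = np.array([[1.5], [2.0], [1.0]])
loss = np.abs(y_col - y_pred) * a_col * b_col
```

Slicing with ranges such as 0:1 rather than a bare 0 keeps the trailing axis, so the weights multiply the error element-wise instead of broadcasting across samples.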
