
Keras custom loss function - how to access actual truth values and predictions

I am working on time series forecasting with a Keras LSTM. I take the last n_steps_in values of the series and try to predict one step ahead. For example, if my time series is [1, 2, 3, 4] and n_steps_in = 2, the supervised learning dataset would be:

[1,2]--> 3

[2,3]--> 4

Thus, the series to be forecast (y_true) would be [3,4].
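
For reference, a dataset windowed like this can be built along the following lines (a minimal sketch; the split_sequence helper name is illustrative, not from my original code):

    import numpy as np

    def split_sequence(series, n_steps_in):
        # Slide a window of length n_steps_in over the series;
        # each window is paired with the single next value.
        X, y = [], []
        for i in range(len(series) - n_steps_in):
            X.append(series[i:i + n_steps_in])
            y.append(series[i + n_steps_in])
        # Reshape X to (samples, timesteps, features) as the LSTM expects.
        return (np.array(X, dtype='float32').reshape(-1, n_steps_in, 1),
                np.array(y, dtype='float32'))

    # [1, 2, 3, 4] with n_steps_in = 2 gives X ~ [[1, 2], [2, 3]] and y = [3, 4]
    trainX, trainY = split_sequence([1, 2, 3, 4], 2)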

Now I have a Keras model to predict this type of series:

        from tensorflow.keras.models import Sequential
        from tensorflow.keras.layers import LSTM, RepeatVector, TimeDistributed, Dense

        model = Sequential()
        model.add(LSTM(neurons, activation='relu', input_shape=(n_steps_in, 1)))
        model.add(RepeatVector(1))
        model.add(LSTM(neurons, activation='relu', return_sequences=True))
        model.add(TimeDistributed(Dense(1)))
        model.compile(optimizer='adam', loss=my_loss, run_eagerly=True)
        hist = model.fit(trainX, trainY, epochs=epochs, verbose=2, validation_data=(testX, testY))

And my loss function is:

from tensorflow.keras import backend as kbe

def my_loss(y_true, y_pred):
    print(kbe.shape(y_true))           # shape of the incoming batch
    y_true_c = kbe.cast(y_true, 'float32')
    y_pred_c = kbe.cast(y_pred, 'float32')
    ytn = y_true_c.numpy()             # works only because run_eagerly=True
    print(ytn.shape)
    # Do some complex calculation requiring the elements of y_true_c and y_pred_c.
    # ...
    return result

In my poor understanding, if I call model.fit(trainX, trainY, ...) with trainX corresponding to [[1, 2], [2, 3]] (an array in the proper shape) and trainY corresponding to [3, 4], then y_true inside my_loss should be a tensor corresponding to [3, 4]. However, this is not what I am finding. The print output of my loss function (the shapes of the tensor and the array) is:

    tf.Tensor([32  1  1], shape=(3,), dtype=int32)
    (32, 1, 1)

regardless of the size of the input array. And if I print the values of the array, they bear no resemblance to the original values. Even if I remove all the layers of the model, keeping a bare Sequential, I get the same shapes. So I am completely lost.

Based on the comments above, I did some further searching and found the answer: there is a default batch size at work, as pointed out by Jorge Avila. Keras uses a default batch size of 32, and the truth data and the predicted data arrive at the loss function in batches of that size, so I should pass batch_size=len(trainX) in the call to model.fit(). On top of that, the data comes in shuffled by default, which is why it was even more confusing, so I also have to pass shuffle=False to model.fit().
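
With those two changes the fit call looks like this (a sketch of the corrected call, reusing the variable names from my code above):

    # One batch containing the whole training set, in the original order,
    # so that y_true inside the loss lines up with trainY.
    hist = model.fit(trainX, trainY, epochs=epochs, verbose=2,
                     batch_size=len(trainX), shuffle=False,
                     validation_data=(testX, testY))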

However, as pointed out by Jakub, even with these modifications my intended loss function will not work, because Keras requires the loss to be symbolically differentiable, and that cannot be achieved with logic that operates on the NumPy values (converting to NumPy detaches the computation from the gradient graph). So I have to start from scratch with another loss function acceptable to Keras.
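
For illustration, a Keras-friendly loss stays entirely in tensor ops, so TensorFlow can differentiate through it. Here is a minimal sketch, where a magnitude-weighted MSE stands in for my actual "complex calculation":

    import tensorflow as tf

    def my_loss(y_true, y_pred):
        # Use only tensor ops; never call .numpy() inside a loss.
        y_true_c = tf.cast(y_true, 'float32')
        y_pred_c = tf.cast(y_pred, 'float32')
        # Placeholder computation: squared error weighted by the magnitude
        # of the true value. Any composition of tf ops stays differentiable.
        weights = 1.0 + tf.abs(y_true_c)
        return tf.reduce_mean(weights * tf.square(y_true_c - y_pred_c))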

Keep batch sizes at powers of two between 32 and 1024, depending on your data; powers of two are the common convention and usually work well. But you shouldn't have to set shuffle in fit(): TimeseriesGenerator is where that change needs to be made, not fit().
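
If you feed the model through a generator, shuffling is controlled on the generator itself; a minimal sketch (the window length and batch size here are illustrative):

    import numpy as np
    from tensorflow.keras.preprocessing.sequence import TimeseriesGenerator

    series = np.array([1, 2, 3, 4], dtype='float32').reshape(-1, 1)
    # length is the window size; shuffle=False keeps the original time order.
    gen = TimeseriesGenerator(series, series, length=2,
                              batch_size=2, shuffle=False)
    model.fit(gen, epochs=epochs, verbose=2)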
