Neural Network prediction intervals - MVE method

I want to train a neural network that also returns prediction intervals, so that I can have some idea of my confidence in a prediction. There seem to be four main methods of achieving this, which are summarized in the paper "Comprehensive Review of Neural Network-Based Prediction Intervals and New Advances": https://ieeexplore.ieee.org/document/5966350

I am interested in the mean-variance estimation (MVE) method because it seems to be the simplest to understand. However, I am struggling to get my head around exactly how it would be implemented in Keras.
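As far as I can tell from the paper, MVE trains a single network with two output heads, a mean μ(x) and a variance σ²(x), by minimizing the Gaussian negative log-likelihood over the N training points (my reading, so the constant terms may differ):

  L(\theta) = \frac{1}{2N} \sum_{i=1}^{N} \left[ \log \sigma^2(x_i) + \frac{(y_i - \mu(x_i))^2}{\sigma^2(x_i)} \right]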

I would guess the loss function would be defined by:

import tensorflow as tf

def mve_cost(y_true, y_pred, var_pred):
    # Gaussian negative log-likelihood: log-variance term plus the squared
    # error scaled by the predicted variance
    loss = 0.5 * tf.reduce_sum(
        tf.math.log(var_pred) + tf.square(y_true - y_pred) / var_pred
    )
    return loss

But can a loss function in Keras take three inputs? I have never seen this before. Also, the target for the variance-NN is not known beforehand and takes into account the predictions made by the mean-NN. I suppose this will need some of the more flexible capabilities of the Keras Functional API, but I'm confused about how it would be put together.

  • How do you define the loss function properly for the MVE method?
  • How can the tricky relationship between the two NNs be implemented in the Keras functional API?
  • Does anyone know of an implementation of this method already online?
  • Is there another method of generating prediction intervals for NNs that is more easily understood/implemented in Keras?

Methods like these are not easy to implement, but there is a trick. Define the loss like this:

import keras.backend as K

def regression_nll_loss(sigma_sq, epsilon=1e-6):
    # Closure: the variance tensor sigma_sq is captured here, so the
    # returned loss keeps the standard (y_true, y_pred) signature that
    # Keras expects. epsilon guards against log(0) and division by zero.
    def nll_loss(y_true, y_pred):
        return 0.5 * K.mean(K.log(sigma_sq + epsilon)
                            + K.square(y_true - y_pred) / (sigma_sq + epsilon))

    return nll_loss

Then define a model with two outputs, one for the mean and another for the variance:

from keras.models import Model
from keras.layers import Dense, Input

inp = Input(shape=(1,))
x = Dense(32, activation="relu")(inp)
x = Dense(32, activation="relu")(x)
mean = Dense(1, activation="linear")(x)
# softplus keeps the predicted variance strictly positive
var = Dense(1, activation="softplus")(x)

# Two models sharing the same weights: one supervised on the mean only
# for training, one returning both outputs for inference.
train_model = Model(inp, mean)
pred_model = Model(inp, [mean, var])

train_model.compile(loss=regression_nll_loss(var), optimizer="adam")

train_model.fit(x_train, y_train, ...)  # x_train/y_train are your training data

mean_pred, var_pred = pred_model.predict(some_input)

The trick is to explicitly pass the tensor for the variance into the loss via a closure, so the loss itself only needs two inputs and supervision is only performed on the mean. You then define two models that share weights: one for training and another for testing/inference. The latter returns both the mean and the variance.
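To turn the predicted mean and variance into an actual prediction interval, under the Gaussian assumption you take the mean plus or minus a z-score times the standard deviation. A minimal sketch (the 95% level and z = 1.96 are my choice of example, not part of the method itself):

import numpy as np

mean_pred, var_pred = pred_model.predict(some_input)
std_pred = np.sqrt(var_pred)

# 95% prediction interval under the Gaussian assumption (z = 1.96)
lower = mean_pred - 1.96 * std_pred
upper = mean_pred + 1.96 * std_pred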

Remember to use a softplus activation for the variance output to keep it positive. I have implemented this loss for use with Deep Ensembles; you can find an example here.
