
Extracting the dropout mask from a keras dropout layer?

I would like to extract and store the dropout mask [array of 1s/0s] from a dropout layer in a Sequential Keras model at each batch while training. I was wondering if there is a straightforward way to do this within Keras, or if I would need to switch over to TensorFlow (How to get the dropout mask in Tensorflow).

Would appreciate any help. I'm quite new to TensorFlow and Keras.

There are a couple of functions for the dropout layer (dropout_layer.get_output_mask(), dropout_layer.get_input_mask()) that I tried, but I get None when calling them on the previous layer.

import tensorflow as tf

model = tf.keras.Sequential()
model.add(tf.keras.layers.Flatten(name="flat", input_shape=(28, 28, 1)))
model.add(tf.keras.layers.Dense(
    512,
    activation='relu',
    name = 'dense_1',
    kernel_initializer=tf.keras.initializers.GlorotUniform(seed=123),
    bias_initializer='zeros'))
dropout = tf.keras.layers.Dropout(0.2, name = 'dropout') #want this layer's mask

model.add(dropout)
x = dropout.output_mask
y = dropout.input_mask
model.add(tf.keras.layers.Dense(
    10,
    activation='softmax',
    name='dense_2',
    kernel_initializer=tf.keras.initializers.GlorotUniform(seed=123),
    bias_initializer='zeros'))

model.compile(...)
model.fit(...)

The mask is not easily exposed in Keras; the call goes deep down until it reaches the TensorFlow dropout op.

So, although you're using Keras, the mask will also be a tensor in the graph that can be fetched by name (to find its name: In Tensorflow, get the names of all the Tensors in a graph).

This option will, of course, lack some Keras metadata; you would probably have to wrap it in a Lambda layer so that Keras attaches the necessary information to the tensor. And you must take extra care because the tensor will exist even when not training (when the mask is skipped).
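As a rough sketch of that idea (my addition, assuming graph mode, e.g. TF 1.x or tf.compat.v1 with eager execution disabled; the exact op name depends on your graph and has to be looked up first):

import tensorflow as tf

# List candidate node names belonging to the dropout layer; the exact name
# (something like 'dropout/cond/dropout/...') varies between TF versions.
graph = tf.compat.v1.get_default_graph()
print([n.name for n in graph.as_graph_def().node if 'dropout' in n.name])

# Once the name is known, the mask-related tensor can be fetched by name and
# evaluated together with the training op in a session:
# mask_tensor = graph.get_tensor_by_name('<name you found>:0')
# _, mask_value = sess.run([train_op, mask_tensor], feed_dict={...})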

Now, you can also use a less hacky way that costs a little extra processing:

def getMask(x):
    boolMask = tf.not_equal(x, 0)
    floatMask = tf.cast(boolMask, tf.float32) #or tf.float64
    return floatMask

Use a Lambda(getMask)(output_of_dropout_layer).

But instead of using a Sequential model, you will need a functional API Model.

inputs = tf.keras.layers.Input((28, 28, 1))
outputs = tf.keras.layers.Flatten(name="flat")(inputs)
outputs = tf.keras.layers.Dense(
    512,
    #    activation='relu', # relu is left out here on purpose: its zeros would be indistinguishable from dropout's zeros in getMask
    name = 'dense_1',
    kernel_initializer=tf.keras.initializers.GlorotUniform(seed=123),
    bias_initializer='zeros')(outputs)

outputs = tf.keras.layers.Dropout(0.2, name = 'dropout')(outputs)
mask = tf.keras.layers.Lambda(getMask)(outputs)
# there is no "input_mask" to recover here


#add the missing relu: 
outputs = tf.keras.layers.Activation('relu')(outputs)
outputs = tf.keras.layers.Dense(
    10,
    activation='softmax',
    name='dense_2',
    kernel_initializer=tf.keras.initializers.GlorotUniform(seed=123),
    bias_initializer='zeros')(outputs)

model = tf.keras.Model(inputs, outputs)
model.compile(...)
model.fit(...)

Training and predicting

Since you can't train the masks (it wouldn't make any sense), they should not be an output of the model used for training.

Now, we could try this:

trainingModel = tf.keras.Model(inputs, outputs)
predictingModel = tf.keras.Model(inputs, [outputs, mask])

But masks don't exist in prediction, because dropout is only applied during training, so this doesn't bring us anything useful in the end.
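(A hedged side sketch of my own, not part of the original answer: tf.keras models accept a training argument when called directly, so a mask-only model can be forced into training mode to draw a mask on demand; note this draws a fresh mask, not the one used by the preceding fit step.)

# Hypothetical workaround: force training-mode dropout to obtain a mask sample.
maskModel = tf.keras.Model(inputs, mask)
sample_mask = maskModel(x_batch, training=True)  # dropout active, fresh mask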

The only way for training is then using a dummy loss and dummy targets:

def dummyLoss(y_true, y_pred):
    # This may trigger a "None gradient" problem: the mask output is not trainable
    # and has no connection to any weights.
    return y_true

model.compile(loss=[loss_for_main_output, dummyLoss], ...)

model.fit(x_train, [y_train, np.zeros((len(y_train),) + mask_shape)], ...)

It's not guaranteed that these will work.
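A hedged alternative sketch (my addition, not from the original answer): with a manual training step you can call the two-output model with training=True and collect the mask for every batch, without any dummy loss or dummy targets. It assumes the predictingModel defined above and some iterator of (x, y) batches named dataset:

import tensorflow as tf

optimizer = tf.keras.optimizers.Adam()
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()
saved_masks = []  # one entry per batch

@tf.function
def train_step(x_batch, y_batch):
    with tf.GradientTape() as tape:
        preds, batch_mask = predictingModel(x_batch, training=True)  # dropout active
        loss = loss_fn(y_batch, preds)
    grads = tape.gradient(loss, predictingModel.trainable_variables)
    optimizer.apply_gradients(zip(grads, predictingModel.trainable_variables))
    return batch_mask

for x_batch, y_batch in dataset:  # dataset is assumed: any tf.data.Dataset of batches
    saved_masks.append(train_step(x_batch, y_batch).numpy())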

I found a very hacky way to do this by trivially extending the provided dropout layer. (Almost all of the code is taken from TF.)

import tensorflow as tf
from tensorflow.python.framework import tensor_shape
from tensorflow.python.ops import array_ops, math_ops

class MyDR(tf.keras.layers.Layer):
    def __init__(self, rate, **kwargs):
        super(MyDR, self).__init__(**kwargs)
        self.noise_shape = None
        self.rate = rate

    def _get_noise_shape(self, x, noise_shape=None):
        # If noise_shape is None, return immediately.
        if noise_shape is None:
            return array_ops.shape(x)
        try:
            # Best effort to figure out the intended shape.
            # If not possible, let the op handle it.
            # In eager mode an exception will show up.
            noise_shape_ = tensor_shape.as_shape(noise_shape)
        except (TypeError, ValueError):
            return noise_shape

        if x.shape.dims is not None and len(x.shape.dims) == len(noise_shape_.dims):
            new_dims = []
            for i, dim in enumerate(x.shape.dims):
                if noise_shape_.dims[i].value is None and dim.value is not None:
                    new_dims.append(dim.value)
                else:
                    new_dims.append(noise_shape_.dims[i].value)
            return tensor_shape.TensorShape(new_dims)

        return noise_shape

    def build(self, input_shape):
        self.noise_shape = input_shape
        print(self.noise_shape)
        super(MyDR, self).build(input_shape)

    @tf.function
    def call(self, input):
        self.noise_shape = self._get_noise_shape(input)
        random_tensor = tf.random.uniform(self.noise_shape, seed=1235, dtype=input.dtype)
        keep_prob = 1 - self.rate
        scale = 1 / keep_prob
        # NOTE: if (1.0 + rate) - 1 is equal to rate, then we want to consider that
        # float to be selected, hence we use a >= comparison.
        self.keep_mask = random_tensor >= self.rate
        # NOTE: here is where I save the binary masks.
        # The file grows quite big!
        tf.print(self.keep_mask, output_stream="file://temp/dropout_mask.txt")

        ret = input * scale * math_ops.cast(self.keep_mask, input.dtype)
        return ret
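For completeness, a small usage sketch of my own, mirroring the model from the question (layer names and fit arguments are hypothetical):

# Drop MyDR into the question's Sequential model in place of the stock Dropout layer.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(name="flat", input_shape=(28, 28, 1)),
    tf.keras.layers.Dense(512, activation='relu', name='dense_1'),
    MyDR(0.2, name='my_dropout'),  # writes each batch's mask via tf.print
    tf.keras.layers.Dense(10, activation='softmax', name='dense_2'),
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
# model.fit(x_train, y_train, epochs=1)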
