
How to implement stochastic depth, and randomly drop out an entire convolutional layer?

I'm trying to implement this idea: https://arxiv.org/abs/1603.09382. The basic idea is to drop a Conv2D layer during training based on a "keep prob", like Dropout. I thought I could do it with a custom layer like this:

class StochasticConv2D(layers.Layer):
    def __init__(self, **kwargs):
        super(StochasticConv2D, self).__init__()
        self.conv2D = layers.Conv2D(**kwargs)

    def call(self, inputs, training, keep_prob):
        if training and (np.random.uniform() > keep_prob):
            return inputs
        return self.conv2D(inputs)

When I try that with training = True, I get this error:

ValueError: tf.function-decorated function tried to create variables on non-first call.

If I get that working, I'm not quite sure how to implement the non-training mode. Do I define the model a second time with training = False and load the weights saved in training? And if I pass validation_data to model.fit(), how can "training" be set to False when it runs the validations?

To randomly drop filters, you can add a tf.keras.layers.Dropout layer whose noise_shape has one entry per output channel of the convolution, so that each channel is kept or zeroed as a whole. Here, we have 10:

import tensorflow as tf
import numpy as np

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(10, 3, input_shape=(28, 28, 1)),
    tf.keras.layers.Dropout(.5, noise_shape=(1, 1, 1, 10))])

x = np.random.rand(1, 28, 28, 1)

np.max(model(x, training=True), axis=(1, 2))
array([[-0.        , -0.        ,  0.        ,  0.53856176, -0.        ,
        -0.        ,  0.16301194, -0.        ,  0.76797724,  0.54769045]],
      dtype=float32)

These are the max values of the outputs of the 10 convolutional filters. You can see that about half of them are just zeroes.

To drop out a whole layer, you can do something similar:

import tensorflow as tf
import numpy as np

conv_dropout_layer = tf.keras.Sequential([
    tf.keras.layers.Conv2D(4, 3),
    tf.keras.layers.Dropout(.5, noise_shape=(1, 1, 1, 1))])

x = np.random.rand(1, 28, 28, 1)

conv_dropout_layer(x, training=True)

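Half the time, the entire output of this layer will be zeroed, so the convolution weights receive no gradient on that step. The rest of the time the output is scaled by 2, because Dropout rescales whatever it keeps by 1 / (1 - rate). Note that this zeroes the output rather than passing the input through unchanged.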
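To make the effect of noise_shape=(1, 1, 1, 1) concrete, here is a minimal NumPy sketch that imitates Dropout with a broadcast mask. It is not the Keras implementation; the function name broadcast_dropout is hypothetical and only illustrates the "one coin toss for the whole tensor" behaviour.

```python
import numpy as np


def broadcast_dropout(x, rate, noise_shape, rng):
    """Imitation of inverted dropout with a broadcast keep mask (illustrative only)."""
    # One Bernoulli draw per entry of noise_shape; axes of size 1 are broadcast.
    keep = rng.random(noise_shape) >= rate
    # Kept values are rescaled by 1 / (1 - rate), dropped values become 0.
    return x * keep / (1.0 - rate)


rng = np.random.default_rng(0)
x = np.ones((1, 4, 4, 3))

# A single draw shared by every element: the output is all 0 or all 2.
y = broadcast_dropout(x, rate=0.5, noise_shape=(1, 1, 1, 1), rng=rng)
all_dropped = bool(np.all(y == 0.0))
all_kept = bool(np.all(y == 2.0))
```

Either all_dropped or all_kept is True on any given call, which is why this approach behaves like "drop the whole layer's output" and not like an identity skip.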

To return either the identity or the convolution result, here's what you can do:

import tensorflow as tf
import numpy as np

class StochasticConv2D(tf.keras.layers.Layer):
    def __init__(self, filters, kernel_size, **kwargs):
        super(StochasticConv2D, self).__init__()
        self.filters = filters
        self.kernel_size = kernel_size
        self.conv2D = tf.keras.layers.Conv2D(filters, kernel_size, padding='SAME', **kwargs)

    def call(self, inputs, **kwargs):
        coin_toss = tf.random.uniform(())
        return tf.cond(tf.greater(.5, coin_toss), lambda: inputs, lambda: self.conv2D(inputs))


x = np.random.rand(1, 7, 7, 10)

s = StochasticConv2D(10, 3)

s(x, training=True).shape

This seems to do it (modified version of previous solution):

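import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

class StochasticConv2D(layers.Layer):
    def __init__(self, keep_prob, **kwargs):
        super(StochasticConv2D, self).__init__()
        self.keep_prob = keep_prob
        self.conv2D = layers.Conv2D(**kwargs)

    def call(self, inputs):
        # Only toss the coin in training mode; always convolve at inference.
        if keras.backend.learning_phase():
            coin_toss = tf.random.uniform(())
            # Skip the convolution (identity) with probability 1 - keep_prob.
            return tf.cond(tf.greater(coin_toss, self.keep_prob), lambda: inputs, lambda: self.conv2D(inputs))

        return self.conv2D(inputs)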
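On the questions about non-training mode: Keras passes training=True to the model in model.fit() and training=False in validation, evaluate() and predict(), so a layer whose call accepts a training argument does not need a second model definition. The following is a rough sketch under stated assumptions, not a definitive implementation: it assumes the convolution preserves the input shape (padding="same", stride 1, as many filters as input channels), and it builds the inner Conv2D in build(), which is likely what avoids the "tried to create variables on non-first call" error. The coin toss also uses tf.random, since np.random inside a tf.function is evaluated only once, at trace time.

```python
import tensorflow as tf


class StochasticConv2D(tf.keras.layers.Layer):
    """Conv2D that is replaced by the identity with probability 1 - keep_prob in training."""

    def __init__(self, keep_prob, filters, kernel_size, **conv_kwargs):
        super().__init__()
        self.keep_prob = keep_prob
        # Assumption: shape-preserving convolution so the identity branch matches.
        self.conv2d = tf.keras.layers.Conv2D(
            filters, kernel_size, padding="same", **conv_kwargs)

    def build(self, input_shape):
        # Create kernel and bias up front, so no variable is created later
        # inside a traced branch.
        self.conv2d.build(input_shape)
        super().build(input_shape)

    def call(self, inputs, training=None):
        if not training:
            # Inference and validation: always apply the convolution.
            return self.conv2d(inputs)
        coin_toss = tf.random.uniform(())
        # Training: skip the convolution when the toss exceeds keep_prob.
        return tf.cond(coin_toss > self.keep_prob,
                       lambda: inputs,
                       lambda: self.conv2d(inputs))


layer = StochasticConv2D(keep_prob=0.8, filters=10, kernel_size=3)
x = tf.ones((1, 7, 7, 10))
train_out = layer(x, training=True)   # identity or convolution, chosen at random
eval_out = layer(x, training=False)   # always the convolution
```

Unlike the paper, this sketch does not rescale the output at inference time and drops a plain convolution instead of a residual branch; it only mirrors the layers shown above.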

There's a StochasticDepth layer in tensorflow_addons:

import tensorflow_addons as tfa
import numpy as np
import tensorflow as tf

inputs = tf.keras.Input(shape=(28, 28, 1))
residual = tf.keras.layers.Conv2D(32, kernel_size=(3, 3), 
                                  activation="relu",
                                  padding='SAME')(inputs)
x = tfa.layers.StochasticDepth()([inputs, residual])
x = tf.keras.layers.MaxPooling2D(pool_size=(2, 2))(x)
residual = tf.keras.layers.Conv2D(32, kernel_size=(3, 3), 
                                  activation="relu",
                                  padding='SAME')(x)
x = tfa.layers.StochasticDepth()([x, residual])
x = tf.keras.layers.MaxPooling2D(pool_size=(2, 2))(x)
x = tf.keras.layers.Flatten()(x)
x = tf.keras.layers.Dropout(0.5)(x)
outputs = tf.keras.layers.Dense(10, 
                                activation="softmax")(x)

model = tf.keras.Model(inputs=inputs, outputs=outputs)

model.summary()

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

x_train = x_train.reshape(60000, 28, 28, 1).astype("float32") / 255
x_test = x_test.reshape(10000, 28, 28, 1).astype("float32") / 255

model.compile(
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
    optimizer=tf.keras.optimizers.RMSprop(),
    metrics=["accuracy"],
)

history = model.fit(x_train, y_train, 
                    batch_size=64, epochs=2, 
                    validation_split=0.2)
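The example above uses the layer's default survival probability for every block. The linked paper instead proposes a linear decay rule, where block l out of L survives with probability 1 - (l / L) * (1 - p_L). A small pure-Python sketch of that rule follows; the helper name linear_decay_survival is hypothetical, and the resulting values could be passed as the survival_probability argument of tfa.layers.StochasticDepth, one per block.

```python
def linear_decay_survival(num_blocks, final_survival=0.5):
    """Survival probability for blocks 1..num_blocks under linear decay."""
    return [
        1.0 - (block / num_blocks) * (1.0 - final_survival)
        for block in range(1, num_blocks + 1)
    ]


probs = linear_decay_survival(4)
# probs == [0.875, 0.75, 0.625, 0.5]
```

Early blocks are almost always kept, while the last block is kept with probability final_survival.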
