
How to remove a specific neuron inside a model (TensorFlow Keras)

Is there a way to remove a specific neuron inside a model?

For example, I have a model with a Dense layer of 512 neurons. Is there a way to remove all the neurons whose indices are inside list_indices? Of course removing a neuron will affect the next layer and even the one before.

Example:

I have this common model present in multiple papers:

import functools
import tensorflow as tf

def create_model(only_digits=True):
    data_format = 'channels_last'
    input_shape = [28, 28, 1]
    max_pool = functools.partial(
        tf.keras.layers.MaxPooling2D,
        pool_size=(2, 2),
        padding='same',
        data_format=data_format)
    conv2d = functools.partial(
        tf.keras.layers.Conv2D,
        kernel_size=5,
        padding='same',
        data_format=data_format,
        activation=tf.nn.relu)
    model = tf.keras.models.Sequential([
        conv2d(filters=32, input_shape=input_shape),
        max_pool(),
        conv2d(filters=64),
        max_pool(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dense(10 if only_digits else 62),
    ])
    return model

Let's say that from the layer tf.keras.layers.Dense(512, activation=tf.nn.relu) I want to remove 100 neurons, basically turning them off.

Of course I will have a new model with the layer tf.keras.layers.Dense(412, activation=tf.nn.relu) instead of tf.keras.layers.Dense(512, activation=tf.nn.relu), but this modification should be propagated to the weights of the next layer too, because the connections from the neurons of the dense layer to the next one are deleted as well.

Any input on how to do so? I could do this manually by doing something like this:

The weight shapes of the model, if I get it correctly, are: [5, 5, 1, 32], [32], [5, 5, 32, 64], [64], [3136, 512], [512], [512, 62], [62]
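For reference, these shapes can be printed directly (a small check, assuming model is an instance of the network built above):

# print the shape of every weight tensor in the model
for w in model.get_weights():
    print(w.shape)
# expected: (5, 5, 1, 32) (32,) (5, 5, 32, 64) (64,) (3136, 512) (512,) (512, 62) (62,)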

So I can do something like this:

  1. Generate all the indices I need and save them inside list_indices
  2. Access the weights of the layer tf.keras.layers.Dense(512, activation=tf.nn.relu) and create a tensor with the weights that correspond to the indices in list_indices
  3. Assign the new weight tensor to the layer tf.keras.layers.Dense(412, activation=tf.nn.relu) of the submodel (a rough sketch of these steps follows this list)
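A minimal sketch of these steps, assuming list_indices holds the indices of the neurons to drop, model is an instance of the network above and submodel is a copy of the architecture with Dense(412) in place of Dense(512) (all of these names are placeholders):

dense_512 = model.layers[-2]              # the Dense(512) layer
kernel, bias = dense_512.get_weights()    # kernel: (3136, 512), bias: (512,)

# indices of the neurons that survive the pruning
keep = [i for i in range(kernel.shape[1]) if i not in list_indices]

new_kernel = kernel[:, keep]              # (3136, 412): keep only the surviving columns
new_bias = bias[keep]                     # (412,)

# step 3: copy the reduced weights into the Dense(412) layer of the submodel
submodel.layers[-2].set_weights([new_kernel, new_bias])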

The problem is that I don't know how to get the correct weights of the next layer, the ones that correspond to the indices of the weights I just created and that I should assign to the next layer of the submodel. I hope I have explained myself clearly.

Thanks, Leyla.

Your operation is known in the literature as selective dropout. There is no actual need to create a different model every time: you just need to multiply the output of the selected neurons by 0, so that the input of the next layer does not take those activations into account.

Note that if you "turn off" a neuron in layer Ln, it does not completely "turn off" any neuron in layer Ln+1, supposing that both are fully-connected (dense) layers: each neuron in layer Ln+1 is connected to ALL the neurons in the previous layer. In other words, removing a neuron in a fully-connected (dense) layer does not affect the dimension of the next one.

You can simply implement this operation with the Multiply layer (Keras). The drawback is that you need to learn how to use the Keras functional API. There are other ways, but they are more complex than this (e.g. a custom layer); besides, the functional API is very useful and powerful in many respects, so it is very much worth reading about!

Your model would become like this:

import functools
import tensorflow as tf

data_format = 'channels_last'
input_shape = [28, 28, 1]
max_pool = ...
conv2d = ...

# convert a list of indexes to a (1, units) mask tensor
def make_index_weights(indexes, units):
    # 1.0 for the neurons to keep, 0.0 for the indexes to drop
    weights = [ float(i not in indexes) for i in range(units) ]
    # converting the list to a tensor
    weights = tf.convert_to_tensor(weights)
    # reshaping to the correct broadcastable format
    weights = tf.reshape(weights, (1, units))
    # ensuring it is a float tensor
    weights = tf.cast(weights, 'float32')
    return weights

# layer builder utility
def selective_dropout(units, indexes, **kwargs):
    mask = make_index_weights(indexes, units)
    dense = tf.keras.layers.Dense(units, **kwargs)
    mul = tf.keras.layers.Multiply()
    # return the tensor builder
    return lambda inputs: mul([ dense(inputs), mask ])

def create_model(only_digits=True):
    input_layer = tf.keras.layers.Input(input_shape)
    conv_1  = conv2d(filters=32, input_shape=input_shape)(input_layer)
    maxp_1  = max_pool()(conv_1)
    conv_2  = conv2d(filters=64)(maxp_1)
    maxp_2  = max_pool()(conv_2)
    flat    = tf.keras.layers.Flatten()(maxp_2)
    # INDEXES: the list of neuron indexes to silence
    sel_drop_1 = selective_dropout(512, INDEXES, activation=tf.nn.relu)(flat)
    dense_2 = tf.keras.layers.Dense(10 if only_digits else 62)(sel_drop_1)
    output_layer = dense_2
    model = tf.keras.models.Model([ input_layer ], [ output_layer ])
    return model

Now you just need to build up your INDEXES list according to the indexes of those neurons you need to remove.

In your case, the tensor would have a shape of 1x512 because there are 512 units (neurons) in the dense layer, so you need to provide one mask weight per unit. The selective_dropout function lets you pass a list of indexes to discard and automatically builds up the desired tensor.

For example, if you want to remove neurons 1, 10 and 12, you just pass the list [1, 10, 12] to the function and it will produce a 1x512 tensor with 0.0 at those positions and 1.0 at all the others.
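As a quick sanity check, using the make_index_weights helper from the code above (assuming eager execution):

mask = make_index_weights([1, 10, 12], 512)
print(mask.shape)          # (1, 512)
print(mask[0, 1].numpy())  # 0.0 -> neuron 1 is silenced
print(mask[0, 0].numpy())  # 1.0 -> neuron 0 is kept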

EDIT:

As you mentioned, you strictly need to reduce the number of parameters in your model.

Each dense layer is described by the relation y = Wx + B, where W is the kernel (or weight matrix) and B is the bias vector. W is a matrix of dimensions INPUTxOUTPUT, where INPUT is the output shape of the previous layer and OUTPUT is the number of neurons/units in the layer; B is just a vector of dimension 1xOUTPUT (but we are not interested in it here).

The problem now is that dropping N neurons in layer Ln induces the drop of NxOUTPUT weights in layer Ln+1. Let's be practical with some numbers. In your case (supposing only_digits is true) you start with:

Nx512 -> 512x10 (5120 weights)

And after dropping 100 neurons (which means dropping 100*10 = 1000 weights):

Nx412 -> 412x10 (4120 weights)

Now each column of the W matrix describes one neuron of layer Ln+1 (as a vector of weights whose dimension equals the previous layer's output dimension, in our case 512 or 412). Each row of the matrix instead corresponds to a single neuron of the previous layer Ln.

W[0,0] indicates the relation between the first neuron of layer n and the first neuron of layer n+1.

  • W[0,0] -> 1st neuron of n, 1st neuron of n+1
  • W[0,1] -> 1st neuron of n, 2nd neuron of n+1
  • W[1,0] -> 2nd neuron of n, 1st neuron of n+1

And so on. So you can just remove from this matrix all the rows that relate to the neuron indexes you removed: index 0 -> row 0.

You can access the W matrix as a tensor from the dense layer with dense.kernel.
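Putting the EDIT together, a minimal sketch of the actual shrinking. The names drop, model and pruned_model are placeholders: drop is the list of neuron indexes to remove, model the original network, and pruned_model a copy of the architecture with Dense(412) instead of Dense(512):

import numpy as np

dense_a = model.layers[-2]                   # Dense(512)
dense_b = model.layers[-1]                   # Dense(10), assuming only_digits=True

kernel_a, bias_a = dense_a.get_weights()     # (3136, 512), (512,)
kernel_b, bias_b = dense_b.get_weights()     # (512, 10), (10,)

# drop the selected neurons: columns of this layer's kernel and entries of its
# bias, plus the matching ROWS of the next layer's kernel
kernel_a = np.delete(kernel_a, drop, axis=1)   # (3136, 412)
bias_a   = np.delete(bias_a, drop, axis=0)     # (412,)
kernel_b = np.delete(kernel_b, drop, axis=0)   # (412, 10)

pruned_model.layers[-2].set_weights([kernel_a, bias_a])
pruned_model.layers[-1].set_weights([kernel_b, bias_b])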
