How to implement neural network pruning?

I trained a model in Keras and I'm thinking of pruning my fully connected network. I'm a little bit lost on how to prune the layers.

The authors of 'Learning both Weights and Connections for Efficient Neural Networks' say that they add a mask to threshold the weights of a layer. I can try to do the same and fine-tune the trained model. But how does it reduce the model size and the number of computations?

Based on the discussion in the comments, here is a way to prune a layer (a weight matrix) of your neural network. Essentially, the method selects the k% smallest weights (elements of the matrix) based on their norm and sets them to zero. That way, the corresponding matrix can be treated as a sparse matrix, and we can perform dense-sparse matrix multiplication, which can be faster if enough weights are pruned.

import tensorflow as tf


def weight_pruning(w: tf.Variable, k: float) -> tf.Variable:
    """Performs pruning on a weight matrix w in the following way:

    - The absolute values of all elements in the weight matrix are computed.
    - The indices of the smallest k% elements based on their absolute values are selected.
    - All elements with the matching indices are set to 0.

    Args:
        w: The weight matrix.
        k: The percentage of weights that should be pruned from the matrix.

    Returns:
        The weight-pruned weight matrix.

    """
    # Convert the pruning percentage k into an absolute number of weights to prune.
    k = tf.cast(tf.round(tf.size(w, out_type=tf.float32) * tf.constant(k)), dtype=tf.int32)
    w_reshaped = tf.reshape(w, [-1])
    # Indices of the k weights with the smallest absolute value.
    _, indices = tf.nn.top_k(tf.negative(tf.abs(w_reshaped)), k, sorted=True, name=None)
    # Binary mask: 0 at the pruned positions, 1 everywhere else.
    mask = tf.scatter_nd_update(
        tf.Variable(tf.ones_like(w_reshaped, dtype=tf.float32), name="mask", trainable=False),
        tf.reshape(indices, [-1, 1]),
        tf.zeros([k], tf.float32),
    )

    return w.assign(tf.reshape(w_reshaped * mask, tf.shape(w)))
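
A minimal usage sketch of weight_pruning (not part of the original answer; it assumes TensorFlow 1.x, and the kernel below is created by hand purely for illustration). The mask variable created inside weight_pruning reads the kernel, so the kernel should be initialized first:

import numpy as np
import tensorflow as tf

w = tf.Variable(np.random.randn(784, 128).astype(np.float32), name="dense_kernel")
prune_op = weight_pruning(w, k=0.5)  # zero out the 50% of weights with the smallest absolute value

with tf.Session() as sess:
    sess.run(w.initializer)                      # initialize the kernel first ...
    sess.run(tf.global_variables_initializer())  # ... then the mask created inside weight_pruning
    sess.run(prune_op)
    print("fraction of zero weights:", (sess.run(w) == 0).mean())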

While the method above prunes individual connections (weights), the method below prunes a whole neuron from a weight matrix. Namely, it selects the k% smallest neurons (columns of the weight matrix) based on their Euclidean norm and sets them to zero.

def unit_pruning(w: tf.Variable, k: float) -> tf.Variable:
    """Performs pruning on a weight matrix w in the following way:

    - The Euclidean norm of each column is computed.
    - The indices of the smallest k% columns based on their Euclidean norms are selected.
    - All elements in the columns that have the matching indices are set to 0.

    Args:
        w: The weight matrix.
        k: The percentage of columns that should be pruned from the matrix.

    Returns:
        The unit-pruned weight matrix.

    """
    # Convert the pruning percentage k into an absolute number of columns to prune.
    k = tf.cast(
        tf.round(tf.cast(tf.shape(w)[1], tf.float32) * tf.constant(k)), dtype=tf.int32
    )
    # Euclidean (L2) norm of each column, i.e. of each neuron's incoming weights.
    norm = tf.norm(w, axis=0)
    row_indices = tf.tile(tf.range(tf.shape(w)[0]), [k])
    # Indices of the k columns with the smallest norm.
    _, col_indices = tf.nn.top_k(tf.negative(norm), k, sorted=True, name=None)
    col_indices = tf.reshape(
        tf.tile(tf.reshape(col_indices, [-1, 1]), [1, tf.shape(w)[0]]), [-1]
    )
    # (row, column) index pairs covering every element of the selected columns.
    indices = tf.stack([row_indices, col_indices], axis=1)

    return w.assign(
        tf.scatter_nd_update(w, indices, tf.zeros(tf.shape(w)[0] * k, tf.float32))
    )
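
Usage of unit_pruning is analogous (again a sketch rather than part of the original answer, reusing the imports from the snippet above):

w2 = tf.Variable(np.random.randn(784, 128).astype(np.float32), name="dense_kernel_2")
prune_cols = unit_pruning(w2, k=0.25)  # zero out the 25% of columns (neurons) with the smallest L2 norm

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(prune_cols)
    print("fully zeroed columns:", (sess.run(w2) == 0).all(axis=0).sum())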

Finally, this GitHub repository goes through the pruning methods explained here and performs experiments on the MNIST dataset.

If you add a mask, then only a subset of your weights will contribute to the computation, hence your model will be pruned. For instance, autoregressive models use a mask to hide the weights that refer to future data, so that the output at time step t only depends on time steps 0, 1, ..., t-1.
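
To make the mask idea concrete, here is a small sketch of my own (the array names are made up): with a fixed binary mask, the pruned weights simply never contribute to the matrix product, although without a sparse representation the multiplication itself still costs the same number of operations.

import numpy as np

x = np.random.randn(32, 784).astype(np.float32)                        # a batch of inputs
w = np.random.randn(784, 128).astype(np.float32)                       # dense-layer kernel
mask = (np.abs(w) >= np.percentile(np.abs(w), 90)).astype(np.float32)  # keep only the 10% largest weights

y = x @ (w * mask)  # only the unmasked weights contribute to the output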

In your case, since you have a simple fully connected layer, it is better to use dropout. It randomly turns off some neurons at each training iteration, which reduces the computational complexity of that step. However, the main reason dropout was invented is to tackle overfitting: by turning some neurons off at random, you reduce co-dependencies between neurons, i.e. you avoid having some neurons rely on others. Moreover, at each iteration your model is different (a different number of active neurons and different connections between them), so your final model can be interpreted as an ensemble (collection) of several different models, each specialized (we hope) in understanding a specific subset of the input space.
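
For reference, adding dropout to a fully connected Keras model is a one-line change (a minimal sketch using tf.keras; the layer sizes are placeholders):

from tensorflow import keras

model = keras.Sequential([
    keras.layers.Dense(128, activation="relu", input_shape=(784,)),
    keras.layers.Dropout(0.5),  # randomly zeroes 50% of the activations, at training time only
    keras.layers.Dense(10, activation="softmax"),
])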
