
Creating and training only specified weights in TensorFlow or PyTorch

I am wondering if there is a way in TensorFlow, PyTorch, or some other library to selectively connect neurons. I want to make a network with a very large number of neurons in each layer, but very few connections between layers.

Note that I do not think this is a duplicate of this answer: Selectively zero weights in TensorFlow?. I implemented a custom Keras layer using essentially the same method that appears in that question: creating a dense layer in which all but the specified weights are ignored in training and evaluation. This fulfills part of what I want to do by not training the specified weights and not using them for prediction. But the problem is that I still waste memory storing the untrained weights, and I waste time calculating the gradients of the zeroed weights. What I would like is for the computation of the gradient matrices to involve only sparse matrices, so that I do not waste time and memory.
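For concreteness, a minimal sketch of the kind of masked layer described above (the `MaskedDense` class name and the mask layout are illustrative assumptions, not the asker's actual code):

```python
import numpy as np
import tensorflow as tf

class MaskedDense(tf.keras.layers.Layer):
    """Dense layer whose kernel is multiplied elementwise by a fixed 0/1 mask,
    so masked connections never contribute to the output."""

    def __init__(self, units, mask, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.mask = tf.constant(mask, dtype=tf.float32)  # shape (in_dim, units)

    def build(self, input_shape):
        self.kernel = self.add_weight(
            shape=(input_shape[-1], self.units), initializer="glorot_uniform")

    def call(self, inputs):
        # The full dense kernel is still stored, and gradients for the masked
        # entries are still computed (they just come out zero) -- this is
        # exactly the memory/compute waste the question is about.
        return tf.matmul(inputs, self.kernel * self.mask)

# keep only a single connection: input 0 -> output 0
mask = np.zeros((4, 3), dtype="float32")
mask[0, 0] = 1.0
layer = MaskedDense(3, mask)
out = layer(tf.ones((2, 4)))  # columns 1 and 2 of the output are exactly zero
```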

Is there a way to selectively create and train weights without wasting memory? If my question is unclear, or there is more information that would be helpful for me to provide, please let me know. I would like to be helpful as a question-asker.

The usual, simple solution is to initialize your weight matrices with zeros wherever there should be no connection. You store a mask of the locations of these zeros, and set the weights at those positions back to zero after each weight update. You need to do this because the gradient at a zero weight may be nonzero, which would introduce nonzero weights (i.e. connections) where you don't want any.

Pseudocode:

# setup network
weights = sparse_init()  # only nonzero for existing connections
zero_mask = where(weights == 0)

# train
for e in range(num_epochs):
    train_operation()  # may lead to introduction of new connections
    weights[zero_mask] = 0  # so we set them to zero again
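A runnable PyTorch version of the pseudocode above might look like this (the layer size, sparsity pattern, and loss are illustrative assumptions):

```python
import torch

torch.manual_seed(0)

# Dense weight matrix with most connections absent (zero).
weights = torch.randn(4, 4)
weights[torch.rand(4, 4) < 0.75] = 0.0  # keep only ~25% of the connections
weights.requires_grad_(True)

mask = (weights == 0)  # boolean mask of the non-connections

optimizer = torch.optim.SGD([weights], lr=0.1)
x = torch.randn(8, 4)

for epoch in range(5):
    optimizer.zero_grad()
    out = x @ weights
    loss = out.pow(2).mean()
    loss.backward()
    optimizer.step()  # may introduce nonzero values at masked positions
    with torch.no_grad():
        weights[mask] = 0.0  # so we set them back to zero after each update
```

Note that this still computes dense gradients for every entry; it only keeps the *connectivity* sparse, not the memory or compute.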

Both TensorFlow and PyTorch support sparse tensors (torch.sparse, tf.sparse).

My intuitive understanding is that if you were willing to write your network using the respective low-level APIs (e.g. actually implementing the forward pass yourself), you could store your weight matrices as sparse tensors. That would in turn result in sparse connectivity, since the weight matrix of layer [L] defines the connectivity between the neurons of layer [L-1] and the neurons of layer [L].
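A minimal sketch of that idea in PyTorch, assuming a COO sparse weight matrix and a hand-written forward pass (sizes and indices are made up for illustration):

```python
import torch

# Sparse 4x4 weight matrix with only 3 explicit connections out of 16.
indices = torch.tensor([[0, 1, 3],    # row index  (output neuron)
                        [2, 0, 1]])   # col index  (input neuron)
values = torch.randn(3, requires_grad=True)  # only these 3 weights exist
w_sparse = torch.sparse_coo_tensor(indices, values, size=(4, 4))

# Hand-written forward pass: sparse matrix times a dense batch of inputs.
x = torch.randn(4, 5)                 # 5 samples, 4 features each, as columns
out = torch.sparse.mm(w_sparse, x)    # only the stored weights participate

loss = out.pow(2).mean()
loss.backward()
# Only the 3 stored values receive gradients, so no compute or memory is
# spent on the non-existent connections.
```

Here only `values` is a trainable leaf, so both storage and gradient computation scale with the number of connections rather than with the full dense matrix.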

