
Fine-tune BERT model by removing unused layers

I came across this code for BERT sentiment analysis in which the unused layers are removed and the trainable variables/weights are updated. I am looking for documentation that shows what the different layers in BERT are, how to remove the unused layers, how to add weights, and so on, but I have been unable to find any.

# Note: tf_hub.Module and the "tokens" signature are the TF 1.x TF-Hub API,
# so this snippet assumes TensorFlow 1.x (or tf.compat.v1 behaviour).
import tensorflow as tf
import tensorflow_hub as tf_hub
from tensorflow.keras import backend as K

BERT_PATH = "https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1"
MAX_SEQ_LENGTH = 512

class BertLayer(tf.keras.layers.Layer):
  def __init__(self, bert_path, n_fine_tune_encoders=10, **kwargs,):
    self.n_fine_tune_encoders = n_fine_tune_encoders
    self.trainable = True
    self.output_size = 768
    self.bert_path = bert_path
    super(BertLayer, self).__init__(**kwargs)     
  def build(self, input_shape):
    self.bert = tf_hub.Module(self.bert_path,
                              trainable=self.trainable, 
                              name=f"{self.name}_module")
    # Remove unused layers
    trainable_vars = self.bert.variables
    trainable_vars = [var for var in trainable_vars 
                              if not "/cls/" in var.name]
    trainable_layers = ["embeddings", "pooler/dense"]

    # Select how many layers to fine tune
    for i in range(self.n_fine_tune_encoders+1):
        trainable_layers.append(f"encoder/layer_{str(10 - i)}")

    # Update trainable vars to contain only the specified layers
    trainable_vars = [var for var in trainable_vars
                              if any([l in var.name 
                                          for l in trainable_layers])]

    # Add to trainable weights
    for var in trainable_vars:
        self._trainable_weights.append(var)
    for var in self.bert.variables:
        if var not in self._trainable_weights:# and 'encoder/layer' not in var.name:
            self._non_trainable_weights.append(var)
    print('Trainable layers:', len(self._trainable_weights))
    print('Non Trainable layers:', len(self._non_trainable_weights))

    super(BertLayer, self).build(input_shape)
 
  def call(self, inputs):  
    inputs = [K.cast(x, dtype="int32") for x in inputs]
    input_ids, input_mask, segment_ids = inputs
    bert_inputs = dict(input_ids=input_ids, 
                       input_mask=input_mask, 
                       segment_ids=segment_ids)
    
    pooled = self.bert(inputs=bert_inputs, 
                       signature="tokens", 
                       as_dict=True)["pooled_output"]

    return pooled

  def compute_output_shape(self, input_shape):
    return (input_shape[0], self.output_size)

model = build_model(bert_path=BERT_PATH, max_seq_length=MAX_SEQ_LENGTH, n_fine_tune_encoders=10)
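
The build_model function itself is not shown in the question. Purely for illustration, a minimal sketch of what such a function could look like is below, assuming the three int32 inputs the TF-Hub BERT module expects (input IDs, input mask, segment IDs) and a single sigmoid output for binary sentiment; the dense head sizes are illustrative, not from the original code.

# Hypothetical build_model sketch -- not part of the original question's code.
def build_model(bert_path, max_seq_length, n_fine_tune_encoders):
    # Three int32 inputs expected by the TF-Hub BERT module
    in_ids = tf.keras.layers.Input(shape=(max_seq_length,), name="input_ids")
    in_mask = tf.keras.layers.Input(shape=(max_seq_length,), name="input_mask")
    in_segment = tf.keras.layers.Input(shape=(max_seq_length,), name="segment_ids")
    bert_inputs = [in_ids, in_mask, in_segment]

    # Custom layer defined above; returns the 768-dim pooled output
    bert_output = BertLayer(bert_path=bert_path,
                            n_fine_tune_encoders=n_fine_tune_encoders)(bert_inputs)

    # Illustrative classification head
    dense = tf.keras.layers.Dense(256, activation="relu")(bert_output)
    pred = tf.keras.layers.Dense(1, activation="sigmoid")(dense)

    model = tf.keras.models.Model(inputs=bert_inputs, outputs=pred)
    model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
    return model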

Can anyone please point me to resources for learning about the different layers in BERT, how to remove some layers, how to add weights, how many layers to fine-tune, etc.?

As mentioned in the comments, you can't actually delete layers from the model architecture. However, you can freeze layers that you do not want to be trained: a frozen layer is not trained and its parameters are not updated.

You can see the layers with this:

from transformers import AutoModel

bert_model = AutoModel.from_pretrained("bert-base-uncased")
print(bert_model)
# or list every parameter name
for name, param in bert_model.named_parameters():
    print(name)

You can also freeze one or more layers like this:

for name, param in bert_model.named_parameters():
    if 'classifier' not in name:
        param.requires_grad = False

For example, the script above will freeze all layers of bert-base-uncased, since none of its parameter names contain "classifier", and you would then only get embedding vectors from BERT's output. Apart from this, there is no need to explicitly mark any layer as trainable, because every layer you have not frozen will continue to train.
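
If you want to fine-tune only the last few encoder layers, similar to what the question's n_fine_tune_encoders does, here is a minimal sketch with the transformers library, assuming bert-base-uncased with its 12 encoder layers; the value of n_fine_tune_encoders is just illustrative.

from transformers import AutoModel

bert_model = AutoModel.from_pretrained("bert-base-uncased")
n_fine_tune_encoders = 10  # illustrative value, as in the question

# Freeze everything first
for param in bert_model.parameters():
    param.requires_grad = False

# Unfreeze the last N encoder layers
for layer in bert_model.encoder.layer[-n_fine_tune_encoders:]:
    for param in layer.parameters():
        param.requires_grad = True

# Unfreeze the pooler as well
for param in bert_model.pooler.parameters():
    param.requires_grad = True

# Verify what will actually be updated during training
trainable = sum(p.numel() for p in bert_model.parameters() if p.requires_grad)
total = sum(p.numel() for p in bert_model.parameters())
print(f"Trainable parameters: {trainable} / {total}")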

You can also check out all of BERT's heads and layer structures in this document.
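
As a quick programmatic alternative, you can inspect the head and layer structure directly; a sketch using BertForSequenceClassification, chosen here only as an example of a model that carries a task head:

from transformers import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Top-level modules: bert (embeddings, encoder, pooler), dropout, classifier head
for name, module in model.named_children():
    print(name, type(module).__name__)

# Number of encoder layers in the backbone
print(len(model.bert.encoder.layer))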
