
Fine-tune BERT model by removing unused layers

I came across this code for BERT sentiment analysis in which the unused layers are removed and the trainable variables/weights are updated. I am looking for documentation that shows what the different layers in BERT are, how to remove the unused layers, how to add weights, and so on, but I have been unable to find any.

# Note: tf_hub.Module and the "tokens" signature are the TF 1.x TF-Hub API,
# so this snippet assumes TensorFlow 1.x (or tf.compat.v1 behaviour).
import tensorflow as tf
import tensorflow_hub as tf_hub
from tensorflow.keras import backend as K

BERT_PATH = "https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1"
MAX_SEQ_LENGTH = 512

class BertLayer(tf.keras.layers.Layer):
  def __init__(self, bert_path, n_fine_tune_encoders=10, **kwargs,):
    self.n_fine_tune_encoders = n_fine_tune_encoders
    self.trainable = True
    self.output_size = 768
    self.bert_path = bert_path
    super(BertLayer, self).__init__(**kwargs)     
  def build(self, input_shape):
    self.bert = tf_hub.Module(self.bert_path,
                              trainable=self.trainable, 
                              name=f"{self.name}_module")
    # Remove unused layers
    trainable_vars = self.bert.variables
    trainable_vars = [var for var in trainable_vars 
                              if not "/cls/" in var.name]
    trainable_layers = ["embeddings", "pooler/dense"]

    # Select how many layers to fine tune
    for i in range(self.n_fine_tune_encoders+1):
        trainable_layers.append(f"encoder/layer_{str(10 - i)}")

    # Update trainable vars to contain only the specified layers
    trainable_vars = [var for var in trainable_vars
                              if any([l in var.name 
                                          for l in trainable_layers])]

    # Add to trainable weights
    for var in trainable_vars:
        self._trainable_weights.append(var)
    for var in self.bert.variables:
        if var not in self._trainable_weights:# and 'encoder/layer' not in var.name:
            self._non_trainable_weights.append(var)
    print('Trainable layers:', len(self._trainable_weights))
    print('Non Trainable layers:', len(self._non_trainable_weights))

    super(BertLayer, self).build(input_shape)
 
  def call(self, inputs):  
    inputs = [K.cast(x, dtype="int32") for x in inputs]
    input_ids, input_mask, segment_ids = inputs
    bert_inputs = dict(input_ids=input_ids, 
                       input_mask=input_mask, 
                       segment_ids=segment_ids)
    
    pooled = self.bert(inputs=bert_inputs, 
                       signature="tokens", 
                       as_dict=True)["pooled_output"]

    return pooled

  def compute_output_shape(self, input_shape):
    return (input_shape[0], self.output_size)

model = build_model(bert_path=BERT_PATH, max_seq_length=MAX_SEQ_LENGTH, n_fine_tune_encoders=10)
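
The build_model function itself is not shown in the question. Purely for illustration, a minimal sketch of what such a function could look like is below, assuming the three int32 inputs the TF-Hub BERT module expects (input IDs, input mask, segment IDs) and a single sigmoid output for binary sentiment; the dense head sizes are illustrative, not from the original code.

# Hypothetical build_model sketch -- not part of the original question's code.
def build_model(bert_path, max_seq_length, n_fine_tune_encoders):
    # Three int32 inputs expected by the TF-Hub BERT module
    in_ids = tf.keras.layers.Input(shape=(max_seq_length,), name="input_ids")
    in_mask = tf.keras.layers.Input(shape=(max_seq_length,), name="input_mask")
    in_segment = tf.keras.layers.Input(shape=(max_seq_length,), name="segment_ids")
    bert_inputs = [in_ids, in_mask, in_segment]

    # Custom layer defined above; returns the 768-dim pooled output
    bert_output = BertLayer(bert_path=bert_path,
                            n_fine_tune_encoders=n_fine_tune_encoders)(bert_inputs)

    # Illustrative classification head
    dense = tf.keras.layers.Dense(256, activation="relu")(bert_output)
    pred = tf.keras.layers.Dense(1, activation="sigmoid")(dense)

    model = tf.keras.models.Model(inputs=bert_inputs, outputs=pred)
    model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
    return model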

Can anyone please point me to resources for learning about the different layers in BERT, how to remove some layers, how to add weights, how many layers to fine-tune, etc.?

As mentioned in the comments, you can't actually delete layers from the model architecture. However, you can freeze layers that you do not want to be trained: a frozen layer is not trained and its parameters are not updated.

You can see the layers with this:

from transformers import AutoModel

bert_model = AutoModel.from_pretrained("bert-base-uncased")
print(bert_model)
# or list every parameter name
for name, param in bert_model.named_parameters():
    print(name)

You can also freeze one or more layers like this:

for name, param in bert_model.named_parameters():
    if 'classifier' not in name:
        param.requires_grad = False

For example, the script above will freeze all layers of bert-base-uncased, since none of its parameter names contain "classifier", and you would then only get embedding vectors from BERT's output. Apart from this, there is no need to explicitly mark any layer as trainable, because every layer you have not frozen will continue to train.
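
If you want to fine-tune only the last few encoder layers, similar to what the question's n_fine_tune_encoders does, here is a minimal sketch with the transformers library, assuming bert-base-uncased with its 12 encoder layers; the value of n_fine_tune_encoders is just illustrative.

from transformers import AutoModel

bert_model = AutoModel.from_pretrained("bert-base-uncased")
n_fine_tune_encoders = 10  # illustrative value, as in the question

# Freeze everything first
for param in bert_model.parameters():
    param.requires_grad = False

# Unfreeze the last N encoder layers
for layer in bert_model.encoder.layer[-n_fine_tune_encoders:]:
    for param in layer.parameters():
        param.requires_grad = True

# Unfreeze the pooler as well
for param in bert_model.pooler.parameters():
    param.requires_grad = True

# Verify what will actually be updated during training
trainable = sum(p.numel() for p in bert_model.parameters() if p.requires_grad)
total = sum(p.numel() for p in bert_model.parameters())
print(f"Trainable parameters: {trainable} / {total}")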

You can also check out all of BERT's heads and layer structures in this document.
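
As a quick programmatic alternative, you can inspect the head and layer structure directly; a sketch using BertForSequenceClassification, chosen here only as an example of a model that carries a task head:

from transformers import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Top-level modules: bert (embeddings, encoder, pooler), dropout, classifier head
for name, module in model.named_children():
    print(name, type(module).__name__)

# Number of encoder layers in the backbone
print(len(model.bert.encoder.layer))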
