
Weights of pre-trained BERT model not initialized

I am using the Language Interpretability Toolkit (LIT) to load and analyze a BERT model that I pre-trained on an NER task.

However, when I start the LIT script with the path to my pre-trained model passed to it, it fails to initialize the weights and tells me:

    modeling_utils.py:648] loading weights file bert_remote/examples/token-classification/Data/Models/results_21_03_04_cleaned_annotations/04.03._8_16_5e-5_cleaned_annotations/04-03-2021 (15.22.23)/pytorch_model.bin
    modeling_utils.py:739] Weights of BertForTokenClassification not initialized from pretrained model: ['bert.pooler.dense.weight', 'bert.pooler.dense.bias']
    modeling_utils.py:745] Weights from pretrained model not used in BertForTokenClassification: ['bert.embeddings.position_ids']

It then simply uses the bert-base-german-cased version of BERT, which of course doesn't have my custom labels and thus fails to predict anything. I think it might have to do with PyTorch, but I can't find the error.
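
Before digging into PyTorch itself, one quick check (not part of the original question; the path below is a placeholder) is to confirm that the path passed to LIT really resolves to the fine-tuned checkpoint directory, rather than silently falling back to the base model:

    import os
    import transformers

    # Placeholder -- substitute the real checkpoint directory (the folder that holds
    # pytorch_model.bin and config.json), not the .bin file itself.
    model_dir = "path/to/checkpoint_dir"

    print(os.listdir(model_dir))   # expect pytorch_model.bin, config.json, vocab.txt, ...
    config = transformers.AutoConfig.from_pretrained(model_dir)
    print(config.architectures)    # e.g. ['BertForTokenClassification']
    print(config.model_type)       # 'bert'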

If relevant, here is how I load my dataset into CoNLL 2003 format (a modification of the dataloader scripts found here):

    def __init__(self):

        # Read CoNLL test file into LIT examples (one token/label pair per example)

        self._examples = []

        data_path = "lit_remote/lit_nlp/examples/datasets/NER_Data"
        with open(os.path.join(data_path, "test.txt"), "r", encoding="utf-8") as f:
            lines = f.readlines()

        for line in lines[:2000]:
            if line != "\n":
                # Each CoNLL line is "token label"; strip the trailing newline so the label is clean.
                token, label = line.strip().split(" ")
                self._examples.append({
                    'token': token,
                    'label': label,
                })
            else:
                # Blank line in CoNLL = sentence boundary; keep it as a neutral token.
                self._examples.append({
                    'token': "\n",
                    'label': "O"
                })

    def spec(self):
        return {
            'token': lit_types.Tokens(),
            'label': lit_types.SequenceTags(align="token"),
        }
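
As a quick sanity check of the loader (purely illustrative; NERDataset is a hypothetical name for the class the two methods above belong to, which the question does not show), the parsed examples can be inspected directly:

    # NERDataset is a placeholder name for the dataset class that holds __init__/spec above.
    ds = NERDataset()
    print(len(ds._examples))   # number of token/label pairs read (capped at 2000 input lines)
    print(ds._examples[:3])    # each entry looks like {'token': ..., 'label': ...}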

And this is how I initialize the model and start the LIT server (a modification of the simple_pytorch_demo.py script found here):

    def __init__(self, model_name_or_path):
        self.tokenizer = transformers.AutoTokenizer.from_pretrained(
            model_name_or_path)
        model_config = transformers.AutoConfig.from_pretrained(
            model_name_or_path,
            num_labels=15,  # FIXME CHANGE
            output_hidden_states=True,
            output_attentions=True,
        )
        # This is just a regular PyTorch model; _from_pretrained is the LIT demo's helper
        # that wraps transformers' from_pretrained (falling back to a TF checkpoint if needed).
        self.model = _from_pretrained(
            transformers.AutoModelForTokenClassification,
            model_name_or_path,
            config=model_config)
        self.model.eval()
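
One possible way around the hard-coded num_labels=15 (marked FIXME above) is to read the label mapping from the fine-tuned checkpoint's config.json, assuming the fine-tuning run stored id2label there (the HuggingFace token-classification example scripts do write it). A sketch:

    # Sketch only: derive the label list from the checkpoint's config instead of
    # hard-coding num_labels=15. Assumes id2label was saved during fine-tuning.
    saved_config = transformers.AutoConfig.from_pretrained(model_name_or_path)
    id2label = {int(k): v for k, v in saved_config.id2label.items()}
    labels = [id2label[i] for i in range(len(id2label))]

    model_config = transformers.AutoConfig.from_pretrained(
        model_name_or_path,
        num_labels=len(labels),
        output_hidden_states=True,
        output_attentions=True,
    )

The same labels list could then back the vocab=self.LABELS field used in output_spec below.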

## Some omitted snippets here

    def input_spec(self) -> lit_types.Spec:
        return {
            "token": lit_types.Tokens(),
            "label": lit_types.SequenceTags(align="token")
        }

    def output_spec(self) -> lit_types.Spec:
        return {
            "tokens": lit_types.Tokens(),
            "probas": lit_types.MulticlassPreds(parent="label", vocab=self.LABELS),
            "cls_emb": lit_types.Embeddings(),
        }

This actually seems to be expected behaviour. In the documentation of the GPT models, the HuggingFace team writes:

This will issue a warning about some of the pretrained weights not being used and some weights being randomly initialized. That's because we are throwing away the pretraining head of the BERT model to replace it with a classification head which is randomly initialized.

So it seems not to be a problem for the fine-tuning. In my use case described above, it worked despite the warning as well.
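
To double-check that only the unused pooler weights are affected (a sketch; the path is a placeholder), transformers can report exactly which weights were skipped when loading:

    import transformers

    # Placeholder path to the fine-tuned checkpoint directory.
    model_dir = "path/to/checkpoint_dir"

    model, loading_info = transformers.AutoModelForTokenClassification.from_pretrained(
        model_dir, output_loading_info=True)
    print(loading_info["missing_keys"])     # per the log above: bert.pooler.dense.weight / .bias
    print(loading_info["unexpected_keys"])  # per the log above: bert.embeddings.position_ids

If the fine-tuned classification head had failed to load as well, its classifier.* weights would also show up in missing_keys, which they do not in the log above.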
