GPT2 on Hugging Face (pytorch transformers): RuntimeError: grad can be implicitly created only for scalar outputs

I am trying to fine-tune GPT-2 with a custom dataset of mine. I created a basic example following the documentation from Hugging Face Transformers, and I receive the error mentioned in the title. I know what it means (backward is being called on a non-scalar tensor), but since I almost exclusively use API calls, I have no idea how to fix this issue. Any suggestions?

from pathlib import Path
from absl import flags, app
import IPython
import torch
from transformers import GPT2LMHeadModel, Trainer,  TrainingArguments
from data_reader import GetDataAsPython

# This is my custom data, but I get the same error for the basic case below
# data = GetDataAsPython('data.json')
# data = [data_point.GetText2Text() for data_point in data]
# print("Number of data samples is", len(data))

data = ["this is a trial text", "this is another trial text"]

train_texts = data

from transformers import GPT2Tokenizer
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')

special_tokens_dict = {'pad_token': '<PAD>'}
num_added_toks = tokenizer.add_special_tokens(special_tokens_dict)
train_encodings = tokenizer(train_texts, truncation=True, padding=True)


class BugFixDataset(torch.utils.data.Dataset):
    def __init__(self, encodings):
        self.encodings = encodings
    
    def __getitem__(self, index):
        item = {key: torch.tensor(val[index]) for key, val in self.encodings.items()}
        return item

    def __len__(self):
        return len(self.encodings['input_ids'])

train_dataset = BugFixDataset(train_encodings)

training_args = TrainingArguments(
    output_dir='./results',          
    num_train_epochs=3,              
    per_device_train_batch_size=1,  
    per_device_eval_batch_size=1,   
    warmup_steps=500,                
    weight_decay=0.01,               
    logging_dir='./logs',
    logging_steps=10,
)

model = GPT2LMHeadModel.from_pretrained('gpt2', return_dict=True)
model.resize_token_embeddings(len(tokenizer))

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
)

trainer.train()

I finally figured it out. The problem was that the data samples did not contain a target output. Even though GPT is self-supervised, this has to be told to the model explicitly: when labels are passed, GPT2LMHeadModel returns a scalar loss that the Trainer can call backward on.

You have to add the line:

item['labels'] = torch.tensor(self.encodings['input_ids'][index])

to the __getitem__ function of the Dataset class, and then it runs okay!
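For reference, here is a minimal sketch of the corrected Dataset class from the question, assuming the line above is the only change. For causal language modeling the labels simply mirror input_ids; GPT2LMHeadModel shifts them internally and returns a scalar loss that the Trainer backpropagates.

class BugFixDataset(torch.utils.data.Dataset):
    def __init__(self, encodings):
        self.encodings = encodings

    def __getitem__(self, index):
        item = {key: torch.tensor(val[index]) for key, val in self.encodings.items()}
        # Targets are the input ids themselves; with labels present, the model
        # output includes a scalar loss, which is what Trainer calls backward on.
        item['labels'] = torch.tensor(self.encodings['input_ids'][index])
        return item

    def __len__(self):
        return len(self.encodings['input_ids'])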
