GPT2 on Hugging Face (pytorch transformers) RuntimeError: grad can be implicitly created only for scalar outputs
I am trying to fine-tune GPT-2 with a custom dataset of mine. I created a basic example from the Hugging Face transformers documentation, and I get the error mentioned in the title. I know what it means (basically, backward() is being called on a non-scalar tensor), but since I am almost exclusively using API calls, I have no idea how to fix it. Any suggestions?
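For reference, the error itself comes straight from PyTorch autograd and can be reproduced without transformers at all: calling backward() on a tensor with more than one element fails, because autograd can only implicitly seed the gradient for a scalar. A minimal repro:

import torch

t = torch.tensor([1.0, 2.0], requires_grad=True)
out = t * 2     # non-scalar result
out.backward()  # RuntimeError: grad can be implicitly created only for scalar outputs

Trainer calls backward() on what it takes to be the loss, so somewhere my setup must be handing it a non-scalar tensor. Here is my code: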
from pathlib import Path
from absl import flags, app
import IPython
import torch
from transformers import GPT2LMHeadModel, Trainer, TrainingArguments
from data_reader import GetDataAsPython
# this is my custom data, but I get the same error for the basic case below
# data = GetDataAsPython('data.json')
# data = [data_point.GetText2Text() for data_point in data]
# print("Number of data samples is", len(data))
data = ["this is a trial text", "this is another trial text"]
train_texts = data
from transformers import GPT2Tokenizer
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
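# GPT-2 ships without a padding token, so one has to be added before
# the tokenizer can pad a batch of texts to a common length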
special_tokens_dict = {'pad_token': '<PAD>'}
num_added_toks = tokenizer.add_special_tokens(special_tokens_dict)
train_encodings = tokenizer(train_texts, truncation=True, padding=True)
class BugFixDataset(torch.utils.data.Dataset):
    def __init__(self, encodings):
        self.encodings = encodings

    def __getitem__(self, index):
        item = {key: torch.tensor(val[index]) for key, val in self.encodings.items()}
        return item

    def __len__(self):
        return len(self.encodings['input_ids'])
train_dataset = BugFixDataset(train_encodings)
training_args = TrainingArguments(
    output_dir='./results',
    num_train_epochs=3,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir='./logs',
    logging_steps=10,
)
model = GPT2LMHeadModel.from_pretrained('gpt2', return_dict=True)
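# the token embedding matrix must be resized to cover the newly added <PAD> token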
model.resize_token_embeddings(len(tokenizer))
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
)
trainer.train()
I finally figured it out. The problem was that the data samples did not contain a target output. Even though GPT-2 is self-supervised, this has to be explicitly told to the model. You have to add the line:

item['labels'] = torch.tensor(self.encodings['input_ids'][index])

to the __getitem__ function of the Dataset class, and then it runs okay!
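For clarity, the full patched method looks like this:

def __getitem__(self, index):
    item = {key: torch.tensor(val[index]) for key, val in self.encodings.items()}
    # duplicating input_ids as labels is the standard setup for causal LM:
    # the model shifts the labels internally by one position
    item['labels'] = torch.tensor(self.encodings['input_ids'][index])
    return item

With labels present in the batch, GPT2LMHeadModel computes the cross-entropy loss itself and returns it as a scalar, which is exactly what Trainer needs to call backward() on.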