TypeError: forward() got an unexpected keyword argument 'input_ids'

I have a pre-trained BERT model with my own head on top.

I am using a fine-tuned RoBERTa model, unbiased-toxic-roberta, which was trained on the Jigsaw data:

https://huggingface.co/unitary/unbiased-toxic-roberta

Creating the data using a PyTorch dataset:

import numpy as np
import torch
import transformers as tr

tokenizer = tr.RobertaTokenizer.from_pretrained("/home/pc/unbiased_toxic_roberta")
train_encodings = tokenizer(train_texts, truncation=True, padding=True, max_length=512, return_tensors="pt")


class SEDataset(torch.utils.data.Dataset):
    def __init__(self, encodings, labels):
        self.encodings = encodings
        self.labels = labels

    def __getitem__(self, idx):
        # as_tensor accepts both lists and values that are already tensors (return_tensors="pt")
        item = {key: torch.as_tensor(val[idx]) for key, val in self.encodings.items()}
        item['labels'] = torch.tensor(self.labels[idx])
        return item

    def __len__(self):
        return len(self.labels)

train_data = SEDataset(train_encodings, train_labels)



def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    acc = np.sum(predictions == labels) / predictions.shape[0]
    return {"accuracy": acc}
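
To see which keys each training example carries, the dataset can be exercised on a couple of toy rows. The sketch below repeats SEDataset and feeds it hand-written encodings; toy_encodings and toy_labels are hypothetical stand-ins for the tokenizer output, not real data. The key names matter because Trainer passes the collated batch to the model as keyword arguments.

```python
import torch


class SEDataset(torch.utils.data.Dataset):
    def __init__(self, encodings, labels):
        self.encodings = encodings
        self.labels = labels

    def __getitem__(self, idx):
        item = {key: torch.as_tensor(val[idx]) for key, val in self.encodings.items()}
        item["labels"] = torch.as_tensor(self.labels[idx])
        return item

    def __len__(self):
        return len(self.labels)


# Hypothetical toy values standing in for the real tokenizer output.
toy_encodings = {
    "input_ids": [[0, 713, 2], [0, 9064, 2]],
    "attention_mask": [[1, 1, 1], [1, 1, 1]],
}
toy_labels = [0, 1]

toy_data = SEDataset(toy_encodings, toy_labels)
item = toy_data[0]
print(sorted(item.keys()))  # the names forward() will be called with
```

Every name printed here needs a matching parameter in the model's forward().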

The model adds a few layers on top of the pre-trained model:

import torch.nn as nn
from transformers import AutoModel

class PosModel(nn.Module):
    def __init__(self):
        super(PosModel, self).__init__()
        
        self.base_model = tr.RobertaForSequenceClassification.from_pretrained('/home/pc/unbiased_toxic_roberta')
        self.dropout = nn.Dropout(0.5)
        self.linear = nn.Linear(768, 2)  # 768 is the encoder's hidden size; 2 is the number of labels
        
    def forward(self, input_ids, attn_mask):
        outputs = self.base_model(input_ids, attention_mask=attn_mask)
        # Write your new head here
        outputs = self.dropout(outputs[0])
        outputs = self.linear(outputs)
        
        return outputs

model = PosModel()

print(model)

Training step:

Using TrainingArguments to configure the training run:

training_args = tr.TrainingArguments(
#     report_to = 'wandb',
    output_dir='/home/pc/1_Proj_hate_speech/results_roberta',          # output directory
    overwrite_output_dir = True,
    num_train_epochs=20,              # total number of training epochs
    per_device_train_batch_size=16,  # batch size per device during training
    per_device_eval_batch_size=32,   # batch size for evaluation
    learning_rate=2e-5,
    warmup_steps=1000,                # number of warmup steps for learning rate scheduler
    weight_decay=0.01,               # strength of weight decay
    logging_dir='./logs3',            # directory for storing logs
    logging_steps=1000,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
)


trainer = tr.Trainer(
    model=model,                         # the instantiated 🤗 Transformers model to be trained
    args=training_args,                  # training arguments, defined above
    train_dataset=train_data,         # training dataset
    eval_dataset=val_data,             # evaluation dataset
    compute_metrics=compute_metrics
)

Running the model:

trainer.train()

Error:

TypeError: Caught TypeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/home/pc/.local/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
    output = module(*input, **kwargs)
  File "/home/pc/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
TypeError: forward() got an unexpected keyword argument 'input_ids'

It seems your tokenizer adds "input_ids" when it encodes the data, but the model's forward() does not expect a tensor under that name in its input. You could try removing it from train_encodings and running again.
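
An alternative to dropping keys from the encodings is to make forward() accept the names the batch actually carries. The following is a minimal sketch, not the asker's final code. It assumes Trainer calls the model as model(**inputs) with the keys input_ids, attention_mask and labels, and that it reads the loss from a "loss" entry when the model returns a dict. TinyEncoder is a hypothetical stand-in so the example runs without downloading a checkpoint. With the real model you would probably load the bare encoder (for example tr.RobertaModel) instead, because RobertaForSequenceClassification returns logits rather than 768-wide hidden states.

```python
import torch
import torch.nn as nn


class TinyEncoder(nn.Module):
    """Hypothetical stand-in for a pre-trained encoder."""

    def __init__(self, vocab_size=100, hidden_size=768):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)

    def forward(self, input_ids, attention_mask=None):
        # Returns hidden states of shape (batch, seq_len, hidden_size).
        return self.embed(input_ids)


class PosModel(nn.Module):
    def __init__(self, encoder, hidden_size=768, num_labels=2):
        super().__init__()
        self.base_model = encoder
        self.dropout = nn.Dropout(0.5)
        self.linear = nn.Linear(hidden_size, num_labels)

    # Parameter names match the batch keys: input_ids, attention_mask, labels.
    def forward(self, input_ids, attention_mask=None, labels=None):
        hidden = self.base_model(input_ids, attention_mask=attention_mask)
        pooled = hidden[:, 0]  # representation of the first token
        logits = self.linear(self.dropout(pooled))
        if labels is None:
            return {"logits": logits}
        loss = nn.functional.cross_entropy(logits, labels)
        return {"loss": loss, "logits": logits}


batch = {
    "input_ids": torch.tensor([[0, 5, 2], [0, 7, 2]]),
    "attention_mask": torch.ones(2, 3, dtype=torch.long),
    "labels": torch.tensor([0, 1]),
}
model = PosModel(TinyEncoder())
outputs = model(**batch)  # keyword-style call, as in the failing traceback
print(outputs["logits"].shape)
```

Returning the loss alongside the logits lets the same forward() serve both training and evaluation.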

I had the same issue: I had defined a function named 'model' and was calling that function instead of the model object. I think you may be doing the same thing at the end. Please check that.
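
The name clash described here can be reproduced in plain Python. This is only an illustrative sketch: Net and the second model definition are hypothetical and are not taken from the question's code.

```python
class Net:
    """Hypothetical stand-in for a model whose call accepts input_ids."""

    def __call__(self, input_ids, attention_mask=None):
        return len(input_ids)


model = Net()
first_result = model(input_ids=[0, 5, 2])  # works: Net accepts input_ids


def model(batch):  # accidentally reuses the name and replaces the Net instance
    return batch


message = ""
try:
    model(input_ids=[0, 5, 2])  # now calls the function, which has no input_ids parameter
except TypeError as err:
    message = str(err)
    print(message)
```

In this case the message names the function that was actually called, so checking which callable the traceback mentions helps tell a shadowed name apart from a forward() signature mismatch.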
