How to get the accuracy per epoch or step for the huggingface.transformers Trainer?
I am using the Hugging Face Trainer with a BertForSequenceClassification.from_pretrained("bert-base-uncased") model.
Simplified, it looks like this:
from transformers import BertForSequenceClassification, BertTokenizer, Trainer, TrainingArguments

model = BertForSequenceClassification.from_pretrained("bert-base-uncased")
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
training_args = TrainingArguments(
    output_dir="bert_results",
    num_train_epochs=3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=32,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir="bert_results/logs",
    logging_steps=10
)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
    compute_metrics=compute_metrics
)
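The logs contain the loss for every 10 steps, but I can't seem to find the training accuracy. Does anyone know how to get the accuracy, for example by changing the verbosity of the logger? I can't seem to find anything about it online.
Thanks, CptBaas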
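You can use the evaluation_strategy training argument to set the interval at which the Trainer evaluates. It currently accepts 3 values:
"no": No evaluation is done during training.
"steps": Evaluation is done (and logged) every eval_steps.
"epoch": Evaluation is done at the end of each epoch.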
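To make the three settings concrete, here is a small dependency-free sketch of when an evaluation would fire under each one. The function name evaluation_steps and its parameters are hypothetical, invented for illustration; this is a toy model of the schedule, not the Trainer's actual implementation.

```python
def evaluation_steps(strategy, total_steps, steps_per_epoch, eval_steps=None):
    """Return the global steps after which an evaluation would run (toy model)."""
    if strategy == "no":
        return []
    if strategy == "steps":
        if not eval_steps or eval_steps < 1:
            raise ValueError("eval_steps must be a positive integer")
        return list(range(eval_steps, total_steps + 1, eval_steps))
    if strategy == "epoch":
        return list(range(steps_per_epoch, total_steps + 1, steps_per_epoch))
    raise ValueError(f"unknown strategy: {strategy!r}")


# Assumed setup: 3 epochs of 100 optimizer steps each
print(evaluation_steps("no", 300, 100))                    # []
print(evaluation_steps("steps", 300, 100, eval_steps=75))  # [75, 150, 225, 300]
print(evaluation_steps("epoch", 300, 100))                 # [100, 200, 300]
```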
You can load the accuracy metric and make it work with your compute_metrics function. For example, it would look like this:
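import numpy as np
# Newer datasets releases deprecate load_metric in favour of evaluate.load("accuracy") from the evaluate package
from datasets import load_metric

metric = load_metric('accuracy')

def compute_metrics(eval_pred):
    # eval_pred unpacks into the model outputs (logits) and the reference labels
    predictions, labels = eval_pred
    # pick the highest-scoring class for each example
    predictions = np.argmax(predictions, axis=1)
    return metric.compute(predictions=predictions, references=labels)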
This compute_metrics function example is based on Hugging Face's text classification tutorial. It worked in my tests.
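For readers who want to see what the argmax-then-compare step in compute_metrics amounts to, here is a dependency-free sketch on made-up numbers. The helper names argmax and accuracy_from_logits and the toy logits are assumptions for illustration only; the real function relies on NumPy and the loaded metric.

```python
def argmax(row):
    """Index of the largest value in a list of scores (first one wins ties)."""
    return max(range(len(row)), key=row.__getitem__)


def accuracy_from_logits(logits, labels):
    """Fraction of rows whose highest-scoring class equals the label."""
    predictions = [argmax(row) for row in logits]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return {"accuracy": correct / len(labels)}


# Made-up logits for 4 examples and 2 classes
logits = [[2.0, -1.0], [0.1, 0.3], [-0.5, 0.2], [1.5, 1.0]]
labels = [0, 1, 0, 0]
print(accuracy_from_logits(logits, labels))  # {'accuracy': 0.75}
```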
I had the same problem and solved it by adding a custom callback that calls the evaluate() method with train_dataset at the end of every epoch.
from copy import deepcopy
from transformers import Trainer, TrainerCallback

class CustomCallback(TrainerCallback):
    def __init__(self, trainer) -> None:
        super().__init__()
        self._trainer = trainer

    def on_epoch_end(self, args, state, control, **kwargs):
        if control.should_evaluate:
            # Keep a copy of the control flags, since evaluate() may change them
            control_copy = deepcopy(control)
            self._trainer.evaluate(eval_dataset=self._trainer.train_dataset, metric_key_prefix="train")
            return control_copy

trainer = Trainer(
    model=model,                      # the instantiated Transformers model to be trained
    args=training_args,               # training arguments, defined above
    train_dataset=train_dataset,      # training dataset
    eval_dataset=valid_dataset,       # evaluation dataset
    compute_metrics=compute_metrics,  # the callback that computes metrics of interest
    tokenizer=tokenizer
)
trainer.add_callback(CustomCallback(trainer))
train = trainer.train()
This gives training metrics like the following:
{'train_loss': 0.7159061431884766, 'train_accuracy': 0.4, 'train_f1': 0.5714285714285715, 'train_runtime': 6.2973, 'train_samples_per_second': 2.382, 'train_steps_per_second': 0.159, 'epoch': 1.0}
{'eval_loss': 0.8529007434844971, 'eval_accuracy': 0.0, 'eval_f1': 0.0, 'eval_runtime': 2.0739, 'eval_samples_per_second': 0.964, 'eval_steps_per_second': 0.482, 'epoch': 1.0}
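Logged dictionaries like the two above are also kept as a list in trainer.state.log_history, so the per-epoch accuracies can be collected after training. A minimal sketch, where a hard-coded list (values trimmed from the output above) stands in for the real log history and metric_by_epoch is a hypothetical helper name:

```python
def metric_by_epoch(log_history, key):
    """Map epoch -> value for every log entry that contains `key`."""
    return {entry["epoch"]: entry[key] for entry in log_history if key in entry}


# Stand-in for trainer.state.log_history (assumed shape: a list of dicts)
log_history = [
    {"train_loss": 0.7159, "train_accuracy": 0.4, "train_f1": 0.5714, "epoch": 1.0},
    {"eval_loss": 0.8529, "eval_accuracy": 0.0, "eval_f1": 0.0, "epoch": 1.0},
]

print(metric_by_epoch(log_history, "train_accuracy"))  # {1.0: 0.4}
print(metric_by_epoch(log_history, "eval_accuracy"))   # {1.0: 0.0}
```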
Another way to get the training accuracy is to extend the base Trainer class and override the compute_loss() method, like this:
import torch
from sklearn.metrics import accuracy_score
from transformers import Trainer

class CustomTrainer(Trainer):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    # num_items_in_batch is passed by newer transformers releases; it is unused here
    def compute_loss(self, model, inputs, return_outputs=False, num_items_in_batch=None):
        """
        How the loss is computed by Trainer. By default, all models return the loss in the first element.
        Subclass and override for custom behavior.
        """
        if self.label_smoother is not None and "labels" in inputs:
            labels = inputs.pop("labels")
        else:
            labels = None
        outputs = model(**inputs)

        # code for calculating accuracy
        if "labels" in inputs:
            # inputs is a dict-like batch, so index it by key
            targets = inputs["labels"].reshape(-1)
            preds = outputs.logits.detach().argmax(dim=-1)
            # scikit-learn works on CPU arrays, so move the tensors off the GPU first
            acc1 = accuracy_score(targets.cpu().numpy(), preds.cpu().numpy())
            self.log({"accuracy_score": acc1})
            acc = (preds == targets).type(torch.float).mean().item()
            self.log({"train_accuracy": acc})
        # end code for calculating accuracy

        # Save past state if it exists
        # TODO: this needs to be fixed and made cleaner later.
        if self.args.past_index >= 0:
            self._past = outputs[self.args.past_index]

        if labels is not None:
            loss = self.label_smoother(outputs, labels)
        else:
            # We don't use .loss here since the model may return tuples instead of ModelOutput.
            loss = outputs["loss"] if isinstance(outputs, dict) else outputs[0]

        return (loss, outputs) if return_outputs else loss
Then use CustomTrainer instead of Trainer, like this:
trainer = CustomTrainer(
    model=model,                      # the instantiated Transformers model to be trained
    args=training_args,               # training arguments, defined above
    train_dataset=train_dataset,      # training dataset
    eval_dataset=valid_dataset,       # evaluation dataset
    compute_metrics=compute_metrics,  # the callback that computes metrics of interest
    tokenizer=tokenizer
)
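You need a function that returns the required metrics. Here is what I wrote; it returns a dictionary of several metrics (more is better, right?):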
import numpy as np
from datasets import load_metric

def compute_metrics(eval_pred):
    metrics = ["accuracy", "recall", "precision", "f1"]  # List of metrics to return
    metric = {}
    for met in metrics:
        metric[met] = load_metric(met)
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    metric_res = {}
    for met in metrics:
        metric_res[met] = metric[met].compute(predictions=predictions, references=labels)[met]
    return metric_res
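As a rough illustration of what those four numbers mean, here is a dependency-free sketch for the binary case, treating label 1 as the positive class. The function binary_metrics and the toy predictions are assumptions for illustration; they are not how the loaded metrics are implemented, and multi-class data would need an averaging choice that this sketch does not cover.

```python
def binary_metrics(predictions, labels):
    """Accuracy, precision, recall and F1 for 0/1 labels, positive class = 1."""
    pairs = list(zip(predictions, labels))
    tp = sum(p == 1 and y == 1 for p, y in pairs)  # true positives
    fp = sum(p == 1 and y == 0 for p, y in pairs)  # false positives
    fn = sum(p == 0 and y == 1 for p, y in pairs)  # false negatives
    correct = sum(p == y for p, y in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": correct / len(pairs),
        "recall": recall,
        "precision": precision,
        "f1": f1,
    }


# Made-up predictions and labels: 2 true positives, 2 false positives, 1 false negative
predictions = [1, 1, 1, 0, 1]
labels = [1, 0, 1, 1, 0]
print(binary_metrics(predictions, labels))
```

With these toy values the accuracy is 2/5, the precision 2/4, the recall 2/3 and the F1 score 4/7.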
Also, if the metrics need to be computed per epoch, that has to be defined in the training arguments:
training_args = TrainingArguments(
    ...,
    evaluation_strategy="epoch",  # To calculate metrics per epoch (named eval_strategy in newer transformers releases)
    logging_strategy="epoch",     # Extra: to log training data stats for loss
)
The last step is to add it to the Trainer:
trainer = Trainer(
    ...,
    compute_metrics=compute_metrics,
)
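Disclaimer: the technical posts on this site follow the CC BY-SA 4.0 license. If you need to reprint them, please credit this site's URL or the original address. For any questions, contact: yoyou2525@163.com.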