
How to test a masked language model after training it?

I have followed this tutorial for masked language modelling from Hugging Face using BERT, but I am unsure how to actually deploy the model.

Tutorial: https://github.com/huggingface/notebooks/blob/master/examples/language_modeling.ipynb

I have trained the model using my own dataset, which has worked fine, but I don't know how to actually use the model, as the notebook does not include an example on how to do this, sadly.

Example of what I want to do with my trained model

On the Hugging Face website, this is the code used in the example; hence, I want to do this exact thing but with my model:

>>> from transformers import pipeline
>>> unmasker = pipeline('fill-mask', model='bert-base-uncased')
>>> unmasker("Hello I'm a [MASK] model.")

[{'sequence': "[CLS] hello i'm a fashion model. [SEP]",
  'score': 0.1073106899857521,
  'token': 4827,
  'token_str': 'fashion'},
 {'sequence': "[CLS] hello i'm a role model. [SEP]",
  'score': 0.08774490654468536,
  'token': 2535,
  'token_str': 'role'},
 {'sequence': "[CLS] hello i'm a new model. [SEP]",
  'score': 0.05338378623127937,
  'token': 2047,
  'token_str': 'new'},
 {'sequence': "[CLS] hello i'm a super model. [SEP]",
  'score': 0.04667217284440994,
  'token': 3565,
  'token_str': 'super'},
 {'sequence': "[CLS] hello i'm a fine model. [SEP]",
  'score': 0.027095865458250046,
  'token': 2986,
  'token_str': 'fine'}]
Any help on how to do this would be great.

This depends a lot on your task. Your task seems to be masked language modelling, that is, predicting one or more masked words:

today I ate ___.

(pizza) or (pasta) could be equally correct, so you cannot use a metric such as accuracy. But (water) should be less "correct" than the other two. So what you normally do is check how "surprised" the language model is on an evaluation data set. This metric is called perplexity. Therefore, before and after you fine-tune a model on your specific dataset, you would calculate the perplexity, and you would expect it to be lower after fine-tuning. The model should be more used to your specific vocabulary, etc. And that is how you test your model.
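To make the idea concrete, perplexity is just the exponential of the average negative log-likelihood per token. A toy sketch with invented numbers (these probabilities are hypothetical, not from a real model):

```python
import math

# Hypothetical probabilities the model assigned to the true token at each
# position of a small evaluation text, before and after fine-tuning.
probs_before = [0.10, 0.05, 0.20, 0.08]
probs_after = [0.30, 0.25, 0.40, 0.20]

def perplexity(token_probs):
    # perplexity = exp of the average negative log-likelihood per token
    nll = [-math.log(p) for p in token_probs]
    return math.exp(sum(nll) / len(nll))

print(f"before: {perplexity(probs_before):.2f}")  # higher: model is more "surprised"
print(f"after:  {perplexity(probs_after):.2f}")   # lower: model fits the data better
```

The more probability mass the model puts on the tokens that actually occur, the lower the perplexity, which is why you expect it to drop after fine-tuning.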

As you can see, they calculate the perplexity in the tutorial you mentioned:

import math
eval_results = trainer.evaluate()
print(f"Perplexity: {math.exp(eval_results['eval_loss']):.2f}") 

To predict samples, you need to tokenize those samples and prepare the input for the model. The fill-mask pipeline can do this for you:

from transformers import pipeline

# if you trained your model on a GPU, move it back to the CPU first:
trainer.model.to('cpu')

unmasker = pipeline('fill-mask', model=trainer.model, tokenizer=tokenizer)
unmasker("today I ate <mask>")

which results in the following output:

[{'score': 0.23618391156196594,
  'sequence': 'today I ate it.',
  'token': 24,
  'token_str': ' it'},
 {'score': 0.03940323367714882,
  'sequence': 'today I ate breakfast.',
  'token': 7080,
  'token_str': ' breakfast'},
 {'score': 0.033759087324142456,
  'sequence': 'today I ate lunch.',
  'token': 4592,
  'token_str': ' lunch'},
 {'score': 0.025962186977267265,
  'sequence': 'today I ate pizza.',
  'token': 9366,
  'token_str': ' pizza'},
 {'score': 0.01913984678685665,
  'sequence': 'today I ate them.',
  'token': 106,
  'token_str': ' them'}]
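Under the hood, the pipeline takes the model's logits at the mask position, applies a softmax over the vocabulary, and returns the highest-scoring tokens. A minimal sketch of that post-processing, where the vocabulary and logits are made up for illustration (in practice they come from `model(**inputs).logits` at the `<mask>` position):

```python
import math

vocab = ['it', 'breakfast', 'lunch', 'pizza', 'them']  # toy vocabulary
mask_logits = [2.1, 0.5, 0.3, 0.1, -0.2]               # fake scores for the mask slot

# softmax turns the raw logits into a probability distribution
exps = [math.exp(x) for x in mask_logits]
scores = [e / sum(exps) for e in exps]

# rank candidate tokens by score, like the pipeline's output list
for score, token in sorted(zip(scores, vocab), reverse=True)[:3]:
    print(f"{token}: {score:.3f}")
```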

Closely related to perplexity, and a bit more specific to masked language model evaluation: https://aclanthology.org/2020.acl-main.240.pdf
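Finally, since the question mentions deployment: rather than using the model from the live training session, you can save it to disk and reload it later. This sketch assumes the `trainer` and `tokenizer` objects from the tutorial; the output directory name `./my-finetuned-model` is arbitrary:

```python
from transformers import pipeline

# persist the fine-tuned weights and the tokenizer to a directory
trainer.save_model('./my-finetuned-model')
tokenizer.save_pretrained('./my-finetuned-model')

# later, in another process or on another machine,
# reload everything from that directory:
unmasker = pipeline('fill-mask', model='./my-finetuned-model')
unmasker("today I ate <mask>")
```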
