How to use a fine-tuned BERT model for sentence encoding?
I fine-tuned the BERT base model on my own dataset following the script here:
https://github.com/cedrickchee/pytorch-pretrained-BERT/tree/master/examples/lm_finetuning
I saved the model as a .pt file, and I now want to use it for a sentence similarity task. Unfortunately, it is not clear to me how to load the fine-tuned model. I tried the following:
model = BertModel.from_pretrained('trained_model.pt')
model.eval()
This doesn't work. It says:
ReadError: not a gzip file
So apparently, loading a .pt file with the from_pretrained method is not possible. Can anyone help me out here? Thanks a lot! :)
Edit: I saved the model in an S3 bucket as follows:
import io
import torch

# Convert model to buffer
buffer = io.BytesIO()
torch.save(model, buffer)

# Save in s3 bucket
output_model_file = output_folder + "trained_model.pt"
s3_.put_object(Bucket="power-plant-embeddings", Key=output_model_file, Body=buffer.getvalue())
To load a model with BertModel.from_pretrained() you need to have saved it using save_pretrained() (link).
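To make the mismatch concrete: torch.save pickles the whole Python object, so its direct counterpart is torch.load, while from_pretrained() expects a directory written by save_pretrained() (config.json plus a weights file). A minimal sketch with a small stand-in module (nn.Linear here is my own substitute for the fine-tuned BertModel; the weights_only flag is available in recent PyTorch versions):

```python
import io

import torch
import torch.nn as nn

# Stand-in for the fine-tuned BERT model.
model = nn.Linear(4, 2)

# Same pattern as in the question: pickle the whole module into a buffer.
buffer = io.BytesIO()
torch.save(model, buffer)

# The matching loader is torch.load, not from_pretrained.
buffer.seek(0)
restored = torch.load(buffer, weights_only=False)
```

The restored object is the original module with identical weights; from_pretrained() never enters the picture, which is why pointing it at a .pt file fails.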
Any other storage method requires the corresponding loading method. I am not familiar with S3, but I assume you can use get_object (link) to retrieve the model, and then save it using the Hugging Face API. From then on, you should be able to use from_pretrained() normally.
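The suggested round trip could be sketched as follows. The function names and the weights_only flag are my own assumptions; the bucket and key come from the question, and boto3 credentials are assumed to be configured:

```python
import io

import torch


def load_model_from_s3(s3_client, bucket, key):
    """Fetch the torch.save'd object from S3 and unpickle it with torch.load."""
    body = s3_client.get_object(Bucket=bucket, Key=key)["Body"].read()
    return torch.load(io.BytesIO(body), weights_only=False)


def convert_to_pretrained_dir(model, out_dir):
    """Re-save with the Hugging Face API so from_pretrained(out_dir) works."""
    model.save_pretrained(out_dir)


# Usage (assumed names):
# import boto3
# s3 = boto3.client("s3")
# model = load_model_from_s3(s3, "power-plant-embeddings", "trained_model.pt")
# convert_to_pretrained_dir(model, "trained_model_dir")
# model = BertModel.from_pretrained("trained_model_dir")
```

After the one-time conversion, the directory can be loaded with from_pretrained() like any other local checkpoint.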