How to get access to the tokenizer after loading a saved custom BERT model using Keras and TF2?
I am working on an intent classification problem and need your help.
I fine-tuned one of the BERT models for text classification, then trained and evaluated it on a small dataset for detecting five intents. I used the code from "Intent Recognition with BERT using Keras and TensorFlow 2", and it works fine! I have saved the model so that I can use it later without retraining it.
# Save the entire model as a SavedModel.
!mkdir -p saved_model
model.save('saved_model/intentclassifiermodel')
Then I zipped it and downloaded it to use it separately:
!zip -r saved_model.zip saved_model/
Now I am trying to use this model to predict intents. For that, I created another Google Colab notebook and loaded the model:
from google.colab import drive
drive.mount('/content/gdrive')
!pip install tensorflow==2.2
!pip install bert-for-tf2 >> /dev/null
import bert
from tensorflow import keras
model = keras.models.load_model('/content/gdrive/MyDrive/NLPMODELS/saved_model/intentclassifiermodel')
model.summary()
The model loads successfully, and now I want to predict. For that, I am using the following code snippet (the same code as in the base tutorial):
sentences = [
    "are you a bot?",
    "how to create a bot"
]
pred_tokens = map(tokenizer.tokenize, sentences)
pred_tokens = map(lambda tok: ["[CLS]"] + tok + ["[SEP]"], pred_tokens)
pred_token_ids = list(map(tokenizer.convert_tokens_to_ids, pred_tokens))
pred_token_ids = map(lambda tids: tids + [0] * (data.max_seq_len - len(tids)), pred_token_ids)
pred_token_ids = np.array(list(pred_token_ids))
predictions = model.predict(pred_token_ids).argmax(axis=-1)
for text, label in zip(sentences, predictions):
    print("text:", text, "\nintent:", classes[label])
    print()
**However, this code fails because I am not sure how to access the tokenizer here.**
Can you please help me get the tokenizer?
Thanks and Regards, Rohit Dhamija
Thank you, @AloneTogether, for pointing me to SO.
So, besides saving the model assets folder, I also saved the tokenizer.
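The post does not say how the tokenizer was saved. One possible way, sketched here under assumptions: the bert-for-tf2 `FullTokenizer` is built from a `vocab.txt` file, so copying that file next to the SavedModel lets a new notebook rebuild the tokenizer. The helper names `save_vocab` and `load_tokenizer` and the directory layout are hypothetical; the import path is the one used in the tutorial referenced in the question.

```python
import shutil
from pathlib import Path


def save_vocab(vocab_file, export_dir):
    """Copy the BERT vocab file next to the saved model (hypothetical helper)."""
    export_dir = Path(export_dir)
    export_dir.mkdir(parents=True, exist_ok=True)
    target = export_dir / "vocab.txt"
    shutil.copyfile(vocab_file, target)
    return target


def load_tokenizer(export_dir):
    """Rebuild the tokenizer from the copied vocab file (hypothetical helper).

    Assumes bert-for-tf2 is installed. The import is deferred so that
    save_vocab can be used without the package being present.
    """
    from bert.tokenization.bert_tokenization import FullTokenizer

    return FullTokenizer(vocab_file=str(Path(export_dir) / "vocab.txt"))
```

In the training notebook this would be called once with the path to the checkpoint's `vocab.txt`; in the prediction notebook, `tokenizer = load_tokenizer(...)` would then supply the `tokenizer` name that the snippet in the question is missing.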
To make the code work, I needed two additional things: data.max_seq_len and the class values.
For now, I extracted them while saving the model and used them in my program.
Thanks!
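A minimal sketch of how those two values could be carried over, assuming they are written to a small JSON file in the training notebook and read back in the prediction notebook. The file name `metadata.json` and the helpers `save_metadata` and `load_metadata` are hypothetical and not part of the original post.

```python
import json
from pathlib import Path


def save_metadata(export_dir, max_seq_len, classes):
    """Write the values needed at prediction time to a JSON file."""
    export_dir = Path(export_dir)
    export_dir.mkdir(parents=True, exist_ok=True)
    path = export_dir / "metadata.json"
    payload = {"max_seq_len": int(max_seq_len), "classes": list(classes)}
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return path


def load_metadata(export_dir):
    """Read back max_seq_len and the class labels."""
    path = Path(export_dir) / "metadata.json"
    payload = json.loads(path.read_text(encoding="utf-8"))
    return payload["max_seq_len"], payload["classes"]
```

In the training notebook, something like `save_metadata(some_dir, data.max_seq_len, classes)` would run right after `model.save(...)`; in the prediction notebook, `max_seq_len, classes = load_metadata(some_dir)` would restore both values, with `max_seq_len` taking the place of `data.max_seq_len` in the padding step.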