
Could I use BERT to cluster phrases with a pre-trained model?

I tried using Gensim with the GoogleNews pre-trained model to cluster phrases like the following, but it failed:

  • knitting
  • knit loom
  • loom knitting
  • weaving loom
  • rainbow loom
  • home decoration accessories
  • loom knit/knitting loom
  • …

I was advised that the GoogleNews model doesn't have these phrases in it. The phrases I have are a bit too specific for the GoogleNews model, and I don't have a corpus to train a new model; I have only the phrases. So now I am considering turning to BERT. But could BERT do what I expect, as described above? Thank you.

You can feed a phrase into the pre-trained BERT model and get an embedding, i.e. a fixed-dimension vector. So BERT can embed your phrases in a vector space. Then you can use a clustering algorithm (such as k-means) to cluster the phrases. The phrases do not need to occur in BERT's training corpus, as long as the words they consist of are in the vocabulary. You will have to experiment to see whether the embeddings give you relevant results.
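The approach above can be sketched roughly as follows, assuming the Hugging Face `transformers` and scikit-learn packages are available; the model name `bert-base-uncased`, the mean-pooling step, and the number of clusters are illustrative choices, not something prescribed by the answer:

```python
# Sketch: embed each phrase with a pre-trained BERT model,
# then cluster the phrase embeddings with k-means.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.cluster import KMeans

phrases = [
    "knitting", "knit loom", "loom knitting", "weaving loom",
    "rainbow loom", "home decoration accessories",
]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

with torch.no_grad():
    enc = tokenizer(phrases, padding=True, truncation=True, return_tensors="pt")
    out = model(**enc)
    # Mean-pool the token embeddings (ignoring padding) to get
    # one fixed-dimension vector per phrase.
    mask = enc["attention_mask"].unsqueeze(-1).float()
    embeddings = (out.last_hidden_state * mask).sum(1) / mask.sum(1)

# Cluster the phrase vectors; n_clusters=2 is an arbitrary example value.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(embeddings.numpy())
for phrase, label in zip(phrases, labels):
    print(label, phrase)
```

Whether semantically similar phrases (e.g. the loom-related ones) actually land in the same cluster depends on how well raw BERT embeddings separate your domain, which is exactly the experiment the answer suggests.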

