How to find similar words with FastText?

Question

I am playing around with FastText, https://pypi.python.org/pypi/fasttext, which is quite similar to Word2Vec. Since it seems to be a pretty new library with not too many built-in functions yet, I was wondering how to extract morphologically similar words.

For example: model.similar_word("dog") -> dogs. But there is no such built-in function.

If I type model["dog"]

I only get the vector, which could be used to compute cosine similarity: model.cosine_similarity(model["dog"], model["dogs"]).

Do I have to write some sort of loop and run cosine_similarity on all possible pairs in a text? That would take time...!!!

Answer 1

Use Gensim: load the .vec file trained by fastText with load_word2vec_format and use the most_similar() method to find similar words!

Answer 2

You can install the pyfasttext library to extract the most similar (nearest) words to a particular word.

from pyfasttext import FastText
model = FastText('model.bin')
model.nearest_neighbors('dog', k=2000)

Or you can get the latest development version of fasttext, which you can install from the GitHub repository:

import fasttext
model = fasttext.load_model('model.bin')
model.get_nearest_neighbors('dog', k=100)

Answer 3

You should use gensim to load model.vec and then get similar words:

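import gensim
# In current gensim, load_word2vec_format lives on KeyedVectors rather than Word2Vec
m = gensim.models.KeyedVectors.load_word2vec_format('model.vec')
m.most_similar(...)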

Answer 4

You can install and import the gensim library and then use it to extract the most similar words from the model that you downloaded from FastText.

Use this:

import gensim
model = gensim.models.KeyedVectors.load_word2vec_format('model.vec')
similar = model.most_similar(positive=['man'],topn=10)

The topn parameter controls how many results are returned; here you get the top 10 most similar words.

Answer 5

Use gensim:

from gensim.models import FastText

model = FastText.load(PATH_TO_MODEL)
model.wv.most_similar(positive=['dog'])

More info here

Answer 6

fastText has a method called get_nearest_neighbors for nearest neighbor queries. You need the model's .bin file to use it.
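To address the worry in the question about looping over all pairs: a nearest neighbor query conceptually compares one query vector against every other word vector and keeps the best matches, so it is one pass over the vocabulary rather than all pairs. Below is a rough pure-Python sketch of that idea, not the actual fastText or gensim implementation. The toy_vectors dictionary and the nearest_neighbors helper are made up for illustration; with a real model you would use the library calls shown in the answers above.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest_neighbors(word, vectors, k=3):
    """Return the k words most similar to `word` as (similarity, word) pairs."""
    query = vectors[word]
    scored = [
        (cosine_similarity(query, vec), other)
        for other, vec in vectors.items()
        if other != word  # skip the query word itself
    ]
    scored.sort(reverse=True)  # highest similarity first
    return scored[:k]

# Toy 3-dimensional vectors, invented for illustration only
toy_vectors = {
    "dog": [1.0, 0.9, 0.0],
    "dogs": [0.9, 1.0, 0.1],
    "cat": [0.5, 0.1, 0.9],
    "car": [-0.8, 0.2, 0.3],
}

neighbors = nearest_neighbors("dog", toy_vectors, k=2)
print(neighbors)
```

With these toy vectors, "dogs" ranks first and "cat" second. Real libraries do the same comparison with vectorized matrix operations, which is why the built-in calls are much faster than a hand-written loop.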

6 answers

Answer 1
15, accepted, 2017-02-15 18:36:44

Answer 2
9, 2019-09-18 14:54:14

Answer 3
5, 2017-02-14 09:50:19

Answer 4
4, 2018-07-08 01:29:26

Answer 5
2, 2021-01-03 02:39:36

Answer 6
0, 2022-04-07 10:16:01
