Topic Coherence with Dictionary from Glove (gensim)
I'm trying to evaluate a home-made topic model. For this, I have the list of topics (each represented by its keywords) and want to use gensim.models.coherencemodel.CoherenceModel, calling it on a corpus that is a list of strings (each one being a document). The CoherenceModel requires a Dictionary, but I don't understand what this corresponds to or how I can get it. I'm using the TfidfVectorizer from sklearn to vectorize the text, and glove embeddings from gensim to compute similarities within my model.
From the docs, a Dictionary can be created from a corpus, where the corpus is a list of lists of str (one list of tokens per document). This same tokenized corpus should be passed in the texts argument of the CoherenceModel.