
Topic Coherence with Dictionary from Glove (gensim)

I'm trying to evaluate a home-made topic model. To do this, I have the list of topics (each represented by its keywords) and want to use gensim.models.coherencemodel.CoherenceModel on a corpus, which is a list of strings (each one being a document). The CoherenceModel requires a Dictionary, but I don't understand what this corresponds to or how I can get it. I'm using the TfidfVectorizer from sklearn to vectorize the text, and glove embeddings from gensim to compute similarities within my model.

From the docs, a Dictionary can be created from a corpus, where the corpus is a list of lists of str (one list of tokens per document). This same tokenized corpus should be passed in the texts argument of the CoherenceModel.
