
Topic modelling: calculate the coherence score of an sklearn LDA model?

I have tried several ways to calculate the coherence score for an sklearn LDA model, but none of them worked. How can I calculate the coherence score for an sklearn LDA model?

When I use the standard gensim code to calculate the coherence score, I receive the following error: ValueError: This topic model is not currently supported. Supported topic models should implement the get_topics method.

Here is part of my code:

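# Imports assumed from the names used below (not shown in the original snippet)
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation as LDA

count_vectorizer = CountVectorizer(stop_words='english')

# Fit and transform the processed titles
count_data = count_vectorizer.fit_transform(training_data_preprocessed['Input'])
tf = count_data

number_topics = 5
number_words = 5

# Create and fit the LDA model
lda = LDA(n_components=number_topics)
lda.fit(tf)

# Print the topics found by the LDA model
# (print_topics is a helper defined elsewhere in the asker's code)
print("Topics found via LDA:")
print_topics(lda, count_vectorizer, number_words)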
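
For reference, any coherence measure needs the top words of each topic as plain strings, and for an sklearn model these come from the fitted topic-word matrix (`lda.components_`) together with the vectorizer's vocabulary. The sketch below is one possible way to extract them. The helper name `top_words_per_topic` and the toy data are illustrative assumptions, not part of the original post.

```python
def top_words_per_topic(topic_word_weights, vocabulary, n_words=5):
    """Return the n_words highest-weighted terms for each topic row."""
    topics = []
    for weights in topic_word_weights:
        # Term indices sorted by weight, largest first
        ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
        topics.append([vocabulary[i] for i in ranked[:n_words]])
    return topics


# Toy stand-in for lda.components_ (2 topics x 4 terms); hypothetical data
toy_weights = [[0.1, 2.0, 0.5, 1.5], [3.0, 0.2, 0.1, 0.9]]
toy_vocab = ["apple", "bank", "cat", "loan"]
toy_topics = top_words_per_topic(toy_weights, toy_vocab, n_words=2)
print(toy_topics)
```

With the objects from the question, the call would presumably be `top_words_per_topic(lda.components_, count_vectorizer.get_feature_names_out(), number_words)`. Older scikit-learn releases expose `get_feature_names()` instead.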

I think you can use the code below to compute a coherence score for an LDA model:

# import the coherence model from gensim
from gensim.models import CoherenceModel

# instantiate the topic coherence model
# (lda_model_15, bow_corpus and docs are assumed to be defined already; they are not shown in this answer)
cm = CoherenceModel(model=lda_model_15, corpus=bow_corpus, texts=docs, coherence='c_v')

# get the topic coherence score
coherence_lda = cm.get_coherence()
print(coherence_lda)
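
Note that passing an sklearn model as `model=` is what triggers the `get_topics` error in the question. gensim's `CoherenceModel` also accepts a `topics` argument (a list of top-word lists) together with `texts` and `dictionary`, which may be the simpler route for a non-gensim model. To show what such a score measures, here is a minimal standard-library sketch of a UMass-style coherence computed from top words and tokenized documents. The function name `umass_coherence`, the toy documents and the averaging scheme are assumptions made for illustration, so the numbers are not expected to match gensim's output exactly.

```python
import math
from itertools import combinations


def umass_coherence(topics, tokenized_docs):
    """Mean UMass-style coherence over topics (higher means more coherent)."""
    doc_sets = [set(doc) for doc in tokenized_docs]

    def doc_freq(*words):
        # Number of documents that contain all of the given words
        return sum(1 for doc in doc_sets if all(w in doc for w in words))

    topic_scores = []
    for words in topics:
        pair_scores = []
        # combinations preserves ranking: `earlier` is ranked above `later`
        for earlier, later in combinations(words, 2):
            d_earlier = doc_freq(earlier)
            if d_earlier == 0:
                continue  # word never occurs in the reference documents
            pair_scores.append(math.log((doc_freq(earlier, later) + 1) / d_earlier))
        if pair_scores:
            topic_scores.append(sum(pair_scores) / len(pair_scores))
    return sum(topic_scores) / len(topic_scores) if topic_scores else float("nan")


# Hypothetical toy corpus: two themes, "finance" and "pets"
toy_docs = [["bank", "loan", "rate"], ["bank", "loan"], ["cat", "dog"], ["cat", "dog", "bank"]]
good_score = umass_coherence([["bank", "loan"], ["cat", "dog"]], toy_docs)
bad_score = umass_coherence([["bank", "dog"], ["loan", "cat"]], toy_docs)
print(good_score, bad_score)
```

Topics whose top words co-occur in the same documents score higher than topics that mix unrelated words, which is the property a coherence score is meant to capture.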
