Topic modelling: Calculate the coherence score of an sklearn LDA model?
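I have tried several ways to calculate the coherence score of an sklearn LDA model, but none of them worked. How can I calculate the coherence score for an sklearn LDA model?
When I use the standard gensim code to calculate the coherence score, I get the following error: ValueError: This topic model is not currently supported. Supported topic models should implement the get_topics method.
Here is part of my code: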
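# Imports assumed from the names used below
from sklearn.decomposition import LatentDirichletAllocation as LDA
from sklearn.feature_extraction.text import CountVectorizer

count_vectorizer = CountVectorizer(stop_words='english')
# Fit and transform the processed titles
count_data = count_vectorizer.fit_transform(training_data_preprocessed['Input'])
tf = count_data
number_topics = 5
number_words = 5
# Create and fit the LDA model
lda = LDA(n_components=number_topics)
lda.fit(tf)
# Print the topics found by the LDA model
# (print_topics is a helper from the rest of the code, not shown here)
print("Topics found via LDA:")
print_topics(lda, count_vectorizer, number_words)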
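Note: print_topics is not defined in the part of the code shown. A minimal sketch of what such a helper might look like, assuming it prints the highest-weighted words of each topic from the fitted model's components_ matrix. The top_words function and the output format are hypothetical, not the asker's actual code:

```python
import numpy as np


def top_words(model, vectorizer, n_words):
    """Return the n_words highest-weighted words of each topic (hypothetical helper)."""
    # get_feature_names_out() is available in scikit-learn 1.0 and later.
    feature_names = vectorizer.get_feature_names_out()
    topics = []
    for row in model.components_:
        # Indices of the largest weights in this topic, in descending order.
        best = np.argsort(row)[::-1][:n_words]
        topics.append([str(feature_names[i]) for i in best])
    return topics


def print_topics(model, vectorizer, n_words):
    """Print one line per topic with its top words (assumed output format)."""
    for topic_idx, words in enumerate(top_words(model, vectorizer, n_words)):
        print(f"Topic #{topic_idx}: {' '.join(words)}")
```

The word lists returned by top_words are also the kind of input a topic coherence measure works on: a list of top words per topic.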
I think you can use the code below to compute the coherence score for an LDA model:
# import library from gensim
from gensim.models import CoherenceModel
# instantiate topic coherence model
cm = CoherenceModel(model=lda_model_15, corpus=bow_corpus, texts=docs, coherence='c_v')
# get topic coherence score
coherence_lda = cm.get_coherence()
print(coherence_lda)