Retrieve topic-word array & document-topic array from LDA gensim
Situation:
I have a numpy term-document matrix, for example: [[0,1,0,0....],....[......0,0,0,0]].
I have plugged the above matrix into gensim's LdaModel, and it works fine:
lda = LdaModel(corpus, num_topics=10)
Here corpus is the term-document matrix mentioned above. For research purposes I need two intermediate matrices (the topic-word array and the document-topic array):
1) per-document topic probability matrix (p_d_t)
2) per-topic word probability matrix (p_w_t)
Question:
How do I get those arrays from the gensim LdaModel() function? Kindly help me with getting those matrices.
1. Per-document topic probability matrix:
Apply a transformation to your corpus.
docTopicProbMat = lda[corpus]
2. Per-topic word probability matrix:
K = lda.num_topics
topicWordProbMat = lda.print_topics(K)
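As far as I know, lda[corpus] yields a sparse list of (topic_id, probability) pairs per document, and print_topics returns a formatted summary of the top words, not a numeric array. If you need dense numpy arrays, the following is a minimal sketch rather than a definitive recipe. The helper names doc_topic_matrix, topic_word_matrix and demo are made up for illustration, and the toy matrix is invented. It assumes a gensim version where LdaModel has get_document_topics and get_topics, and it assumes your numpy matrix has terms as rows and documents as columns, hence documents_columns=True.

```python
import numpy as np


def doc_topic_matrix(lda, corpus):
    """Return a dense (num_docs, num_topics) array of P(topic | document)."""
    rows = []
    for bow in corpus:
        row = np.zeros(lda.num_topics)
        # minimum_probability=0.0 asks gensim not to filter out low-probability topics;
        # any topic that is still omitted simply stays at 0 in the row.
        for topic_id, prob in lda.get_document_topics(bow, minimum_probability=0.0):
            row[topic_id] = prob
        rows.append(row)
    return np.array(rows)


def topic_word_matrix(lda):
    """Return a dense (num_topics, num_terms) array of P(word | topic)."""
    return np.asarray(lda.get_topics())


def demo():
    # gensim is imported here so the helpers above depend only on numpy.
    from gensim.matutils import Dense2Corpus
    from gensim.models import LdaModel

    # Invented toy term-document matrix: 4 terms (rows) x 3 documents (columns).
    term_doc = np.array([[2, 0, 1],
                         [0, 3, 0],
                         [1, 0, 2],
                         [0, 1, 0]])
    # Wrap the dense matrix so gensim sees one bag-of-words per column.
    corpus = Dense2Corpus(term_doc, documents_columns=True)
    lda = LdaModel(corpus, num_topics=2, random_state=0)

    p_d_t = doc_topic_matrix(lda, corpus)  # one row per document
    p_w_t = topic_word_matrix(lda)         # one row per topic
    return p_d_t, p_w_t
```

Calling demo() (with gensim installed) should return the two arrays p_d_t and p_w_t. Each row of p_w_t is a distribution over the vocabulary, and each row of p_d_t is a distribution over topics, up to whatever gensim filters out as negligible.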