
Retrieve topic-word array & document-topic array from LDA gensim

Situation:

I have a numpy term-document matrix, for example: [[0,1,0,0....],....[......0,0,0,0]].

I plugged the above matrix into gensim's LdaModel, and it works fine with lda = LdaModel(corpus, num_topics=10), where corpus is the term-document matrix mentioned above. I need two intermediate matrices (the topic-word array and the document-topic array) for research purposes.

1) per-document topic probability matrix (p_d_t)

2) per-topic word probability matrix (p_w_t)

Question:

How can I get those arrays from the gensim LdaModel() function? Kindly help me get those matrices.

1. Per-document topic probability matrix:

Apply a transformation to your corpus.

docTopicProbMat = lda[corpus]
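
Note that lda[corpus] does not return a dense array: for each document it yields a sparse list of (topic_id, probability) pairs, and topics below the model's minimum_probability threshold are left out. A minimal sketch of turning that output into a dense documents-by-topics matrix follows; the helper name dense_doc_topic and the toy input are illustrative assumptions, not part of gensim.

```python
def dense_doc_topic(doc_topics, num_topics):
    """Turn sparse rows [(topic_id, prob), ...] into dense rows of length num_topics."""
    matrix = []
    for sparse_row in doc_topics:
        row = [0.0] * num_topics          # topics that were omitted stay at 0.0
        for topic_id, prob in sparse_row:
            row[topic_id] = float(prob)
        matrix.append(row)
    return matrix


# Toy input in the same shape as the output of lda[corpus] (made-up values).
example = [[(0, 0.9), (2, 0.1)], [(1, 1.0)]]
p_d_t = dense_doc_topic(example, num_topics=3)
print(p_d_t)
```

With a trained model this would be called as dense_doc_topic(lda[corpus], lda.num_topics). If every topic should appear with its actual probability rather than 0.0, one option is to request the rows through lda.get_document_topics(bow, minimum_probability=0.0) for each document.
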

2. Per-topic word probability matrix:

K = lda.num_topics
# print_topics returns human-readable topic strings, not a numeric array
topicWordProbMat = lda.print_topics(K)
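
For a numeric topic-word matrix rather than formatted strings, recent gensim versions expose LdaModel.get_topics(), which returns an array of shape (num_topics, vocabulary_size) whose rows are word distributions. The sketch below assumes such a model; the helper names topic_word_rows and top_terms are hypothetical and only illustrate how the matrix could be consumed.

```python
def topic_word_rows(model):
    """Return the topic-word matrix as plain lists, one row per topic."""
    # Assumption: model is a trained gensim LdaModel, whose get_topics()
    # yields an array of shape (num_topics, vocabulary_size).
    matrix = model.get_topics()
    return [[float(p) for p in row] for row in matrix]


def top_terms(rows, n=3):
    """For each topic row, list the n term ids with the highest probability."""
    return [
        sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:n]
        for row in rows
    ]
```

With a trained model, p_w_t = topic_word_rows(lda) gives the matrix asked for in the question, and top_terms(p_w_t) gives term ids that can be mapped back to words through the model's id2word dictionary if one was supplied at training time.
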
