How to get the document_topics distribution for all documents in Gensim LDA?

Question

I'm new to Python and I need to build an LDA project. After some preprocessing steps, here is my code:

from gensim.corpora import Dictionary
from gensim.models import LdaModel

# docs is the preprocessed input: a list of token lists
dictionary = Dictionary(docs)
corpus = [dictionary.doc2bow(doc) for doc in docs]

num_topics = 10
chunksize = 2000
passes = 20
iterations = 400
eval_every = None  # skip perplexity evaluation during training

temp = dictionary[0]  # accessing an item forces the lazy id2token mapping to be built
id2word = dictionary.id2token
model = LdaModel(corpus=corpus, id2word=id2word, chunksize=chunksize,
                 alpha='auto', eta='auto',
                 random_state=42,
                 iterations=iterations, num_topics=num_topics,
                 passes=passes, eval_every=eval_every)

I want the topic distribution for every document, i.e. the probabilities of all 10 topics for each document, but when I use:

get_document_topics = model.get_document_topics(corpus)
print(get_document_topics)

the only output is:

<gensim.interfaces.TransformedCorpus object at 0x000001DF28708E10>

How do I get the topic distribution of the documents?

Answer 1

The function get_document_topics is normally called on a single document in bag-of-words (BOW) format. You're calling it on the full corpus (a list of documents), so it returns a lazy iterable that yields the topic scores for each document, which is why print shows only the object itself.

You have a few options. If you just want one document, call it on that document:

get_document_topics = model.get_document_topics(corpus[0])

Or do the following to get a list of scores for all the documents:

get_document_topics = [model.get_document_topics(item) for item in corpus]

Or index directly into the object returned by your original code:

get_document_topics = model.get_document_topics(corpus)
print(get_document_topics[0])
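
One thing to watch for if you need all 10 probabilities: each document comes back as a sparse list of (topic_id, probability) pairs, and by default get_document_topics drops topics whose probability is below the model's minimum_probability, so a document may return fewer than 10 pairs. Passing minimum_probability=0.0 should keep nearly all of them. Below is a minimal sketch of turning the sparse lists into fixed-length rows; to_dense is a hypothetical helper name and the sample data is made up for illustration, not real model output.

```python
def to_dense(doc_topics, num_topics):
    """Turn sparse [(topic_id, prob), ...] lists into fixed-length rows."""
    rows = []
    for pairs in doc_topics:
        row = [0.0] * num_topics  # topics missing from the sparse list stay at 0.0
        for topic_id, prob in pairs:
            row[topic_id] = float(prob)
        rows.append(row)
    return rows


# Hypothetical sparse output for two documents and 4 topics (illustration only).
sample = [
    [(0, 0.7), (2, 0.3)],
    [(1, 0.5), (3, 0.5)],
]
dense = to_dense(sample, num_topics=4)
print(dense)

# Assumed usage with the model from the question (not run here):
# all_topics = model.get_document_topics(corpus, minimum_probability=0.0)
# dense = to_dense(all_topics, num_topics=model.num_topics)
```

Each row then has one entry per topic, in topic-id order, which is convenient if you want to load the result into a NumPy array or a DataFrame.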

Solution 1
1 2018-11-15 08:41:01
