How does LDA (Latent Dirichlet Allocation) inference in `gensim` work for new data?

Question

I am training an LDA model with gensim and predicting on a test corpus with `ldamodel[doc_term_matrix_test]`. It works fine, but I don't understand how the prediction is actually done with the trained model, that is, what `ldamodel[doc_term_matrix_test]` is doing.

Here is the code:

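import gensim
from gensim import corpora

# train and test are assumed to be lists of tokenized documents (lists of strings).
# Build one dictionary per split, then merge the test vocabulary into the training one.
dictionary2 = corpora.Dictionary(test)
dictionary = corpora.Dictionary(train)
dictionary.merge_with(dictionary2)

# Convert every document to a bag-of-words vector using the merged dictionary.
doc_term_matrix2 = [dictionary.doc2bow(doc) for doc in test]
doc_term_matrix = [dictionary.doc2bow(doc) for doc in train]

Lda = gensim.models.ldamodel.LdaModel
ldamodel = Lda(doc_term_matrix, num_topics=2, id2word=dictionary,
               random_state=100, iterations=50, passes=1)

# ldamodel[corpus] yields, per test document, a list of (topic_id, probability) pairs;
# the documents are then sorted by their second pair, x[1].
topics = sorted(ldamodel[doc_term_matrix2],
                key=lambda x: x[1],
                reverse=True)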

Answer 1

To quote the gensim docs on ldamodel:

This module allows both LDA model estimation from a training corpus and inference of topic distribution on new, unseen documents.

So what your code does is not quite "prediction" but rather inference: for every test document T, the trained LDA model yields an estimate of the topic distribution of T.
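To make "inference" more concrete, here is a minimal pure-Python sketch of the kind of computation involved: a variational E-step that holds the trained topic-word probabilities fixed and only fits a per-document Dirichlet parameter (gamma), which is then normalized into a topic distribution. As far as I can tell from the gensim source, `ldamodel[bow]` goes through `get_document_topics` and `inference` and does something along these lines, but this is not gensim's code. The names `digamma`, `infer_topics`, `topic_word` and `alpha` are made up for illustration, the topic-word matrix is a toy point estimate, and gamma starts from a constant instead of gensim's random initialization.

```python
import math


def digamma(x):
    """Approximate the digamma function for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    while x < 6.0:  # shift x upward until the asymptotic series is accurate
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 * inv - series


def infer_topics(bow, topic_word, alpha, iterations=50, threshold=1e-3):
    """Estimate the topic distribution of one unseen document.

    bow        -- list of (word_id, count) pairs, as produced by doc2bow
    topic_word -- K rows of word probabilities; stays fixed during inference
    alpha      -- list of K Dirichlet prior weights
    """
    num_topics = len(topic_word)
    gamma = [1.0] * num_topics  # assumption: constant start, for determinism
    for _ in range(iterations):
        # exp(E[log theta_k]) under the current Dirichlet(gamma)
        total = digamma(sum(gamma))
        exp_elog_theta = [math.exp(digamma(g) - total) for g in gamma]

        # Softly assign each word occurrence to topics and accumulate the counts.
        new_gamma = list(alpha)
        for word_id, count in bow:
            weights = [exp_elog_theta[k] * topic_word[k][word_id]
                       for k in range(num_topics)]
            norm = sum(weights) + 1e-100
            for k in range(num_topics):
                new_gamma[k] += count * weights[k] / norm

        mean_change = sum(abs(a - b) for a, b in zip(gamma, new_gamma)) / num_topics
        gamma = new_gamma
        if mean_change < threshold:  # stop once gamma has converged
            break

    gamma_sum = sum(gamma)
    return [(k, g / gamma_sum) for k, g in enumerate(gamma)]


# Toy model: topic 0 prefers word 0, topic 1 prefers word 1.
topic_word = [[0.9, 0.1],
              [0.1, 0.9]]
alpha = [0.5, 0.5]

doc_a = infer_topics([(0, 5)], topic_word, alpha)  # document made of word 0 only
doc_b = infer_topics([(1, 5)], topic_word, alpha)  # document made of word 1 only
print(doc_a)
print(doc_b)
```

The trained topics never change here; only the document's own topic mixture is estimated, which is why the answer calls this inference rather than prediction. In gensim itself, the topic distribution for a single bag-of-words vector can be requested with `ldamodel.get_document_topics(bow, minimum_probability=0.0)` to see every topic, including the low-probability ones that `ldamodel[bow]` filters out by default.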
