
Extract Word Saliency from Gensim LDA or pyLDAvis

Original 2021-10-15 01:46:19 2 1 gensim/ lda/ topic-modeling/ pyldavis

I see that pyLDAvis visualizes each word's saliency under each topic.

Is there a way to extract each word's saliency under each topic? Or can each word's saliency be calculated directly from a Gensim LDA model?

Ultimately, I want a pandas DataFrame in which each row represents one word, each column represents one topic, and each value is the word's saliency under the corresponding topic.

Many thanks in advance.

1 answer

Gensim's LDA implementation does not have out-of-the-box support for this particular 'saliency' calculation from Chuang et al. (2012).

Still, I suspect the model's .get_term_topics() and/or .get_topic_terms() methods provide the supporting data needed to implement that calculation. In particular, one or the other of those methods might provide the P(w | t) term, but a deeper read of the paper would be required to know for sure. (I suspect the P(t) term might require a separate survey of the training data.)
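For reference, here is how I read the definitions in Chuang et al. (2012); treat this as a sketch to check against the paper rather than an authoritative statement. Note that saliency, as defined there, is a single number per word and not a per-topic value. P(t | w) is not given directly by a topic-word matrix, so the last line assumes it is obtained via Bayes' rule from P(w | t) and P(t).

```latex
\begin{aligned}
\text{distinctiveness}(w) &= \sum_{t} P(t \mid w)\,\log\frac{P(t \mid w)}{P(t)} \\
\text{saliency}(w) &= P(w)\cdot\text{distinctiveness}(w) \\
P(t \mid w) &= \frac{P(w \mid t)\,P(t)}{\sum_{t'} P(w \mid t')\,P(t')}
\end{aligned}
```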

From the class docs:

https://radimrehurek.com/gensim/models/ldamodel.html#gensim.models.ldamodel.LdaModel.get_term_topics

Returns: The relevant topics represented as pairs of their ID and their assigned probability, sorted by relevance to the given word.

https://radimrehurek.com/gensim/models/ldamodel.html#gensim.models.ldamodel.LdaModel.get_topic_terms

Returns: Word ID - probability pairs for the most relevant words generated by the topic.
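As a rough illustration of how the saliency calculation might be assembled into the words-by-topics DataFrame the question asks for, here is a minimal sketch that works on plain arrays. Everything in it is my own assumption, not Gensim or pyLDAvis API: the function name saliency_frame is hypothetical, P(t) must be supplied by the caller, and the per-topic columns are simply the individual summands of the distinctiveness sum scaled by P(w), since the paper only defines one saliency value per word.

```python
import numpy as np
import pandas as pd


def saliency_frame(topic_term, topic_probs, vocab):
    """Hypothetical helper: per-word saliency, split into per-topic summands.

    topic_term  : (num_topics, vocab_size) array, row t holds P(w | t)
    topic_probs : (num_topics,) array holding P(t); an input here, because
                  estimating it needs the training corpus
    vocab       : sequence of vocab_size words, in column order
    """
    topic_term = np.asarray(topic_term, dtype=float)
    topic_probs = np.asarray(topic_probs, dtype=float)
    topic_probs = topic_probs / topic_probs.sum()  # normalise, just in case

    joint = topic_term * topic_probs[:, None]      # P(w, t)
    word_probs = joint.sum(axis=0)                 # P(w)

    with np.errstate(divide="ignore", invalid="ignore"):
        topic_given_word = joint / word_probs      # P(t | w) via Bayes' rule
        terms = topic_given_word * np.log(topic_given_word / topic_probs[:, None])
    # Treat 0 * log(0) (and words with P(w) = 0) as contributing nothing.
    terms = np.nan_to_num(terms, nan=0.0, posinf=0.0, neginf=0.0)

    # Assumption: "saliency under topic t" = P(w) * (summand for topic t).
    per_topic = (word_probs * terms).T             # (vocab_size, num_topics)
    columns = [f"topic_{t}" for t in range(len(topic_probs))]
    frame = pd.DataFrame(per_topic, index=list(vocab), columns=columns)
    frame["saliency"] = frame[columns].sum(axis=1)  # the per-word total
    return frame
```

With a trained model in a variable such as lda (a hypothetical name), the topic-word matrix could come from lda.get_topics() and the vocabulary from lda.id2word. P(t) would still have to be estimated separately, for example by averaging per-document topic weights over the training corpus.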

I hadn't come across this particular 'saliency' calculation before, but if it is popular among LDA users, or of potential general use, and you figure out how to calculate it, it'd likely be a welcome contribution to the Gensim project, especially if it can be a simple extra convenience method on LdaModel.

is possible to extract bow from gensim lda model

Is there any way to match Gensim LDA output with topics in pyLDAvis graph?

How to remove a word in LDA analysis by gensim

Extract Topic Scores for Documents LDA Gensim Python

pyLDAvis visualization from gensim not displaying the result in google colab

retrieve topic-word array & document-topic array from lda gensim

LDA Gensim Word -> Topic Ids Distribution instead of Topic -> Word Distribution

What is the impact of word frequency on Gensim LDA Topic modelling

How to get the topic-word probabilities of a given word in gensim LDA?

Extracting Topic distribution from gensim LDA model


The technical post webpages of this site follow the CC BY-SA 4.0 license. If you need to reprint, please indicate the site URL or the original address. For any questions, please contact: yoyou2525@163.com.


solution1 0 2021-10-15 03:27:38
