
How to determine which document falls under a particular topic after applying topic modelling techniques like NMF, LDA, BERTopic?

Is there a way to map the topics generated by LDA, NMF, and BERTopic back to the list of documents and identify which topic each document belongs to?

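I am not an expert in NMF, and I last tried LDA 3-4 years ago, but I am familiar with BERTopic. When you fit a BERTopic model to your data, you get two outputs: topics and probs (the full document-topic probabilities are computed only if you set calculate_probabilities=True). topics holds one topic id per document, in the same order as the input documents, so it tells you directly which document is assigned to which topic. For example:

    from bertopic import BERTopic

    # documents is your list of strings, one per document
    topic_model = BERTopic(calculate_probabilities=True)
    topics, probs = topic_model.fit_transform(documents)
    print(topics)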

Example: with 10 documents and 3 topics retrieved (-1, 0, 1), printing topics gives [1, 0, -1, -1, 0, 0, 0, 1, 0, 1]. This means document0 is assigned to topic 1, document1 to topic 0, document2 and document3 to topic -1 (i.e. outliers), and so on. Hope it helps a bit.
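To make that mapping explicit, here is a minimal sketch in plain Python that pairs each document with its topic id and then groups the documents by topic. The document names are hypothetical placeholders; only the topic list is taken from the example above. With a real model you would use your own documents list and the topics returned by fit_transform.

```python
from collections import defaultdict

# Hypothetical placeholder documents; in practice this is the list passed to fit_transform.
documents = [f"document{i}" for i in range(10)]

# Topic ids from the example above, one per document, in the same order.
topics = [1, 0, -1, -1, 0, 0, 0, 1, 0, 1]

# Document -> topic lookup, built by pairing the two lists position by position.
doc_to_topic = dict(zip(documents, topics))

# Topic -> documents lookup (the inverse mapping).
topic_to_docs = defaultdict(list)
for doc, topic in zip(documents, topics):
    topic_to_docs[topic].append(doc)

for topic in sorted(topic_to_docs):
    label = "outliers" if topic == -1 else f"topic {topic}"
    print(label, topic_to_docs[topic])
```

The same pairing works for any model that gives you one topic id per document, as long as the ids are in the same order as the documents.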


 