
Using LDA Topic Models as a Classification Model Input

I trained an LDA topic model on a large training data set. Now I want to use this model to classify new sentences that were not part of the training data.

How can I find the closest topic number for a new input sentence?

Should I use LDA Topic Models as a Classification Model Input?

Example code in Python is welcome.

In a classification problem the ground-truth labels are known, so the only question is how to extract features from the training data. With LDA, the feature is usually the topic probability distribution: for example, if the corpus has 5 topics, the feature vector has 5 dimensions. That should be a better feature than the closest topic number (the most probable topic) alone.
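To illustrate why the full distribution carries more information than the most probable topic, here is a minimal sketch in plain Python. The two distributions are made-up numbers, not the output of a real model, and `most_probable_topic` is a hypothetical helper.

```python
# Hypothetical topic distributions for two documents over 3 topics.
# These numbers are invented for illustration, not produced by a model.
doc_a = [0.50, 0.45, 0.05]
doc_b = [0.50, 0.05, 0.45]


def most_probable_topic(dist):
    """Return the index of the topic with the highest probability."""
    return max(range(len(dist)), key=lambda k: dist[k])


top_a = most_probable_topic(doc_a)
top_b = most_probable_topic(doc_b)

# Both documents share the same most probable topic...
same_top_topic = top_a == top_b
# ...but their full distributions differ, so a classifier that sees the
# whole vector can still tell them apart.
same_features = doc_a == doc_b

print(top_a, top_b, same_top_topic, same_features)
```

Reducing each document to its top topic would make these two documents identical, while the 3-dimensional vectors keep them distinct.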

To get the topic probability distribution for a new input sentence, take a look here; other packages should have similar functions.
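As one possible illustration, the sketch below uses scikit-learn, which is an assumption on my part since the question does not say which package was used. The toy corpus, the labels, and the choice of 2 topics and `LogisticRegression` are all hypothetical. The key point is that the already fitted vectorizer and LDA model are reused with `transform` on the new sentence rather than being refitted.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Toy labeled corpus (hypothetical data, for illustration only).
train_texts = [
    "the team won the football match",
    "the striker scored a late goal",
    "the coach praised the players after the game",
    "the central bank raised interest rates",
    "stocks fell as investors sold shares",
    "the market rallied on strong earnings",
]
train_labels = ["sports", "sports", "sports", "finance", "finance", "finance"]

# LDA works on term counts, so build a bag-of-words matrix first.
vectorizer = CountVectorizer(stop_words="english")
X_counts = vectorizer.fit_transform(train_texts)

# Fit LDA; fit_transform returns one topic distribution per training document.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
train_topics = lda.fit_transform(X_counts)

# Train a classifier on the topic distributions (the feature vectors).
clf = LogisticRegression()
clf.fit(train_topics, train_labels)

# New sentence: reuse the fitted vectorizer and LDA model (transform, not fit).
new_texts = ["investors bought shares before the earnings report"]
new_topics = lda.transform(vectorizer.transform(new_texts))

# Closest topic number = index of the largest probability.
closest_topic = int(np.argmax(new_topics[0]))
# Predicted class label from the classifier trained on topic features.
predicted_label = clf.predict(new_topics)[0]

print(new_topics[0], closest_topic, predicted_label)
```

With a corpus this small the topics and the prediction are not meaningful; the sketch only shows the shape of the pipeline. Note that the topic number is an arbitrary index assigned by LDA, so it identifies a topic but is not itself a class label.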
