简体   繁体   中英

Why random_state parameter is used in NMF and LDA algorithm ? What are the benefits of using random topics generated every time?

For Topic Modelling ,
Why random_state parameter is used in NMF and LDA algorithm ?
What are the benefits of using random topics generated every time ?

The algorithms for both are stochastic - meaning they use randomness as a part of estimating a good answer. It's done that way to make it tractable, and in the case of LDA, the whole model is stochastic, providing you ideally with a probabilistic distribution (called "the posterior distribution") of answers, but instead providing a single, likely answer as an estimate.

So the answer is that using randomness in the algorithms makes a tremendously difficult problem much simpler and feasible to calculate in less than a hundred years.

If you're going to use them, I think it would do you well to study them, learn something of how they work, why they work. Using a tool that you don't understand is risky, as you don't really know what the result the tool provides actually means. One example is the numerors words in all "topics" with very low probability. The differences in these probabilities are actually meaningless - given a different sample from the posterior, you'd get different probabilities, ranked differently between words.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM