
What is the importance of random-state parameter in random forest algorithm?

Random forest has several parameters; one of them is random_state. I don't know what it is, what it does, or how important it is for the RF algorithm.

A random forest is essentially bagging applied to decision trees (usually with a random subset of features considered at each split as well), and it needs random numbers to generate the bootstrap samples on which the trees are fitted. The problem is that each time you run the program, the generator produces a different set of random numbers, which changes your bootstrap samples and, in turn, the trees fitted on them. To control this stochasticity and reproduce the same set of random numbers on every run, we use a random seed. random_state is the parameter that lets you set that seed for the random number generation in a random forest.
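To make the bootstrap part concrete, here is a minimal sketch using only Python's standard library. The helper name bootstrap_indices is made up for illustration and is not part of any random forest library; it only shows how a seed pins down which rows end up in a bootstrap sample.

```python
import random

def bootstrap_indices(n_rows, seed=None):
    """Draw n_rows row indices with replacement (one bootstrap sample).

    Hypothetical helper for illustration only.
    """
    # With seed=None the generator is seeded from the OS or the clock,
    # so every call can produce a different sample.
    rng = random.Random(seed)
    return [rng.randrange(n_rows) for _ in range(n_rows)]

# No seed: the two samples will almost always differ.
unseeded_a = bootstrap_indices(10)
unseeded_b = bootstrap_indices(10)

# Same seed: the two samples are identical.
seeded_a = bootstrap_indices(10, seed=42)
seeded_b = bootstrap_indices(10, seed=42)

print(unseeded_a, unseeded_b)
print(seeded_a == seeded_b)
```

A tree fitted on seeded_a would see exactly the same rows as one fitted on seeded_b, which is why fixing the seed fixes the trees.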

The main reason to set a random seed is reproducibility of the experiment. It is good practice to set a seed before you start building your model, so that every time you build the model on the same data with the same settings you get exactly the same model.
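As a rough illustration of that reproducibility, the sketch below assumes you are using scikit-learn (the question does not say which library, but random_state is the name scikit-learn uses). The synthetic dataset, the seed value 42 and the helper fit_and_predict are arbitrary choices for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Small synthetic dataset, itself generated with a fixed seed.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

def fit_and_predict(seed):
    """Fit a forest with the given random_state and return its class probabilities."""
    model = RandomForestClassifier(n_estimators=25, random_state=seed)
    model.fit(X, y)
    return model.predict_proba(X)

# Two separate fits with the same seed on the same data.
same_seed_match = np.array_equal(fit_and_predict(42), fit_and_predict(42))
print(same_seed_match)  # should print True
```

If you leave random_state at its default of None, two fits may give slightly different probabilities, which makes results harder to compare between runs.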

Setting a random seed is not specific to random forests: implementations of most algorithms that rely on random numbers (neural networks, decision trees, etc.) expose a similar parameter.

Hope this helps!
