
What is the importance of the random_state parameter in the random forest algorithm?

Random forest has several parameters; one of them is random_state. I don't know what it is, what it does, or how important it is for the RF algorithm.

A random forest is essentially bagging applied to decision trees, and it needs random numbers to generate the random samples (bootstrap samples) on which the trees are fitted. This creates a problem: unless the generator is seeded, each run produces a completely different set of random numbers, which changes the bootstrap samples and, in turn, the trees fitted on them. To control this stochasticity and reproduce the same set of random numbers every time, we use a random seed. random_state is the parameter that lets you set that seed for the random number generation inside a random forest.
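To make the effect of a seed on bootstrap sampling concrete, here is a minimal sketch using only the Python standard library. It is not how any particular random forest library draws its samples; the helper name bootstrap_indices and the sizes used are made up for illustration.

```python
import random

def bootstrap_indices(n_samples, n_trees, seed=None):
    """Hypothetical helper: draw one bootstrap sample per tree.

    Each sample is a list of row indices chosen with replacement.
    """
    # With seed=None the generator is seeded from the OS or the clock,
    # so the indices differ from run to run.
    rng = random.Random(seed)
    return [
        [rng.randrange(n_samples) for _ in range(n_samples)]
        for _ in range(n_trees)
    ]

# Same seed: the same bootstrap samples, so the same trees would be fitted.
seeded_a = bootstrap_indices(n_samples=10, n_trees=3, seed=42)
seeded_b = bootstrap_indices(n_samples=10, n_trees=3, seed=42)
print(seeded_a == seeded_b)  # True

# No seed: the two draws will almost certainly differ.
unseeded_a = bootstrap_indices(n_samples=10, n_trees=3)
unseeded_b = bootstrap_indices(n_samples=10, n_trees=3)
print(unseeded_a == unseeded_b)
```

The seed plays the same role here that random_state plays in a random forest: it fixes the sequence of random numbers, and everything derived from that sequence follows.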

One main reason to set a random seed is replicability of the experiment. It is good practice to set a seed before you start building your model, so that every time you build it with the same data and the same settings you get exactly the same model.
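Assuming the random forest in question is scikit-learn's RandomForestClassifier (the question does not name a library, but random_state is the name scikit-learn uses), replicability can be checked roughly like this. The synthetic dataset, the seed value 7, and the helper fit_forest are arbitrary choices for the sketch.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data, generated with its own fixed seed so the inputs are identical.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

def fit_forest(seed):
    """Hypothetical helper: fit a small forest with the given random_state."""
    model = RandomForestClassifier(n_estimators=25, random_state=seed)
    return model.fit(X, y)

# Two independent fits with the same seed on the same data.
proba_a = fit_forest(seed=7).predict_proba(X)
proba_b = fit_forest(seed=7).predict_proba(X)

# Identical seeds and data should give identical predicted probabilities.
same_seed_match = np.array_equal(proba_a, proba_b)
print(same_seed_match)
```

Leaving random_state at its default of None lets the forest change between runs, which is usually fine for exploration but makes results harder to compare or report.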

Setting a random seed is not restricted to random forests: any algorithm that relies on random numbers (neural networks, decision trees, etc.) typically exposes a similar parameter.

Hope this helps!
