
Interpreting Variable Importance from Random Forest in GEE

This is more of a theoretical/function question. I'm doing a land cover classification in Google Earth Engine (GEE) using random forest and need to report variable importance. Does anyone know how to interpret the variable importance produced by the random forest algorithm in GEE?

In terms of code, I obtained the importance values as follows:

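// Train a 1000-tree random forest on the training samples.
var RFmodel = ee.Classifier.smileRandomForest(1000).train(trainingData, 'classID', predictionBands);

// explain() returns a dictionary describing the trained classifier.
var RFexp = RFmodel.explain();

// Wrap the 'importance' entry in a geometry-less feature so it can be printed.
var VarImp = ee.Feature(null, ee.Dictionary(RFexp).get('importance'));
print('Variable Importance:', VarImp);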

However, the resulting importance values range from 0 to 60. This doesn't look like the "Mean Decrease in Accuracy" or "Gini Index" importance measures from the randomForest package in R (which I'm more familiar with). So I'm not really sure what these values mean in terms of variable "importance". Can anyone please help me understand this?

Thanks in advance!

In GEE, I believe this is the sum of the decrease in Gini impurity index over all trees in the forest. In R, I believe it is a weighted mean, so the difference is mean vs. sum. How many trees are you using in GEE? Your code passes 1000 to smileRandomForest, and a sum grows with the tree count. Also see the GEE code notes for further info, e.g. line 126: RandomForest.java#L120
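If the values really are sums over trees, as suggested above, one way to make them easier to report is to rescale them. The sketch below is plain JavaScript and only illustrates the arithmetic: the band names and numbers in rawImportance are made up, and toRelativeImportance and toPerTreeMean are hypothetical helper names, not part of the Earth Engine API. In the Code Editor, the server-side dictionary would first have to be brought client-side (for example with evaluate), which is not shown here.

```javascript
// Hypothetical raw importances, shaped like the 'importance' entry of
// explain() once it is available client-side. The values are invented.
const rawImportance = { B2: 12, B3: 18, B4: 30, NDVI: 60 };

// Rescale so the values sum to 100. This removes the dependence on the
// number of trees, assuming the raw values are sums over all trees.
function toRelativeImportance(importance) {
  const total = Object.values(importance).reduce((sum, v) => sum + v, 0);
  const relative = {};
  for (const [band, value] of Object.entries(importance)) {
    relative[band] = total > 0 ? (100 * value) / total : 0;
  }
  return relative;
}

// Divide by the tree count to turn a sum over trees into a per-tree mean.
function toPerTreeMean(importance, numberOfTrees) {
  const mean = {};
  for (const [band, value] of Object.entries(importance)) {
    mean[band] = value / numberOfTrees;
  }
  return mean;
}

console.log(toRelativeImportance(rawImportance));
console.log(toPerTreeMean(rawImportance, 1000));
```

The relative form only ranks the predictors against each other within one model; it does not make the numbers comparable to Mean Decrease in Accuracy from R.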

Good luck, and I'm also interested in further explanation if someone can elaborate.
