How to Get Variable/Feature Importance From Tidymodels ranger object?

I have a ranger object from the tidymodels rand_forest function:

rf <- rand_forest(mode = "regression", trees = 1000) %>% fit(pay_rate ~ age+profession)

I want to get the feature importance of each variable (I have many more than in this example). I've tried things like rf$variable.importance, or importance(rf), but the former returns NULL and the latter function doesn't exist. I tried using the vip package, but that doesn't work for a ranger object. How can I extract feature importances from this object?

You need to add importance = "impurity" when you set the engine for ranger. This will provide variable importance scores. Once this is set, you can use extract_fit_parsnip with vip to plot the variable importance.

Small example:

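library(tidymodels)
library(vip)

# Model specification: importance = "impurity" is passed through to ranger
rf_mod <- rand_forest(mode = "regression", trees = 100) %>% 
  set_engine("ranger", importance = "impurity")

# Recipe: predict mpg from all other columns of mtcars
rf_recipe <- 
  recipe(mpg ~ ., data = mtcars) 

# Bundle the model and the recipe into a workflow
rf_workflow <- 
  workflow() %>% 
  add_model(rf_mod) %>% 
  add_recipe(rf_recipe)

# Fit, pull out the parsnip fit, and plot the variable importance
rf_workflow %>% 
  fit(mtcars) %>% 
  extract_fit_parsnip() %>% 
  vip(num_features = 10)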
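
If you want the numeric scores rather than a plot, something along these lines should work. This is only a sketch on the same mtcars data: it assumes a version of tidymodels recent enough to provide extract_fit_engine(), and the names rf_fit, ranger_obj, imp and imp_tbl are just illustrative.

```r
library(tidymodels)

# Same model specification as in the example: importance must be set on the engine
rf_mod <- rand_forest(mode = "regression", trees = 100) %>%
  set_engine("ranger", importance = "impurity")

# Fit a workflow on mtcars (a formula is used here instead of a recipe for brevity)
rf_fit <- workflow() %>%
  add_model(rf_mod) %>%
  add_formula(mpg ~ .) %>%
  fit(data = mtcars)

# Pull out the underlying ranger object; its importance vector is no longer NULL
ranger_obj <- extract_fit_engine(rf_fit)
imp <- sort(ranger_obj$variable.importance, decreasing = TRUE)

# Optional: arrange the named vector as a tibble for further processing
imp_tbl <- tibble(variable = names(imp), importance = unname(imp))
imp_tbl
```

For a model fitted directly from the parsnip specification, as in the question, extract_fit_engine() should return the ranger object in the same way, provided the engine was set with importance = "impurity" before fitting.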

More information is available in the tidymodels Get Started guide.
