
How to get feature importance in RF

I am trying to get feature importances from a random forest (RF). I fit the model on the data like this:

model = RandomForestRegressor()
n = model.fit(self.X_train,self.y_train)
if n is not None:
   df = pd.DataFrame(data = n , columns = ["Feature","Importance_Score"])
   df["Feature_Name"] = np.array(self.X_Headers)
   df = df.drop(["Feature"], axis = 1)
   df[["Feature_Name","Importance_Score"]].to_csv("RF_Importances.csv", index = False)
   del df
            

However, the variable n ends up being None. Why is this happening?

I'm not sure how model.fit(self.X_train,self.y_train) is supposed to work in your code. We need more information about how you set up the model.
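For reference, scikit-learn's `fit` returns the fitted estimator itself, so its return value would not normally be `None`, and it is not an array of importances either. A minimal sketch of that behavior, using made-up data (the names `X_demo`, `y_demo`, `model_demo` and the chosen sizes are assumptions for illustration, not from the question):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical toy data, only for illustration
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 3))
y_demo = rng.normal(size=50)

model_demo = RandomForestRegressor(n_estimators=10, random_state=0)
fitted = model_demo.fit(X_demo, y_demo)

# fit() hands back the same estimator object, not the importances
same_object = fitted is model_demo

# The scores live on the fitted model: one value per input column
importances_demo = model_demo.feature_importances_
```

So the importances should be read from `feature_importances_` rather than from the return value of `fit`.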

If we set this up using simulated data, it works:

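import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Simulated data: 100 samples, 5 features of pure noise
np.random.seed(111)
X = pd.DataFrame(np.random.normal(0,1,(100,5)),columns=['A','B','C','D','E'])
y = np.random.normal(0,1,100)

model = RandomForestRegressor()
n = model.fit(X,y)  # fit() returns the fitted estimator itself
if n is not None:
    # Pair each column name with its importance score
    df = pd.DataFrame({'features':X.columns,'importance':n.feature_importances_})

df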
 
  features  importance
0        A    0.176091
1        B    0.183817
2        C    0.169927
3        D    0.267574
4        E    0.202591
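To get back to the CSV from the question, the importances can be paired with the feature names and written out directly. This is only a sketch: `feature_names` is a hypothetical stand-in for `self.X_Headers`, the data is simulated as above, and the column names and output filename are taken from the question.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

np.random.seed(111)
feature_names = ["A", "B", "C", "D", "E"]  # stand-in for self.X_Headers
X_train = np.random.normal(0, 1, (100, 5))
y_train = np.random.normal(0, 1, 100)

model = RandomForestRegressor(random_state=0)
model.fit(X_train, y_train)

# Build the table from the fitted model's attribute, not from fit()'s return value
importance_df = pd.DataFrame({
    "Feature_Name": feature_names,
    "Importance_Score": model.feature_importances_,
}).sort_values("Importance_Score", ascending=False)

importance_df.to_csv("RF_Importances.csv", index=False)
```

Sorting by score is optional; it just puts the most important features at the top of the file.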

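Note: technical posts on this site are licensed under CC BY-SA 4.0. If you repost, please cite the site URL or the original address. For any questions, contact: yoyou2525@163.com.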

 