Random forest feature importance using Python

Question

I am trying the random forest classifier code below. I get a NameError even though I have defined the variable. Please help.

def RFC_model(randomState, X_train, X_test, y_train, y_test):


   rand_forest = RandomForestClassifier()
   rand_forest.fit(X_train, y_train)
   forest_test_predictions = rand_forest.predict(X_test)
   print(accuracy_score(y_test, forest_test_predictions))

X_train, X_test, y_train, y_test = train_test_split(df_encoded.drop(['success'],axis='columns').values,      
                                                df_encoded.success, 
                                                test_size=0.2)

RFC_model(42, X_train, X_test, y_train, y_test)

0.994045375744328

rand_forest.feature_importances_.round(3)

NameError                                 Traceback (most recent call last)
<ipython-input-40-974786899b7f> in <module>
  1 #importance of features rounded to nearest 3 decimals
----> 2 rand_forest.feature_importances_.round(3)

NameError: name 'rand_forest' is not defined

Answer 1

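You are defining the variable rand_forest locally, inside the scope of the RFC_model function. Once the function returns, that local name no longer exists, and the fitted model is discarded because nothing else refers to it, so you cannot access it from outside the function. You can fix this by returning the rand_forest object and assigning it to a name in the calling scope: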

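from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def RFC_model(randomState, X_train, X_test, y_train, y_test):
    # randomState was accepted but never used in the question; pass it through as the seed
    rand_forest = RandomForestClassifier(random_state=randomState)
    rand_forest.fit(X_train, y_train)
    forest_test_predictions = rand_forest.predict(X_test)
    print(accuracy_score(y_test, forest_test_predictions))
    # Return the fitted model so the caller can keep a reference to it
    return rand_forest

X_train, X_test, y_train, y_test = train_test_split(
    df_encoded.drop(['success'], axis='columns').values,
    df_encoded.success,
    test_size=0.2)

# Bind the returned model to a name in the calling scope before using it
rand_forest = RFC_model(42, X_train, X_test, y_train, y_test)
rand_forest.feature_importances_.round(3)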
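
Once the fitted model is available outside the function, feature_importances_ is just an array of numbers in column order. Because the question passes `.values` to train_test_split, the column names are lost, so it may help to pair the importances with the names again. The sketch below is only an illustration: it uses synthetic data from make_classification and hypothetical column names (f0 to f3) in place of df_encoded, which is not shown in the question.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for df_encoded (hypothetical data, not from the question)
X, y = make_classification(
    n_samples=200, n_features=4, n_informative=2, n_redundant=0, random_state=42
)
feature_names = ["f0", "f1", "f2", "f3"]  # hypothetical column names

model = RandomForestClassifier(n_estimators=50, random_state=42)
model.fit(X, y)

# Label each importance with its column name and sort from most to least important
importances = (
    pd.Series(model.feature_importances_, index=feature_names)
    .sort_values(ascending=False)
    .round(3)
)
print(importances)
```

With the data from the question, the names could presumably be taken from `df_encoded.drop(['success'], axis='columns').columns` before calling `.values`.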

Solution 1
1 2019-12-12 13:09:41
