
'numpy.ndarray' object has no attribute 'columns'

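I am trying to find the feature importances for a random forest classification task, but it gives me the following error:

'numpy.ndarray' object has no attribute 'columns'

Here is part of my code: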

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline


# importing dataset

dataset=pd.read_csv('Churn_Modelling.csv')
X = dataset.iloc[:,3:12].values
Y = dataset.iloc[:,13].values

# splitting dataset into test set and train set

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size = 0.20)

from sklearn.ensemble import RandomForestRegressor

regressor = RandomForestRegressor(n_estimators=20, random_state=0)  
regressor.fit(X_train, y_train) 

#feature importance

feature_importances = pd.DataFrame(rf.feature_importances_,index = X_train.columns,columns=['importance']).sort_values('importance',ascending=False)


I expected this to give a feature importance score for each column of my dataset. (Note: the original data is in CSV format.)

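The X_train that comes out of train_test_split is actually a NumPy array, which never has a columns attribute. The root cause is one step earlier: when you create X from dataset, you ask for .values, which returns a numpy.ndarray rather than a DataFrame.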
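A minimal sketch of the point above, assuming pandas and scikit-learn are installed. The toy DataFrame and its column names are made up for illustration and are not the asker's data. It shows that `.values` drops the column labels, while passing the DataFrame itself to train_test_split keeps them.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the CSV: three feature columns and a target.
toy = pd.DataFrame({
    "feat_a": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "feat_b": [10, 20, 30, 40, 50, 60, 70, 80, 90, 100],
    "feat_c": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0],
    "target": [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
})

X_array = toy.iloc[:, 0:3].values  # .values strips the labels: numpy.ndarray
X_frame = toy.iloc[:, 0:3]         # without .values this is still a DataFrame

train_from_array, _ = train_test_split(X_array, test_size=0.2, random_state=0)
train_from_frame, _ = train_test_split(X_frame, test_size=0.2, random_state=0)

print(type(train_from_array).__name__)       # ndarray
print(type(train_from_frame).__name__)       # DataFrame
print(hasattr(train_from_array, "columns"))  # False
print(list(train_from_frame.columns))        # ['feat_a', 'feat_b', 'feat_c']
```
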

You need to change your line

feature_importances = pd.DataFrame(rf.feature_importances_,index = X_train.columns,columns=['importance']).sort_values('importance',ascending=False)

to

columns_ = dataset.iloc[:1, 3:12].columns

feature_importances = pd.DataFrame(regressor.feature_importances_,index = columns_,columns=['importance']).sort_values('importance',ascending=False)

(The fitted model in the question is named regressor, so the corrected line uses that name instead of rf.)
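An alternative to looking the column names up again is to never drop them. The sketch below is one possible way to do that, not the answerer's exact code: it keeps X as a DataFrame so that X_train.columns exists. The data is synthetic (random features named feat_a, feat_b, feat_c, all hypothetical), and RandomForestRegressor is used only because the question uses it.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic data (an assumption for illustration): the target depends on feat_a only.
rng = np.random.default_rng(0)
n = 200
data = pd.DataFrame({
    "feat_a": rng.normal(size=n),
    "feat_b": rng.normal(size=n),
    "feat_c": rng.normal(size=n),
})
data["target"] = (data["feat_a"] + 0.1 * rng.normal(size=n) > 0).astype(int)

X = data.iloc[:, 0:3]  # no .values, so X stays a DataFrame
y = data.iloc[:, 3]    # a Series

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=0
)

regressor = RandomForestRegressor(n_estimators=20, random_state=0)
regressor.fit(X_train, y_train)

# X_train is a DataFrame here, so X_train.columns is available.
feature_importances = pd.DataFrame(
    regressor.feature_importances_,
    index=X_train.columns,
    columns=["importance"],
).sort_values("importance", ascending=False)

print(feature_importances)
```
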

Use this:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline


# importing dataset

dataset=pd.read_csv('Churn_Modelling.csv')
X = dataset.iloc[:,3:12].values
Y = dataset.iloc[:,13].values

# splitting dataset into test set and train set

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size = 0.20)

from sklearn.ensemble import RandomForestRegressor

regressor = RandomForestRegressor(n_estimators=20, random_state=0)  
regressor.fit(X_train, y_train) 

#feature importance

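# use only the names of the feature columns (3..11) so the index length matches the importances
feature_importances = pd.DataFrame(regressor.feature_importances_,index = dataset.columns[3:12],columns=['importance']).sort_values('importance',ascending=False)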


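The iloc and loc indexers can only be applied to pandas objects such as DataFrames. You are applying them to an array. Solution: convert the array to a DataFrame, then apply iloc or loc.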
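The conversion suggested above might look like the following sketch. The array contents and the feature_names list are hypothetical. In the question's setting the names could come from dataset.columns[3:12].

```python
import numpy as np
import pandas as pd

# Hypothetical array, standing in for an X_train that has lost its labels.
X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
feature_names = ["feat_a", "feat_b"]  # assumed names, one per array column

# Wrap the array in a DataFrame to get .columns, .iloc and .loc back.
X_train_df = pd.DataFrame(X_train, columns=feature_names)

print(list(X_train_df.columns))  # ['feat_a', 'feat_b']
print(X_train_df.iloc[0, 1])     # 10.0
```
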

