How to iterate over pandas DataFrame columns and factorize based on a condition?

Question

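I'm trying to iterate over a DataFrame and conditionally factorize the data. I have a DataFrame of house-price information, and instead of string values I want them to be categories represented by numbers (e.g. mansion = 0, house = 1). However, some columns are already integers or floats, so I only want to factorize the string columns.

I want to factorize the data so I can use it with a Keras Sequential neural network, without manually checking each column and factorizing it myself.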

columns = list(dataframe)
for i in columns:
    if type(i)==str:
        xtrain.i = pd.Categorical(pd.factorize(dataframe.i)[0])

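I thought this would factorize the data, but I get the error

AttributeError: 'DataFrame' object has no attribute 'i'

and pandas doesn't recognize that I'm trying to refer to a column. For reference, the code below works in my script (MSZoning is one of the columns):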

xtrain.MSZoning = pd.Categorical(pd.factorize(xtrain.MSZoning)[0])

Any help or advice would be greatly appreciated!

Answer 1

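It should look more like this: index the column with dataframe[i] and test the column's dtype rather than the type of its name.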

for i in columns:
    if dataframe[i].dtypes=='object':
        xtrain[i] = pd.Categorical(pd.factorize(dataframe[i])[0])
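To see why the original loop fails: `type(i)==str` tests the column name (which is always a string here), and `dataframe.i` looks up an attribute literally called `i`. Below is a minimal, self-contained sketch of the bracket-indexing version. The toy data is made up for illustration, and using `pd.api.types.is_numeric_dtype` instead of comparing against `'object'` is my own substitution so the check doesn't depend on how a given pandas version stores strings.

```python
import pandas as pd

# Hypothetical toy data standing in for the house-price DataFrame
xtrain = pd.DataFrame({
    "MSZoning": ["RL", "RM", "RL", "FV"],
    "LotArea": [8450, 9600, 11250, 9550],
})

for col in xtrain.columns:
    # Check the dtype of the column's values, not the type of its name
    if not pd.api.types.is_numeric_dtype(xtrain[col]):
        # factorize returns (integer codes, unique values);
        # codes are assigned in order of first appearance
        codes, uniques = pd.factorize(xtrain[col])
        xtrain[col] = codes

print(xtrain)
```

Here `MSZoning` becomes 0, 1, 0, 2 and `LotArea` is left untouched.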

And since you are training an MLP, let's use LabelEncoder:

from sklearn import preprocessing
le = preprocessing.LabelEncoder()

for i in columns:
    if dataframe[i].dtypes=='object':
        dataframe[i] = le.fit_transform(dataframe[i])
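One thing to keep in mind with the loop above: a single `le` is refit on every column, so afterwards it only remembers the classes of the last column it saw. If you might need to map the numbers back to labels, a possible variation is to keep one encoder per column. This is just a sketch under that assumption; the `encoders` dict and the toy data are hypothetical names of mine, not part of the question.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data standing in for the house-price DataFrame
dataframe = pd.DataFrame({
    "MSZoning": ["RL", "RM", "RL", "FV"],
    "Street": ["Pave", "Grvl", "Pave", "Pave"],
    "LotArea": [8450, 9600, 11250, 9550],
})

encoders = {}  # hypothetical helper: column name -> its fitted LabelEncoder
for col in dataframe.columns:
    if not pd.api.types.is_numeric_dtype(dataframe[col]):
        le = LabelEncoder()
        # LabelEncoder numbers the classes in sorted order
        dataframe[col] = le.fit_transform(dataframe[col])
        encoders[col] = le

# Map the integer codes of one column back to the original labels
original = encoders["MSZoning"].inverse_transform(dataframe["MSZoning"])
print(list(original))
```

Note that LabelEncoder sorts the labels before numbering them, so the codes generally differ from what pd.factorize gives (which numbers by first appearance).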
