How to iterate over pandas DataFrame columns and conditionally factorize them?

Question

I am trying to iterate over a DataFrame and conditionally factorize its data. The DataFrame holds information about house prices, and instead of representing values as strings I would like them to be categories represented by numbers (e.g. mansion = 0, house = 1). However, some columns are already integers or floats, so I only want to categorize the columns that contain strings.

I want to factorize the data so I can use it with a Keras Sequential neural network without manually going through each column and factorizing it myself.

columns = list(dataframe)
for i in columns:
    if type(i)==str:
        xtrain.i = pd.Categorical(pd.factorize(dataframe.i)[0])

I thought this would factorize the data, but I get the error

AttributeError: 'DataFrame' object has no attribute 'i', and pandas does not recognize that I am trying to refer to the column held in the loop variable. For reference, the following line works in my code (MSZoning is one of the columns):

xtrain.MSZoning = pd.Categorical(pd.factorize(xtrain.MSZoning)[0])

Any help or advice would be much appreciated!

Answer 1

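It should look more like this. Writing dataframe.i looks for a column literally named i, so index with dataframe[i] instead, and check the dtype of the column rather than the type of the column name: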

for i in columns:
    if dataframe[i].dtypes=='object':
        xtrain[i] = pd.Categorical(pd.factorize(dataframe[i])[0])
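As a self-contained illustration of the loop above, here is a minimal sketch on a made-up DataFrame. The columns HouseType, LotArea and Price and all values are hypothetical; only MSZoning comes from the question. It tests for non-numeric columns with pd.api.types.is_numeric_dtype rather than comparing against 'object', on the assumption that string columns may not have object dtype on every pandas version.

```python
import pandas as pd

# Hypothetical toy data standing in for the house-price DataFrame.
df = pd.DataFrame({
    "MSZoning": ["RL", "RM", "RL", "FV"],
    "HouseType": ["mansion", "house", "house", "mansion"],
    "LotArea": [8450, 9600, 11250, 9550],
    "Price": [208500.0, 181500.0, 223500.0, 140000.0],
})

encoded = df.copy()
mappings = {}  # column name -> list of original labels, indexed by code

for col in encoded.columns:
    # Only touch columns that are not already numeric.
    if not pd.api.types.is_numeric_dtype(encoded[col]):
        # factorize numbers the labels in order of first appearance.
        codes, uniques = pd.factorize(encoded[col])
        encoded[col] = codes
        mappings[col] = list(uniques)

print(encoded)
print(mappings)
```

Keeping the mappings dictionary around makes it possible to translate a code back to its label later, which the one-line version in the answer discards.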

And since you are building an MLP, you could use LabelEncoder from scikit-learn instead:

from sklearn import preprocessing
le = preprocessing.LabelEncoder()

for i in columns:
    if dataframe[i].dtypes=='object':
        dataframe[i] = le.fit_transform(dataframe[i])
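One thing the loop above does not keep is the fitted encoder for each column, because the single le object is refitted on every iteration. The following is a rough sketch, on hypothetical toy data, of a variant that stores one LabelEncoder per column so the codes can be decoded later with inverse_transform. The encoders dictionary and the LotArea column are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data; only the MSZoning name comes from the question.
df = pd.DataFrame({
    "MSZoning": ["RL", "RM", "RL", "FV"],
    "LotArea": [8450, 9600, 11250, 9550],
})

encoders = {}  # column name -> fitted LabelEncoder

for col in df.columns:
    # Skip columns that are already integers or floats.
    if not pd.api.types.is_numeric_dtype(df[col]):
        le = LabelEncoder()  # fresh encoder per column
        df[col] = le.fit_transform(df[col])
        encoders[col] = le

# Recover the original labels from the integer codes.
restored = encoders["MSZoning"].inverse_transform(df["MSZoning"])
print(df)
print(list(restored))
```

Note that LabelEncoder assigns codes in sorted label order, whereas pd.factorize assigns them in order of first appearance, so the two approaches can produce different integers for the same column.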
