
How to create dummy variables within a loop in Python?

So I have a dataframe with a bunch of features. Some of them I want to turn into dummy variables and some I want to leave alone, and I wanted a lazier/faster way to do this than typing out each one:

dum_A = pd.get_dummies(df['A'],prefix='A')
dum_B = pd.get_dummies(df['B'],prefix='B')
...
dum_N = pd.get_dummies(df['N'],prefix='N')

So this is the code I came up with:

List_of_dummy_names = []
List_of_dummy_col = []

for col in list(df1.columns.values):
     if len(df1[col].value_counts()) <= 7:
        List_of_dummy_names.append('dum_'+col)
        List_of_dummy_col.append(col)

for (dummy, col) in zip(List_of_dummy_names, List_of_dummy_col):
    dummy = pd.get_dummies(df1[col], prefix=col)

But this only leaves the variable dummy holding the dummy dataframe for the nth (last) feature in the lists. What am I doing wrong here? I thought each iteration would pick up a new name from the list; instead it looks like it is assigning the new dummy DF to the variable dummy every time.

Many thanks in advance, guys.
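
The loop in the question rebinds the single name dummy on every pass, so the strings in List_of_dummy_names are never used as variable names. One common workaround is to keep the dummy frames in a dict keyed by the intended name. A minimal sketch, assuming a made-up sample df1 and a hypothetical dict called dummies (neither is from the original post), with the threshold lowered to 3 so the toy data shows the filter:

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's df1.
df1 = pd.DataFrame({
    "A": ["x", "y", "x", "z"],
    "B": ["u", "u", "v", "v"],
    "C": range(4),
})

# The question uses 7 as the cut-off; 3 is used here so that C is excluded.
max_levels = 3
dummy_cols = [col for col in df1.columns if df1[col].nunique() <= max_levels]

# Store each dummy frame under its own key instead of rebinding one variable.
dummies = {
    "dum_" + col: pd.get_dummies(df1[col], prefix=col)
    for col in dummy_cols
}
```

Each frame is then available as dummies["dum_A"], dummies["dum_B"], and so on, which avoids creating variables dynamically.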

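import pandas as pd

# Snapshot the column names first, because df gains columns inside the loop.
for col in list(df.columns):
    # Treat columns with at most 7 distinct values as categorical.
    if df[col].nunique() <= 7:
        # axis=1 appends the dummy columns side by side;
        # axis=0 would stack them as extra rows full of NaN.
        df = pd.concat([df, pd.get_dummies(df[col], prefix=col)], axis=1)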
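
As an alternative to looping, pd.get_dummies accepts a whole dataframe together with a columns argument, so the low-cardinality columns can be encoded in one call. A rough sketch under the same assumptions as the question (the sample df and the names low_card and encoded are illustrative only, and the threshold is lowered to 3 to suit the toy data):

```python
import pandas as pd

# Hypothetical sample data; replace with the real dataframe.
df = pd.DataFrame({
    "A": ["x", "y", "x", "z"],
    "B": ["u", "u", "v", "v"],
    "C": [0.1, 0.2, 0.3, 0.4],
})

# Pick the columns with few distinct values (the question uses 7 as the limit).
low_card = [col for col in df.columns if df[col].nunique() <= 3]

# Encode only those columns; the remaining columns pass through unchanged.
encoded = pd.get_dummies(df, columns=low_card)
```

Note that, unlike the concat loop, this call drops the original A and B columns and keeps only their dummy columns alongside C.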

