Create Dummy Variables with Loop in Python
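I am trying to create a set of new binary variables for the columns whose names contain a certain word (and I want to name these new binary variables BINARY_ + column name). This is what I am trying, but it does not work: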
# create empty list
List_of_dummy_names = []
# word
string = "WORD"
for col in list(df.columns.values):
    if string in df.columns.values[col]:
        List_of_dummy_names.append('BINARY_'+col)
In your case, col looks like some kind of collection. You probably want this instead:
List_of_dummy_names.append('BINARY_'+string)
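For comparison, here is a minimal sketch of the loop from the question with the membership test applied to the column name itself. It assumes the column labels are plain strings, in which case iterating over df.columns yields each name directly and no further indexing is needed. The sample DataFrame and its column names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the column names matter here.
df = pd.DataFrame({"aWORD": [1, 2], "bWORD": [3, 4], "c": [5, 6]})

string = "WORD"
List_of_dummy_names = []
for col in df.columns:
    # col is already the column name (a str), so test it directly
    if string in col:
        List_of_dummy_names.append("BINARY_" + col)

print(List_of_dummy_names)
```

Note that the substring test is case-sensitive, so "WORD" would not match a column named "aword".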
If you want to rename the pandas DataFrame columns using the newly created List_of_dummy_names (whose elements are 'BINARY_' + column_name), you can follow my answer.
Let's say:
cv = list(df.columns.values)
#cv = ['aword', 'bword', 'c']
search_String = 'word'
replace_dict = dict(zip(cv,['BINARY_'+x if search_String in x else x for x in cv]))
#{'aword': 'BINARY_aword', 'bword': 'BINARY_bword', 'c': 'c'}
# Then pass this dictionary to the DataFrame rename method through the columns argument
new_df = df.rename(columns=replace_dict)
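The renaming step can be tried end to end with a small frame. This is only a sketch: the DataFrame contents are invented, and a dict comprehension is used in place of dict(zip(...)), which builds the same mapping.

```python
import pandas as pd

# Hypothetical frame using the column names from the example above.
df = pd.DataFrame({"aword": [1, 0], "bword": [0, 1], "c": [7, 8]})

search_String = "word"
# Map every old name to its new name; non-matching names map to themselves
replace_dict = {
    name: "BINARY_" + name if search_String in name else name
    for name in df.columns
}

# rename returns a new DataFrame; df itself keeps its original names
new_df = df.rename(columns=replace_dict)
print(list(new_df.columns))
```

Because rename returns a copy by default, the result has to be assigned to a variable (here new_df) for the new names to be kept.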
Also check whether the following works for you:
List_of_dummy_names = ['BINARY_'+x for x in cv if search_String in x ]
# ['BINARY_aword', 'BINARY_bword']  (keeps only the names containing 'word', each prefixed with 'BINARY_')
Check whether this is what you need (I am not sure what exactly you are looking for):
# df has only one column, named 'col_to_replace':
#   col_to_replace
#   aword
#   bword
#   c
df['col_to_replace'] = ['BINARY_'+x if search_String in x else x for x in df['col_to_replace']]
# Result:
#   col_to_replace
#   BINARY_aword   (prefixed)
#   BINARY_bword   (prefixed)
#   c              (word not found, so left as it was)
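The same value-level replacement can also be written without a Python-level list comprehension. The sketch below is an alternative, not the approach used in the answer: it relies on Series.str.contains to build a boolean mask and Series.mask to replace only the matching rows. The sample data is the three-row column from the example.

```python
import pandas as pd

df = pd.DataFrame({"col_to_replace": ["aword", "bword", "c"]})
search_String = "word"

# True for rows whose value contains the search string (literal match, no regex)
matches = df["col_to_replace"].str.contains(search_String, regex=False)

# Where matches is True, take the prefixed value; elsewhere keep the original
df["col_to_replace"] = df["col_to_replace"].mask(
    matches, "BINARY_" + df["col_to_replace"]
)

print(df["col_to_replace"].tolist())
```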
Now you have the new column names in a list.
List_of_dummy_names  # ['BINARY_aword', 'BINARY_bword']
# Loop over it and create the new columns in the existing DataFrame
for col_Name in List_of_dummy_names:
    # First iteration: creates column "BINARY_aword" with every row set to the string 'default_value_1'.
    # Second iteration: creates column "BINARY_bword" in the same way.
    df[col_Name] = 'default_value_1'
If you already have the values in a list with len(list) == len(df), assign that list instead: df[col_Name] = list_of_values_sharing_same_length_as_DF
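The difference between assigning a scalar and assigning a list can be sketched as follows. The frame and the flag values are hypothetical; the point is only that a scalar is broadcast to every row, while a list has to match the number of rows.

```python
import pandas as pd

# Hypothetical three-row frame.
df = pd.DataFrame({"aword": [10, 20, 30], "c": [1, 2, 3]})

# A scalar is broadcast to every row of the new column
df["BINARY_aword"] = 0

# A list must have exactly len(df) elements; values are assigned row by row
flags = [1, 0, 1]  # made-up values for illustration
df["BINARY_aword"] = flags

# A list of the wrong length is rejected with a ValueError
length_error = None
try:
    df["BINARY_bad"] = [1, 0]
except ValueError as exc:
    length_error = exc

print(df["BINARY_aword"].tolist())
```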