I'm trying to map values (Weight of Evidence) from one dataset to another.
My original dataset looks as follows:
The aim is to replace the values in these rows with the corresponding Weight of Evidence values.
I could do it manually, one value at a time, for example:
df.loc[df['loan_type'] == 1, 'loan_type'] = 0.008241
I'm looking for a better way to do it (not manually), similar to this post: Python Dataframe: Update the values of a column in a dataframe based on another dataframe
This should do the trick:
from collections import defaultdict

import pandas as pd

# Sample data: each column holds the raw modality of a variable.
df = pd.DataFrame({'loan_type': [1, 2, 1, 3],
                   'var_1': [1, 0, 1, 2],
                   'var_2': [1, 1, 1, 1],
                   'var_3': [1, 1, 1, 1],
                   'var_4': [1, 0, 1, 0],
                   'target': [0, 0, 0, 0]})

# Lookup table: one row per (variable, modality) pair with its WoE.
woe_df = pd.DataFrame({'variable': ['loan_type', 'loan_type', 'loan_type', 'var_1', 'var_1', 'var_1', 'var_2', 'var_3', 'var_4', 'var_4'],
                       'modality': [1, 2, 3, 0, 1, 2, 1, 1, 0, 1],
                       'woe': [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1]})

# Build {variable: {modality: woe}}, then map each column through its dict.
woe_mapping = defaultdict(dict)
for variable in woe_df.variable.unique():
    var_woes = woe_df[woe_df.variable == variable]
    for modality, woe in zip(var_woes.modality, var_woes.woe):
        woe_mapping[variable][modality] = woe
    df[variable] = df[variable].map(woe_mapping[variable])
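A more compact variant of the same idea, as a sketch rather than a definitive implementation: group the lookup table by variable and pass each group to `Series.map` as a Series indexed by modality, which avoids building the nested dict by hand. The name `woe_encoded` is my own and is not from the question; the sample data repeats the frames used above.

```python
import pandas as pd

# Same sample frames as in the answer above.
df = pd.DataFrame({'loan_type': [1, 2, 1, 3],
                   'var_1': [1, 0, 1, 2],
                   'var_2': [1, 1, 1, 1],
                   'var_3': [1, 1, 1, 1],
                   'var_4': [1, 0, 1, 0],
                   'target': [0, 0, 0, 0]})
woe_df = pd.DataFrame({
    'variable': ['loan_type', 'loan_type', 'loan_type', 'var_1', 'var_1',
                 'var_1', 'var_2', 'var_3', 'var_4', 'var_4'],
    'modality': [1, 2, 3, 0, 1, 2, 1, 1, 0, 1],
    'woe': [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1]})

# Work on a copy (hypothetical name) so the raw modalities stay available.
woe_encoded = df.copy()
for variable, group in woe_df.groupby('variable'):
    # Series indexed by modality; Series.map looks each value up in this index.
    mapping = group.set_index('modality')['woe']
    woe_encoded[variable] = df[variable].map(mapping)

print(woe_encoded)
```

Columns that never appear in `woe_df.variable` (here `target`) are left untouched. Any modality that is missing from the lookup table comes out as NaN, so a quick `woe_encoded.isna().sum()` after mapping is a cheap way to spot gaps in the table.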