pandas get_dummies with identical column names
I have:
In [122]: d=pandas.DataFrame({'d_1':['a','x'],'d_2':['x','y']})
In [123]: d
Out[123]:
d_1 d_2
0 a x
1 x y
I want:
a x y
0 1 1 0
1 0 1 1
I do not want to use
In [139]: pandas.get_dummies(d)
Out[139]:
d_1_a d_1_x d_2_x d_2_y
0 1.0 0.0 1.0 0.0
1 0.0 1.0 0.0 1.0
because this function treats d_1_x and d_2_x as distinct columns, which requires too much memory for my application.
I do, however, want to use get_dummies because it is fast, so I tried renaming the columns and applying get_dummies again, but that does not produce the output I want:
In [124]: d.columns=['d' for el in d.columns]
In [141]: d
Out[141]:
d d
0 a x
1 x y
In [151]: pandas.get_dummies(d)
Out[151]:
d_('d',) d_('d',)
0 1.0 1.0
1 1.0 1.0
You can try something like this:
import pandas as pd
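# Build one indicator Series per row, indexed by that row's values;
# apply(axis=1) aligns them into shared columns, fillna(0) fills the gaps.
d.apply(lambda row: pd.Series(1, index=row), axis=1).fillna(0)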
# a x y
#0 1.0 1.0 0.0
#1 0.0 1.0 1.0
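Another option, closer to the renaming idea in the question, is to let get_dummies emit duplicate column names and then merge them. This is only a minimal sketch, assuming the sample frame from the question and a pandas version where get_dummies accepts the dtype argument. The names wide and result are just illustrative. Note that it still builds the wide intermediate frame before collapsing it, so it may not lower peak memory.

```python
import pandas as pd

d = pd.DataFrame({'d_1': ['a', 'x'], 'd_2': ['x', 'y']})

# An empty prefix and separator make each dummy column carry only the value,
# so 'x' from d_1 and 'x' from d_2 end up with the same column name.
wide = pd.get_dummies(d, prefix='', prefix_sep='', dtype=int)

# Transpose, merge rows that share a label with max(), and transpose back.
result = wide.T.groupby(level=0).max().T
print(result)
```

Taking the max over same-named columns marks a value as present if it appears in any of the original columns of that row.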