Creating new binary columns from single string column in pandas
I've seen this before and simply can't remember the function.
Say I have a column "Speed" and each row has one of these values:
'Slow', 'Normal', 'Fast'
How do I create a new dataframe that keeps all my rows but replaces the "Speed" column with three columns, "Slow", "Normal" and "Fast", where each row has a 1 in the column matching its old "Speed" value and a 0 in the other two? So if I had:
print(df['Speed'].iloc[0])
> 'Normal'
I would now expect this:
print(df['Normal'].iloc[0])
> 1
print(df['Slow'].iloc[0])
> 0
You can do this easily with pd.get_dummies (docs):
In [37]: df = pd.DataFrame(['Slow', 'Normal', 'Fast', 'Slow'], columns=['Speed'])

In [38]: df
Out[38]:
    Speed
0    Slow
1  Normal
2    Fast
3    Slow

In [39]: pd.get_dummies(df['Speed'])
Out[39]:
   Fast  Normal  Slow
0     0       0     1
1     0       1     0
2     1       0     0
3     0       0     1
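The question also asks for a new dataframe that keeps the other columns and loses "Speed". One way to get there, as a rough sketch: drop the original column and join the dummy columns back on the index. The "Id" column below is a made-up extra column just for illustration, and dtype=int is passed because newer pandas versions return booleans from get_dummies by default rather than 0/1.

```python
import pandas as pd

# Toy frame; "Id" is a hypothetical extra column, not from the question.
df = pd.DataFrame({
    "Id": [10, 11, 12, 13],
    "Speed": ["Slow", "Normal", "Fast", "Slow"],
})

# One 0/1 column per distinct Speed value (columns come out sorted by name).
dummies = pd.get_dummies(df["Speed"], dtype=int)

# Drop the original string column and attach the indicator columns by index.
encoded = df.drop(columns="Speed").join(dummies)
print(encoded)
```

The original df is left untouched here, so you can compare the two side by side if needed.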
Here is one solution:
df['Normal'] = df.Speed.apply(lambda x: 1 if x == "Normal" else 0)
df['Slow'] = df.Speed.apply(lambda x: 1 if x == "Slow" else 0)
df['Fast'] = df.Speed.apply(lambda x: 1 if x == "Fast" else 0)
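If you would rather not write one line per value, the same idea can be sketched as a loop over the distinct values in the column. This is only a variation on the approach above, not something from the original answer: it compares the whole column at once instead of calling a lambda per row, and astype(int) turns True/False into 1/0.

```python
import pandas as pd

df = pd.DataFrame(["Slow", "Normal", "Fast", "Slow"], columns=["Speed"])

# Build one indicator column for each distinct value found in "Speed".
for value in df["Speed"].unique():
    # The comparison yields a boolean Series; astype(int) maps it to 1/0.
    df[value] = (df["Speed"] == value).astype(int)

print(df)
```

This keeps working if a fourth speed value shows up later, since the column names are read from the data.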
Here is another method:
import numpy as np
import pandas as pd

df = pd.DataFrame(['Slow', 'Fast', 'Normal', 'Normal'], columns=['Speed'])
# np.where(condition, 1, 0) gives 1 where the condition holds and 0 elsewhere
df['Normal'] = np.where(df['Speed'] == 'Normal', 1, 0)
df['Fast'] = np.where(df['Speed'] == 'Fast', 1, 0)
df['Slow'] = np.where(df['Speed'] == 'Slow', 1, 0)
df

    Speed  Normal  Fast  Slow
0    Slow       0     0     1
1    Fast       0     1     0
2  Normal       1     0     0
3  Normal       1     0     0
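With hand-built columns like these it is easy to mistype a label, so a quick sanity check may be worth having. The sketch below rebuilds the same frame, then checks that every row has exactly one 1 and that the result agrees with pd.get_dummies. The labels list and the row_sums name are just illustrative choices.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(["Slow", "Fast", "Normal", "Normal"], columns=["Speed"])
labels = ["Normal", "Fast", "Slow"]
for label in labels:
    df[label] = np.where(df["Speed"] == label, 1, 0)

# Each row should carry exactly one 1 across the indicator columns.
row_sums = df[labels].sum(axis=1)
assert (row_sums == 1).all()

# The manual columns should match what pd.get_dummies produces.
dummies = pd.get_dummies(df["Speed"], dtype=int)
assert (df[labels] == dummies[labels]).all().all()
```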