简体   繁体   English

Pandas:将列拆分为具有唯一值的多列

[英]Pandas: split column into multiple columns with unique values

Say I have the following dataframe:假设我有以下数据框:

   A
0  Me
1  Myself
2  and
3  Irene
4  Me, Myself, and Irene

which needs to be turned into:这需要变成:

   Me  Myself  and  Irene
0  1   0       0    0
1  0   1       0    0
2  0   0       1    0
3  0   0       0    1
4  1   1       1    1

Looking for any suggestion.寻找任何建议。

You can use get_dummies with reindex by all possible categories:您可以通过所有可能的类别使用get_dummiesreindex

df1 = pd.DataFrame({'A': ['Me', 'Myself', 'and', 'Irene']})
df2= pd.DataFrame({'A': ['Me', 'Myself', 'and']})
df3 = pd.DataFrame({'A': ['Me', 'Myself', 'or', 'Irene']})

all_categories = pd.concat([df1.A, df2.A, df3.A]).unique()
print (all_categories)
['Me' 'Myself' 'and' 'Irene' 'or']

df1 = pd.get_dummies(df1.A).reindex(columns=all_categories, fill_value=0)
print(df1)
   Me  Myself  and  Irene  or
0   1       0    0      0   0
1   0       1    0      0   0
2   0       0    1      0   0
3   0       0    0      1   0

df2 = pd.get_dummies(df2.A).reindex(columns=all_categories, fill_value=0)
print(df2)
   Me  Myself  and  Irene  or
0   1       0    0      0   0
1   0       1    0      0   0
2   0       0    1      0   0

df3 = pd.get_dummies(df3.A).reindex(columns=all_categories, fill_value=0)
print(df3)
   Me  Myself  and  Irene  or
0   1       0    0      0   0
1   0       1    0      0   0
2   0       0    0      0   1
3   0       0    0      1   0

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM