I'm new to pandas and I have a DataFrame of this kind :
name value
0 alpha a
1 beta b
2 gamma c
3 alpha a
4 beta b
5 beta a
6 gamma a
7 alpha c
which I would like to turn into one of this kind :
name a b c
0 alpha 2 0 1
1 beta 1 2 0
2 gamma 1 0 1
That is to say I would like to group by "name" and "value", then count them, and create a column for each value of "value" I find.
It is just a cross tabulation:
In [78]:
print pd.crosstab(df.name, df.value)
value a b c
name
alpha 2 0 1
beta 1 2 0
gamma 1 0 1
If you use groupby
:
In [90]:
print df.groupby(['name', 'value']).agg(len).unstack().fillna(0)
value a b c
name
alpha 2 0 1
beta 1 2 0
gamma 1 0 1
The latter might be faster:
In [92]:
%timeit df.groupby(['name', 'value']).agg(len).unstack().fillna(0)
100 loops, best of 3: 3.26 ms per loop
In [93]:
%timeit pd.crosstab(df.name, df.value)
100 loops, best of 3: 7.5 ms per loop
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.