简体   繁体   中英

pivot table(?) with a Pandas Dataframe

I have a DataFrame that is something similar to this

id    name    value
a     Adam    5
b     Eve     6
c     Adam    4
a     Eve     3
d     Seth    2
b     Adam    4
a     Adam    2

I am trying to see how many id s are associated with how many names and the overlap between them. I did a groupby on the id column and then I could see how many id's have how many names associated with them.

df.groupby('id')['name'].nunique().value_counts()

What I would now like is a way to get a table where the names are the column names, and index is the id, and the value is the sum for each id and name. I could do it for a for loop, by initializing a DataFrame where the columns are the values in the name column but I am wondering if there is a pandas way of accomplishing something like this?

is that what you want?

In [54]: df.pivot_table(index='id', columns='name', values='value', aggfunc='sum')
Out[54]:
name  Adam  Eve  Seth
id
a      7.0  3.0   NaN
b      4.0  6.0   NaN
c      4.0  NaN   NaN
d      NaN  NaN   2.0

or without NaN's:

In [56]: df.pivot_table(index='id', columns='name', values='value', aggfunc='sum', fill_value=0)
Out[56]:
name  Adam  Eve  Seth
id
a        7    3     0
b        4    6     0
c        4    0     0
d        0    0     2

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM