Pivot table(?) with a Pandas DataFrame
I have a DataFrame that looks something like this:
id name value
a Adam 5
b Eve 6
c Adam 4
a Eve 3
d Seth 2
b Adam 4
a Adam 2
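To make the example reproducible, the sample frame above could be built as follows. This is only a sketch: the variable name `df` is chosen to match the code used later in the post, and the values are copied from the table.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "id": ["a", "b", "c", "a", "d", "b", "a"],
    "name": ["Adam", "Eve", "Adam", "Eve", "Seth", "Adam", "Adam"],
    "value": [5, 6, 4, 3, 2, 4, 2],
})

print(df)
```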
I am trying to see how many ids are associated with how many names, and the overlap between them. I did a groupby on the id column, which shows how many ids have how many names associated with them:
df.groupby('id')['name'].nunique().value_counts()
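Splitting that one-liner into two steps may make it easier to follow. In this sketch the intermediate names `names_per_id` and `counts` are illustrative and the frame is rebuilt from the sample data. For that data, ids a and b each have two distinct names, while c and d have one.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "id": ["a", "b", "c", "a", "d", "b", "a"],
    "name": ["Adam", "Eve", "Adam", "Eve", "Seth", "Adam", "Adam"],
    "value": [5, 6, 4, 3, 2, 4, 2],
})

# Step 1: number of distinct names seen for each id.
names_per_id = df.groupby("id")["name"].nunique()

# Step 2: how many ids share each distinct-name count.
counts = names_per_id.value_counts()

print(names_per_id)
print(counts)
```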
What I would now like is a table where the columns are the names, the index is the id, and each cell holds the sum of value for that id and name. I could do it with a for loop, by initializing a DataFrame whose columns are the values in the name column, but I am wondering if there is a pandas way of accomplishing this?
Is that what you want?
In [54]: df.pivot_table(index='id', columns='name', values='value', aggfunc='sum')
Out[54]:
name Adam Eve Seth
id
a 7.0 3.0 NaN
b 4.0 6.0 NaN
c 4.0 NaN NaN
d NaN NaN 2.0
Or without NaNs:
In [56]: df.pivot_table(index='id', columns='name', values='value', aggfunc='sum', fill_value=0)
Out[56]:
name Adam Eve Seth
id
a 7 3 0
b 4 6 0
c 4 0 0
d 0 0 2
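As an alternative that is not part of the answer above, the same table can probably be obtained with a groupby followed by `unstack`. This is a sketch on the sample data; the names `pivot` and `alt` are illustrative.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "id": ["a", "b", "c", "a", "d", "b", "a"],
    "name": ["Adam", "Eve", "Adam", "Eve", "Seth", "Adam", "Adam"],
    "value": [5, 6, 4, 3, 2, 4, 2],
})

# The approach from the answer: pivot with missing combinations filled with 0.
pivot = df.pivot_table(index="id", columns="name", values="value",
                       aggfunc="sum", fill_value=0)

# Alternative: sum per (id, name) pair, then move the name level into columns.
alt = df.groupby(["id", "name"])["value"].sum().unstack(fill_value=0)

print(alt)
```

Both forms group by id and name and sum value, so which one to use is mostly a matter of taste. `pivot_table` states the intent in a single call, while the groupby form makes the intermediate per-pair sums available if they are needed.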