Simple pivot table of a pandas DataFrame
I'm trying to do a seemingly very simple task. Given a dataframe:
import pandas as pd
daf = pd.DataFrame({'co':['g','r','b','r','g','r','b','g'], 'sh':['c','s','r','r','r','s','c','r']})
  co sh
0  g  c
1  r  s
2  b  r
3  r  r
4  g  r
5  r  s
6  b  c
7  g  r
I'd like to count the number of records for each unique combination of 'co' and 'sh' values and output the result as a table with rows ['g','r','b'] and columns ['c','s','r']:
   c  s  r
g  1  0  2
r  0  2  1
b  1  0  1
Can it be done using pivot_table?
Thank you,
It can be done more simply using pandas.crosstab:
>>> pd.crosstab(daf.co, daf.sh)
sh  c  r  s
co
b   1  1  0
g   1  2  0
r   0  1  2
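crosstab sorts the row and column labels, so the result above is not in the order the question asked for. A minimal sketch of one way to get that order, assuming the labels ['g','r','b'] and ['c','s','r'] from the question (the names `counts` and `ordered` are only illustrative):

```python
import pandas as pd

# Data from the question
daf = pd.DataFrame({'co': ['g', 'r', 'b', 'r', 'g', 'r', 'b', 'g'],
                    'sh': ['c', 's', 'r', 'r', 'r', 's', 'c', 'r']})

# Count records for every (co, sh) combination; missing combos become 0
counts = pd.crosstab(daf['co'], daf['sh'])

# Reorder rows and columns to the order requested in the question;
# fill_value=0 covers any label that never occurs in the data
ordered = counts.reindex(index=['g', 'r', 'b'],
                         columns=['c', 's', 'r'],
                         fill_value=0)
print(ordered)
```

reindex only rearranges the existing counts, so the numbers are the same as in the crosstab output.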
You can do it with pivot_table, but it will give you NaN instead of 0 for missing combinations. You need to specify len as the aggregating function:
>>> daf.pivot_table(index='co', columns='sh', aggfunc=len)
sh    c  r    s
co
b     1  1  NaN
g     1  2  NaN
r   NaN  1    2
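If zeros are preferred over NaN, pivot_table accepts a fill_value argument. The following is a sketch rather than part of the original answer: it assumes aggfunc='size', which counts the rows in each group without needing a separate value column, and the name `table` is only illustrative.

```python
import pandas as pd

# Data from the question
daf = pd.DataFrame({'co': ['g', 'r', 'b', 'r', 'g', 'r', 'b', 'g'],
                    'sh': ['c', 's', 'r', 'r', 'r', 's', 'c', 'r']})

# Count rows per (co, sh) pair; fill_value=0 replaces the NaN
# that would otherwise appear for combinations with no records
table = daf.pivot_table(index='co', columns='sh',
                        aggfunc='size', fill_value=0)
print(table)
```

The equivalent groupby form, daf.groupby(['co', 'sh']).size().unstack(fill_value=0), should give the same counts.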