
Simple pivot table of pandas dataframe

I'm trying to do a seemingly very simple task. Given a dataframe:

import pandas as pd
daf = pd.DataFrame({'co':['g','r','b','r','g','r','b','g'], 'sh':['c','s','r','r','r','s','c','r']})

  co sh
0  g  c
1  r  s
2  b  r
3  r  r
4  g  r
5  r  s
6  b  c
7  g  r

I'd like to count the number of records for each unique combination of 'co' and 'sh' values and output the result as a table with rows ['g','r','b'] and columns ['c','s','r']:

   c  s  r
g  1  0  2
r  0  2  1
b  1  0  1

Can it be done using pivot_table?

Thank you,

It can be done more simply using pandas.crosstab:

>>> pd.crosstab(daf.co, daf.sh)
sh  c  r  s
co         
b   1  1  0
g   1  2  0
r   0  1  2
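crosstab sorts the row and column labels alphabetically, which is why the output above does not match the order requested in the question. A minimal sketch of one way to get that order, assuming the question's `daf` frame and using `reindex` (the names `counts` and `ordered` are only illustrative):

```python
import pandas as pd

# Same data as in the question.
daf = pd.DataFrame({
    'co': ['g', 'r', 'b', 'r', 'g', 'r', 'b', 'g'],
    'sh': ['c', 's', 'r', 'r', 'r', 's', 'c', 'r'],
})

# Count each (co, sh) pair; combinations that never occur come out as 0.
counts = pd.crosstab(daf['co'], daf['sh'])

# Put rows and columns into the order asked for in the question.
ordered = counts.reindex(index=['g', 'r', 'b'], columns=['c', 's', 'r'])
print(ordered)
```

Because every requested label already exists in the crosstab result, the reindex only reorders and should not introduce any missing values.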

You can do it with pivot_table, but it will give you NaN instead of 0 for missing combinations. You need to specify len as the aggregating function:

>>> daf.pivot_table(index='co', columns='sh', aggfunc=len)
sh   c  r   s
co           
b    1  1 NaN
g    1  2 NaN
r  NaN  1   2
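If zeros are wanted from pivot_table as well, its `fill_value` parameter can replace the missing entries. The sketch below is a variation on the call above, not the answer's exact method: it assumes a reasonably recent pandas and swaps `len` for `aggfunc='size'`, which counts rows per group without needing a separate values column (the name `table` is illustrative).

```python
import pandas as pd

# Same data as in the question.
daf = pd.DataFrame({
    'co': ['g', 'r', 'b', 'r', 'g', 'r', 'b', 'g'],
    'sh': ['c', 's', 'r', 'r', 'r', 's', 'c', 'r'],
})

# 'size' counts the rows in each (co, sh) group;
# fill_value=0 replaces combinations that never occur.
table = daf.pivot_table(index='co', columns='sh', aggfunc='size', fill_value=0)
print(table)
```

The result should hold the same counts as the crosstab version, again with labels in alphabetical order.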


 