Simple pivot table for a pandas DataFrame

Question

I'm trying to do a seemingly very simple task. Given a dataframe:

import pandas as pd

daf = pd.DataFrame({'co': ['g', 'r', 'b', 'r', 'g', 'r', 'b', 'g'],
                    'sh': ['c', 's', 'r', 'r', 'r', 's', 'c', 'r']})

  co sh
0  g  c
1  r  s
2  b  r
3  r  r
4  g  r
5  r  s
6  b  c
7  g  r

I'd like to count the number of records for each unique combination of 'co' and 'sh' values and output the result as a table with rows ['g','r','b'] and columns ['c','s','r']:

   c  s  r
g  1  0  2
r  0  2  1
b  1  0  1

Can it be done using pivot_table?

Thank you,

Answer 1

It can be done more simply using pandas.crosstab:

>>> pd.crosstab(daf.co, daf.sh)
sh  c  r  s
co         
b   1  1  0
g   1  2  0
r   0  1  2
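
crosstab sorts the row and column labels alphabetically, so the result does not come back in the order the question asks for. One way to get that layout is to reindex the result. This is a minimal sketch, assuming the `daf` frame from the question; the names `counts` and `ordered` are introduced here for illustration only.

```python
import pandas as pd

daf = pd.DataFrame({'co': ['g', 'r', 'b', 'r', 'g', 'r', 'b', 'g'],
                    'sh': ['c', 's', 'r', 'r', 'r', 's', 'c', 'r']})

# Count every (co, sh) combination; combinations that never occur become 0.
counts = pd.crosstab(daf['co'], daf['sh'])

# Reorder rows and columns to the layout requested in the question.
# fill_value=0 only matters if a requested label is absent from the data.
ordered = counts.reindex(index=['g', 'r', 'b'],
                         columns=['c', 's', 'r'],
                         fill_value=0)
print(ordered)
```

Reindexing after the fact keeps the counting step unchanged and makes the desired ordering explicit in one place.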

You can do it with pivot_table, but it will give you NaN instead of 0 for missing combinations. You need to specify len as the aggregating function:

>>> daf.pivot_table(index='co', columns='sh', aggfunc=len)
sh   c  r   s
co           
b    1  1 NaN
g    1  2 NaN
r  NaN  1   2
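
If zeros are preferred over NaN, pivot_table accepts a `fill_value` argument. The sketch below is a variation rather than the answer's exact call: it assumes the `daf` frame from the question and swaps `len` for the string aggregator `'size'`, which counts rows per group without needing a separate values column. A `groupby` form is shown for comparison. The names `pt` and `gb` are illustrative.

```python
import pandas as pd

daf = pd.DataFrame({'co': ['g', 'r', 'b', 'r', 'g', 'r', 'b', 'g'],
                    'sh': ['c', 's', 'r', 'r', 'r', 's', 'c', 'r']})

# Count rows per (co, sh) pair and fill combinations that never occur with 0.
pt = daf.pivot_table(index='co', columns='sh', aggfunc='size', fill_value=0)

# Equivalent formulation: group, count, then pivot the 'sh' level into columns.
gb = daf.groupby(['co', 'sh']).size().unstack(fill_value=0)

print(pt)
print(gb)
```

Both forms should give the same counts as crosstab; which one to use is mostly a matter of taste.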

Answer 1
5 2015-01-02 06:45:08
