
Sorting pivot table (multi index)

I'm trying to sort a pivot table's values in descending order after putting two "row labels" (Excel term) on the pivot.

Sample data:

import numpy as np
import pandas as pd

x = pd.DataFrame({'col1':['a','a','b','c','c', 'a','b','c', 'a','b','c'],
                  'col2':[  1,  1,  1,  1,  1,   2,  2,  2,   3,  3,  3],
                  'col3':[  1,.67,0.5,  2,.65, .75,2.25,2.5, .5,  2,2.75]})
print(x)
   col1  col2  col3
0     a     1  1.00
1     a     1  0.67
2     b     1  0.50
3     c     1  2.00
4     c     1  0.65
5     a     2  0.75
6     b     2  2.25
7     c     2  2.50
8     a     3  0.50
9     b     3  2.00
10    c     3  2.75

To create the pivot, I'm using the following function:

pt = pd.pivot_table(x, index = ['col1', 'col2'], values = 'col3', aggfunc = np.sum)
print(pt)
           col3
col1 col2      
a    1     1.67
     2     0.75
     3     0.50
b    1     0.50
     2     2.25
     3     2.00
c    1     2.65
     2     2.50
     3     2.75

In other words, pt is sorted first by col1, then by col2 within each col1 group, so the col3 values end up in no particular order. This is great, but I would like to sort by col3 (the values) while keeping the col1 groups intact; the col2 entries can be shuffled into any order.

The target output would look something like this (col3 in descending order within each col1 group, with col2 in any order):

                   col3
    col1   col2    
     a       1     1.67
             2     0.75
             3     0.50

     b       2     2.25
             3     2.00 
             1     0.50

     c       3     2.75
             1     2.65
             2     2.50 

I have tried the code below, but it sorts the values across the entire pivot table and loses the grouping (I'm looking for sorting within each group).

    pt.sort_values(by = 'col3', ascending = False)

For guidance, a similar question was asked (and answered) here, but I was unable to get a successful output with the provided answer:

Pandas: Sort pivot table

The error I get from that answer is ValueError: all keys need to be the same shape

You need reset_index to turn the MultiIndex levels back into columns, then sort_values by col1 (ascending) and col3 (descending), and finally set_index to rebuild the MultiIndex:

# pt is the pivot table from the question; the parentheses allow
# the method chain to span several lines
df = (pt.reset_index()
        .sort_values(['col1', 'col3'], ascending=[True, False])
        .set_index(['col1', 'col2']))

print(df)
           col3
col1 col2      
a    1     1.67
     2     0.75
     3     0.50
b    2     2.25
     3     2.00
     1     0.50
c    3     2.75
     1     2.65
     2     2.50
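
As a possible shortcut, reasonably recent pandas versions let sort_values take index level names alongside column labels in `by`, so the reset_index / set_index round trip may not be needed. The sketch below rebuilds the question's data so it runs on its own; the name `result` is only illustrative, and `aggfunc='sum'` is used here in place of `np.sum` on the assumption that the string form behaves the same for this data.

```python
import pandas as pd

# Same sample data as in the question
x = pd.DataFrame({'col1': ['a', 'a', 'b', 'c', 'c', 'a', 'b', 'c', 'a', 'b', 'c'],
                  'col2': [1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3],
                  'col3': [1, .67, 0.5, 2, .65, .75, 2.25, 2.5, .5, 2, 2.75]})

pt = pd.pivot_table(x, index=['col1', 'col2'], values='col3', aggfunc='sum')

# 'col1' is an index level and 'col3' is a column; both can be named in `by`.
# col1 ascending keeps the groups together, col3 descending sorts within them.
result = pt.sort_values(['col1', 'col3'], ascending=[True, False])
print(result)
```

If this raises a KeyError for 'col1' on an older pandas, the reset_index version above is the safer route.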
