
pandas hierarchical group by multiple columns

I want to group by the columns 'Number3' and 'Event' and get the desired result shown below. Note that the first column is the index. I would like to save the result into a new DataFrame.

     Number1 Event        Number2  Number3
0      20    clouds        30        404
1      22    lightening    32        404
2      23    playing       33        405
3      25    clouds        35        410
4      24    sleeping      34        407
5      26    lightening    36        410
6      21    rain          31        404
7      27    rain          37        410


Desired result:

Number3     Event          Number1   Number2
   404   0  clouds          20         30
         1  lightening      22         32
         6  rain            21         31
   405   2  playing         23         33
   410   3  clouds          25         35
         5  lightening      26         36
         7  rain            27         37
   407   4  sleeping        24         34

You need set_index:

df1 = df.set_index(['Number3', 'Event'])
print (df1)
                    Number1  Number2
Number3 Event                       
404     clouds           20       30
        lightening       21       31
        rain             22       32
405     playing          23       33
410     sun              24       34
420     clouds           25       35
        lightening       26       36
        rain             27       37
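
For readers who want to reproduce this step, here is a minimal sketch. It assumes the frame is named `df`, as in the answer, and rebuilds it from the table in the question as it currently stands, so the values differ slightly from the printed output above. It also shows that set_index returns a new DataFrame, which covers the request to save the result separately.

```python
import pandas as pd

# Data copied from the table in the question; `df` is the name the answer uses.
df = pd.DataFrame({
    "Number1": [20, 22, 23, 25, 24, 26, 21, 27],
    "Event": ["clouds", "lightening", "playing", "clouds",
              "sleeping", "lightening", "rain", "rain"],
    "Number2": [30, 32, 33, 35, 34, 36, 31, 37],
    "Number3": [404, 404, 405, 410, 407, 410, 404, 410],
})

# set_index returns a new DataFrame; `df` itself is left unchanged.
df1 = df.set_index(["Number3", "Event"])

print(df1.index.names)       # names of the two index levels
print(df1.columns.tolist())  # the remaining data columns
```

Note that set_index only moves the columns into the index; it does not reorder rows, so equal 'Number3' values are not necessarily adjacent at this point.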

If you also need to keep the old index, add the parameter append=True and then call swaplevel:

df1 = df.set_index(['Number3', 'Event'], append=True).swaplevel(0,1)
print (df1)
                      Number1  Number2
Number3   Event                       
404     0 clouds           20       30
        1 lightening       21       31
        2 rain             22       32
405     3 playing          23       33
410     4 sun              24       34
420     5 clouds           25       35
        6 lightening       26       36
        7 rain             27       37
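
The level order produced by append=True may be easier to see in a small sketch. It again assumes the question's current data in a frame named `df`. With append=True the existing (unnamed) index stays as level 0 and the new columns become levels 1 and 2; swaplevel(0, 1) then exchanges the old index with 'Number3'.

```python
import pandas as pd

# Data copied from the table in the question; `df` is the name the answer uses.
df = pd.DataFrame({
    "Number1": [20, 22, 23, 25, 24, 26, 21, 27],
    "Event": ["clouds", "lightening", "playing", "clouds",
              "sleeping", "lightening", "rain", "rain"],
    "Number2": [30, 32, 33, 35, 34, 36, 31, 37],
    "Number3": [404, 404, 405, 410, 407, 410, 404, 410],
})

# Keep the old index as level 0 and append the two columns as further levels.
appended = df.set_index(["Number3", "Event"], append=True)
print(list(appended.index.names))  # old index is unnamed, so its name is None

# Swap levels 0 and 1 so that 'Number3' becomes the outermost level.
df1 = appended.swaplevel(0, 1)
print(list(df1.index.names))
```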

EDIT, to match the edited question:

Add sort_index:

# The parentheses let the method chain span several lines
df1 = (df.set_index(['Number3', 'Event'], append=True)
         .swaplevel(0,1)
         .sort_index(level='Number3'))
print (df1)
                      Number1  Number2
Number3   Event                       
404     0 clouds           20       30
        1 lightening       22       32
        6 rain             21       31
405     2 playing          23       33
407     4 sleeping         24       34
410     3 clouds           25       35
        5 lightening       26       36
        7 rain             27       37
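
As a self-contained check of the sorted result, the sketch below runs the same chain on the question's data (the frame name `df` is taken from the answer) and then selects one 'Number3' group with .loc. The .loc lookup is an illustration added here, not part of the original answer.

```python
import pandas as pd

# Data copied from the table in the question; `df` is the name the answer uses.
df = pd.DataFrame({
    "Number1": [20, 22, 23, 25, 24, 26, 21, 27],
    "Event": ["clouds", "lightening", "playing", "clouds",
              "sleeping", "lightening", "rain", "rain"],
    "Number2": [30, 32, 33, 35, 34, 36, 31, 37],
    "Number3": [404, 404, 405, 410, 407, 410, 404, 410],
})

# Wrapping the chain in parentheses allows it to span several lines.
df1 = (
    df.set_index(["Number3", "Event"], append=True)
      .swaplevel(0, 1)
      .sort_index(level="Number3")
)
print(df1)

# Illustration only: select all rows whose outer level 'Number3' equals 404.
group_404 = df1.loc[404]
print(group_404)
```

Because sort_index also sorts the remaining levels by default, rows within each 'Number3' group end up ordered by the old index.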


 