
Accessing columns with MultiIndex after using pandas groupby and aggregate

I am using the df.groupby() method:

g1 = df[['md', 'agd', 'hgd']].groupby(['md']).agg(['mean', 'count', 'std'])

It produces exactly what I want!

         agd                       hgd                
        mean count       std      mean count       std
md                                                    
-4  1.398350     2  0.456494 -0.418442     2  0.774611
-3 -0.281814    10  1.314223 -0.317675    10  1.161368
-2 -0.341940    38  0.882749  0.136395    38  1.240308
-1 -0.137268   125  1.162081 -0.103710   125  1.208362
 0 -0.018731   603  1.108109 -0.059108   603  1.252989
 1 -0.034113   178  1.128363 -0.042781   178  1.197477
 2  0.118068    43  1.107974  0.383795    43  1.225388
 3  0.452802    18  0.805491 -0.335087    18  1.120520
 4  0.304824     1       NaN -1.052011     1       NaN

However, I now want to access the columns of the resulting DataFrame (which has MultiIndex columns) as I would those of a "normal" DataFrame.

I will then be able to: 1) calculate the errors on the agd and hgd means, and 2) make scatter plots of md (x axis) vs. agd mean (and hgd mean) with appropriate error bars added.

Is this possible? Perhaps by playing with the indexing?

1) You can rename the columns and proceed as normal (this gets rid of the multi-indexing). The list must name all six columns, in order:

g1.columns = ['agd_mean', 'agd_count', 'agd_std', 'hgd_mean', 'hgd_count', 'hgd_std']
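If typing the names by hand feels error-prone, the flat names can also be built from the existing column tuples. This is a minimal sketch; the small random df below is a hypothetical stand-in, since the original data is not shown in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data; the question's real df is not shown.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "md": rng.integers(-2, 3, size=200),
    "agd": rng.normal(size=200),
    "hgd": rng.normal(size=200),
})

g1 = df[["md", "agd", "hgd"]].groupby("md").agg(["mean", "count", "std"])

# Each column label is a tuple such as ('agd', 'mean');
# join the two levels with an underscore to get flat names.
g1.columns = ["_".join(col) for col in g1.columns]

print(list(g1.columns))
```

After this, g1['agd_mean'] and g1['agd_std'] behave like ordinary columns.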

2) You can keep the multi-indexing and select on both levels in turn (see the pandas docs on MultiIndex):

g1['agd']['mean']
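As a rough sketch of how this helps with the errors asked about in the question: the example below assumes "error on the mean" means the standard error, std / sqrt(count), and uses a hypothetical random df because the original data is not shown.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data; the question's real df is not shown.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "md": rng.integers(-2, 3, size=200),
    "agd": rng.normal(size=200),
    "hgd": rng.normal(size=200),
})

g1 = df[["md", "agd", "hgd"]].groupby("md").agg(["mean", "count", "std"])

# A tuple selects one column across both levels; the result is a Series
# indexed by md.
agd_mean = g1[("agd", "mean")]

# Assumption: the "error" wanted is the standard error of the mean.
agd_err = g1[("agd", "std")] / np.sqrt(g1[("agd", "count")])
hgd_err = g1[("hgd", "std")] / np.sqrt(g1[("hgd", "count")])

# xs() pulls one statistic for every variable at once
# (here a DataFrame with columns 'agd' and 'hgd').
means = g1.xs("mean", axis=1, level=1)
```

For the plot, these Series can be passed straight to matplotlib, for example plt.errorbar(g1.index, agd_mean, yerr=agd_err, fmt='o').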

What you are looking for is possible with the groupby transform method. The pandas groupby documentation includes an example of it.
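The linked example is not preserved here, so the following is only a sketch of what transform does, using a tiny hypothetical df. Unlike agg, it returns one value per original row, so group statistics can be stored as ordinary columns next to the raw data.

```python
import pandas as pd

# Tiny hypothetical data set for illustration only.
df = pd.DataFrame({
    "md": [0, 0, 1, 1, 1],
    "agd": [1.0, 3.0, 2.0, 4.0, 6.0],
})

# transform broadcasts each group's mean back onto the rows of that group,
# so the result has the same length and index as df.
df["agd_group_mean"] = df.groupby("md")["agd"].transform("mean")

print(df)
```

Note that this gives a row-per-observation result rather than the one-row-per-md summary table shown in the question.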
