MultiIndex in pandas pivot table

I am working on a pivot table that looks like this:

            Style  Site AVS  End Qty.                                          \
JP SIZE                           116  120  140  ADULTS  L  M  O  OSFA  S  XL   
0        50935801  2664   0         0    0    0       0  0  0  0     0  0   3   
1        50935801  2807   0         0    0    0       0  0  0  0     0  0   3   
2        50935801  2832   0         0    0    0       0  0  0  0     0  0   3   
3        50935802  2702   1         0    0    0       0  0  1  0     0  0   0   
4        50985101  2849   0         0    0    0       0  0  3  0     0  0   0   

            Sales Qty.                              
JP SIZE  Total         116  120  140  ADULTS  L  M      
0            3           0    0    0       0  0  0 ...  
1            3           0    0    0       0  0  0 ...  
2            3           0    0    0       0  0  0 ...  
3            1           0    0    0       0  0 -1 ...  
4            3           0    0    0       0  0  0 ...  

I would like to end up with a single level of column headers: [Style, Site, AVS, 116, 120, ..., Total, Sales Qty.]

For the "Sales Qty." group, I want to keep only its Total column instead of the whole sub-table that is there now. I can access it with jj['Sales Qty.']['Total'] at the moment, so I guess I could save it in another variable, delete the group, and add the saved column back at the end.

Everything I have tried so far has failed. I think it is because I don't yet understand very well how MultiIndex works.

Thanks in advance for any help you can provide on that!

There might be something more clever built in, but one way is to treat the MultiIndex columns as a list of (level 1, level 2) tuples and map each tuple to the new column name you described.

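def custom_rename(lvl1, lvl2):
    # Size columns (and Total) under 'End Qty.' keep their second-level name.
    if lvl1 == 'End Qty.':
        return lvl2
    # Of the 'Sales Qty.' group, only Total is kept, named after the group.
    elif lvl1 == 'Sales Qty.' and lvl2 == 'Total':
        return 'Sales Qty.'
    # Columns with an empty second level (Style, Site, AVS) keep their name.
    elif lvl2 == '':
        return lvl1
    # Everything else is marked for removal.
    else:
        return '_'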

Then apply to the columns and assign:

df.columns = [custom_rename(lvl1, lvl2) for lvl1, lvl2 in df.columns]
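The comprehension above relies on the fact that iterating over MultiIndex columns gives one tuple per column. A minimal sketch of that behaviour, using a made-up frame (the name `demo` and its values are assumptions, loosely modeled on the table in the question):

```python
import pandas as pd

# Hypothetical two-level columns, loosely modeled on the table in the question.
columns = pd.MultiIndex.from_tuples([
    ("Style", ""),
    ("End Qty.", "116"),
    ("End Qty.", "Total"),
    ("Sales Qty.", "Total"),
])
demo = pd.DataFrame([[50935801, 0, 3, 3]], columns=columns)

# Iterating over MultiIndex columns yields one (level 1, level 2) tuple per
# column, which is why `for lvl1, lvl2 in df.columns` can unpack two values.
column_tuples = list(demo.columns)
print(column_tuples)
```

Each tuple is then passed to custom_rename as its two arguments.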

'_' above was used as a marker for the columns that are no longer wanted, so the last step is to drop them.

df = df.drop('_', axis=1)
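Putting the steps together, here is a small end-to-end sketch. The frame is made up for illustration (only a few size columns, invented values), assuming the real pivot table has the same two-level layout with an empty second level under Style, Site and AVS:

```python
import pandas as pd

def custom_rename(lvl1, lvl2):
    if lvl1 == 'End Qty.':
        return lvl2
    elif lvl1 == 'Sales Qty.' and lvl2 == 'Total':
        return 'Sales Qty.'
    elif lvl2 == '':
        return lvl1
    else:
        return '_'

# Made-up sample with the same column structure as the pivot table.
columns = pd.MultiIndex.from_tuples([
    ('Style', ''), ('Site', ''), ('AVS', ''),
    ('End Qty.', '116'), ('End Qty.', 'M'), ('End Qty.', 'Total'),
    ('Sales Qty.', '116'), ('Sales Qty.', 'M'), ('Sales Qty.', 'Total'),
])
df = pd.DataFrame(
    [[50935801, 2664, 0, 0, 0, 3, 0, 0, 3],
     [50935802, 2702, 1, 0, 1, 1, 0, -1, -1]],
    columns=columns,
)

# Flatten the two levels into one list of names, then drop the marked columns.
# Several columns share the '_' label; drop removes all of them.
df.columns = [custom_rename(lvl1, lvl2) for lvl1, lvl2 in df.columns]
df = df.drop('_', axis=1)

print(list(df.columns))
```

Note that this only works cleanly if '_' is not already a real column name in the data; any other unused placeholder would do.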
