MultiIndex in pandas pivot table

I am working on a pivot table that looks like this:

            Style  Site AVS  End Qty.                                          \
JP SIZE                           116  120  140  ADULTS  L  M  O  OSFA  S  XL   
0        50935801  2664   0         0    0    0       0  0  0  0     0  0   3   
1        50935801  2807   0         0    0    0       0  0  0  0     0  0   3   
2        50935801  2832   0         0    0    0       0  0  0  0     0  0   3   
3        50935802  2702   1         0    0    0       0  0  1  0     0  0   0   
4        50985101  2849   0         0    0    0       0  0  3  0     0  0   0   

            Sales Qty.                              
JP SIZE  Total         116  120  140  ADULTS  L  M      
0            3           0    0    0       0  0  0 ...  
1            3           0    0    0       0  0  0 ...  
2            3           0    0    0       0  0  0 ...  
3            1           0    0    0       0  0 -1 ...  
4            3           0    0    0       0  0  0 ...  

I would like to have a single level of column headers: [Style, Site, AVS, 116, 120, ..., Total, Sales Qty.]

For the "Sales Qty." column, instead of the whole sub-table that is there at the moment, I would like only the Total column. (For now I can access it with jj['Sales Qty']['Total'], so I guess I could save it in another variable, delete the sub-table, and add the saved column back at the end.)

Everything I have tried so far has failed; I think it is because I don't yet understand very well how MultiIndex works.

Thanks in advance for any help you can provide!

There might be something more clever built in, but one way is to treat the MultiIndex as a list of tuples and map each tuple to the new column name you described.

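def custom_rename(lvl1, lvl2):
    # Columns under 'End Qty.' keep only their size label (116, 120, ..., Total).
    if lvl1 == 'End Qty.':
        return lvl2
    # From the 'Sales Qty.' block, keep only the total, named after the block.
    elif lvl1 == 'Sales Qty.' and lvl2 == 'Total':
        return 'Sales Qty.'
    # Style, Site and AVS are assumed to have an empty second level.
    elif lvl2 == '':
        return lvl1
    # Everything else is marked for removal.
    else:
        return '_'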

Then apply it to the columns and assign the result back:

df.columns = [custom_rename(lvl1, lvl2) for lvl1, lvl2 in df.columns]

'_' above was used as a marker for the columns that are no longer wanted, so the last step is to drop them.

df = df.drop('_', axis=1)
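
To see the steps work together, here is a minimal end-to-end sketch on a made-up miniature of the table from the question. The sample values and the reduced set of sizes (only M, XL and Total) are assumptions for illustration, as is the empty string used for the second level of Style, Site and AVS; the real frame may differ.

```python
import pandas as pd

# Hypothetical miniature of the pivot table; values are illustrative only.
columns = pd.MultiIndex.from_tuples([
    ('Style', ''), ('Site', ''), ('AVS', ''),
    ('End Qty.', 'M'), ('End Qty.', 'XL'), ('End Qty.', 'Total'),
    ('Sales Qty.', 'M'), ('Sales Qty.', 'XL'), ('Sales Qty.', 'Total'),
])
df = pd.DataFrame(
    [[50935801, 2664, 0, 0, 3, 3, 0, 0, 0],
     [50935802, 2702, 1, 1, 0, 1, -1, 0, -1]],
    columns=columns,
)


def custom_rename(lvl1, lvl2):
    if lvl1 == 'End Qty.':
        return lvl2            # keep the size label
    elif lvl1 == 'Sales Qty.' and lvl2 == 'Total':
        return 'Sales Qty.'    # keep only the total of the sales block
    elif lvl2 == '':
        return lvl1            # single-level columns keep their name
    else:
        return '_'             # marker for columns to drop


# Iterating over MultiIndex columns yields one (level 1, level 2) tuple each.
df.columns = [custom_rename(lvl1, lvl2) for lvl1, lvl2 in df.columns]

# Several columns now share the label '_'; drop removes all of them.
df = df.drop('_', axis=1)

result_columns = list(df.columns)
print(result_columns)
```

On this sample the remaining labels are Style, Site, AVS, M, XL, Total and Sales Qty., in that order, which matches the flat header layout asked for in the question.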
