
Convert Excel multi-header data into a pandas DataFrame

My data in Excel looks like this: [image]

and I want to convert it into a pandas DataFrame like this: [image]

Create a MultiIndex in both the columns and the index when reading the file (the parameter is index_col, not index_cols):

import pandas as pd
df = pd.read_excel(file, index_col=[0,1,2], header=[0,1])

# verify the MultiIndex in the index (rows)
print(df.index)
MultiIndex([(    'Pencil', 'A01',   'Red'),
            (       'Pen', 'A02',  'Blue'),
            ('Toothbrush', 'B01', 'Green')],
           names=['ProductName', 'ProductCode', 'Color'])

# verify the MultiIndex in the columns
print(df.columns)
MultiIndex([('Jan-22', 'Supplier1'),
            ('Jan-22', 'Supplier2'),
            ('Jan-22',     'Total'),
            ('Feb-22',  'Supplie1'),
            ('Feb-22', 'Supplier2'),
            ('Feb-22',     'Total')],
           )

Then drop the Total columns and reshape with DataFrame.stack:

df = (df.drop('Total', axis=1, level=1)
        .rename_axis(['Date','SupplierName'], axis=1)
        .stack([0,1])
        .reset_index(name='Volumes'))
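
To try the reshape without an Excel file, here is a minimal sketch that builds a frame with the same index and column structure by hand. The volume numbers are made up for illustration, and the names `wide` and `tidy` are hypothetical; only the index values and column labels follow the output shown above.

```python
import pandas as pd

# Row MultiIndex mirroring the three leading columns of the sheet.
index = pd.MultiIndex.from_tuples(
    [("Pencil", "A01", "Red"), ("Pen", "A02", "Blue"), ("Toothbrush", "B01", "Green")],
    names=["ProductName", "ProductCode", "Color"],
)

# Two header rows: month on level 0, supplier (plus Total) on level 1.
columns = pd.MultiIndex.from_product(
    [["Jan-22", "Feb-22"], ["Supplier1", "Supplier2", "Total"]]
)

# Made-up volumes, one row per product (assumption, not the asker's data).
wide = pd.DataFrame(
    [
        [10, 20, 30, 15, 25, 40],
        [5, 7, 12, 6, 8, 14],
        [3, 4, 7, 2, 9, 11],
    ],
    index=index,
    columns=columns,
)

tidy = (
    wide.drop("Total", axis=1, level=1)                  # remove the Total columns
        .rename_axis(["Date", "SupplierName"], axis=1)   # name both column levels
        .stack([0, 1])                                   # move both levels into the index
        .reset_index(name="Volumes")                     # back to flat columns
)

print(tidy)
```

The result has one row per product, date, and supplier combination. Row order can differ between pandas versions, so sort explicitly (for example with `sort_values`) if the order matters.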
