How to create a customized multi-index with different sub column headings using pandas in a dataframe

I have a dataset with multi-index columns, where the first level is a year that is divided into four quarters. How do I structure the index so that each quarter has its own set of months under it?

I found the following piece of code on Stack Overflow:

import pandas as pd

# `dic` is assumed to be defined earlier (it is not shown in this snippet)
index = pd.MultiIndex.from_product([['S1', 'S2'], ['Start', 'Stop']])
print(pd.DataFrame([pd.DataFrame(dic).unstack().values], columns=index))

that gave the following output:

           S1                      S2            
        Start        Stop       Start        Stop
0  2013-11-12  2013-11-13  2013-11-15  2013-11-17

However, this does not meet my requirement of having a different set of months under each quarter of the year.
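
A small sketch of why from_product falls short for this layout: it builds the Cartesian product of its inputs, so every quarter gets paired with all twelve months. The names quarters, months and product_index are illustrative and are not part of the snippet above.

```python
import pandas as pd

quarters = ["Q1", "Q2", "Q3", "Q4"]
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# from_product pairs every quarter with every month (Cartesian product)
product_index = pd.MultiIndex.from_product([[2015], quarters, months])

# 1 year * 4 quarters * 12 months, instead of the 12 columns that are wanted
n_columns = len(product_index)

# October ends up under Q1 as well, which is not the intended structure
has_mismatch = (2015, "Q1", "Oct") in product_index
```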

My data looks like this:

                                     2015
              Q1                   Q2              Q3               Q4
Country  Jan   Feb   March     Apr May Jun    July Aug Sep     Oct Nov Dec

India    45    54    34        34  45   45    43   45  67      45  56   56
Canada   44    34    12        32  35   45    43   41  60      43  55   21

I want to reproduce this structure in pandas, with the specific set of months under each quarter. How should I go about this?

You can also create a MultiIndex in a few other ways. One of these, which is useful if you have a complicated structure, is to construct it from an explicit set of tuples, where each tuple describes one hierarchical column. Below I first create all of the tuples that you need, of the form (year, quarter, month), make a MultiIndex from them, then assign that as the columns of the dataframe.

import pandas as pd

year = 2015
months = [
    ("Jan", "Feb", "Mar"),
    ("Apr", "May", "Jun"),
    ("Jul", "Aug", "Sep"),
    ("Oct", "Nov", "Dec"),
]
# One (year, quarter, month) tuple per column, e.g. (2015, "Q1", "Jan")
tuples = [(year, f"Q{i + 1}", month) for i in range(4) for month in months[i]]
multi_index = pd.MultiIndex.from_tuples(tuples)
data = [
    [45, 54, 34, 34, 45, 45, 43, 45, 67, 45, 56, 56],
    [44, 34, 12, 32, 35, 45, 43, 41, 60, 43, 55, 21],
]
# Rows are the countries; columns are the three-level MultiIndex
df = pd.DataFrame(data, index=["India", "Canada"], columns=multi_index)
df
#         2015
#           Q1             Q2             Q3             Q4
#          Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
# India     45   54   34   34   45   45   43   45   67   45   56   56
# Canada    44   34   12   32   35   45   43   41   60   43   55   21
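
As a variation on the same idea, the tuples can also be generated from a mapping of quarter to months, and the levels can be given names so that they are easier to select on later. This is only a sketch: the dictionary quarter_months and the level names "year", "quarter" and "month" are assumptions made for illustration, not part of the question's data.

```python
import pandas as pd

# Hypothetical mapping from each quarter label to its own months
quarter_months = {
    "Q1": ["Jan", "Feb", "Mar"],
    "Q2": ["Apr", "May", "Jun"],
    "Q3": ["Jul", "Aug", "Sep"],
    "Q4": ["Oct", "Nov", "Dec"],
}

# Build one (year, quarter, month) tuple per column and name the levels
columns = pd.MultiIndex.from_tuples(
    [(2015, quarter, month)
     for quarter, quarter_list in quarter_months.items()
     for month in quarter_list],
    names=["year", "quarter", "month"],  # illustrative level names
)

data = [
    [45, 54, 34, 34, 45, 45, 43, 45, 67, 45, 56, 56],
    [44, 34, 12, 32, 35, 45, 43, 41, 60, 43, 55, 21],
]
df = pd.DataFrame(data, index=["India", "Canada"], columns=columns)

# Select only the columns that belong to Q1 (the "quarter" level is dropped)
q1 = df.xs("Q1", axis=1, level="quarter")

# Sum the months within each quarter; transposing avoids grouping along axis=1
quarter_totals = df.T.groupby(level="quarter").sum().T
```

With named levels, a single quarter can be pulled out with xs, and per-quarter totals follow from grouping on the "quarter" level.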
