
How to create a hierarchical column index in a pandas DataFrame

I have two data frames which look like the following.

Dataframe A:

dataDate  name prediction       
2018-09-30  A   2.30
2018-10-01  A   1.51
2018-10-02  A   2.08
2018-10-03  A   1.82
2018-09-30  B   0.96
2018-10-01  B   6.52
2018-10-02  B   9.21
2018-10-03  B   17.43
2018-09-30  C   6.89    
2018-10-01  C   6.10
2018-10-02  C   5.53
2018-10-03  C   1.91

I want to transform my data frame by creating a hierarchical index in the columns, so that I can access several columns at the same time. For example:

Dataframe B:

dataDate   prediction       
           Group pred
2018-09-30  A    2.30
2018-10-01  A    1.51
2018-10-02  A    2.08
2018-10-03  A    1.82
2018-09-30  B    0.96
2018-10-01  B    6.52
2018-10-02  B    9.21
2018-10-03  B    17.43
2018-09-30  C    6.89    
2018-10-01  C    6.10
2018-10-02  C    5.53
2018-10-03  C    1.91

Dataframe B has just two top-level columns, 'dataDate' and 'prediction', and 'prediction' in turn has two second-level columns, 'Group' and 'pred'. I can then access both of them through just prediction.

How can I transform Dataframe A into Dataframe B, and vice versa, with pandas?

Use set_index to move the first column into the index if necessary, then assign new MultiIndex columns built with MultiIndex.from_product:

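import pandas as pd

# move dataDate into the index so only 'name' and 'prediction' remain as columns
df = df.set_index('dataDate')
df.columns = pd.MultiIndex.from_product([['prediction'], ['Group', 'pred']])
# alternative: assign a list of arrays, one per level
# df.columns = [['prediction'] * len(df.columns), ['Group', 'pred']]
print(df)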
           prediction       
                Group   pred
dataDate                    
2018-09-30          A   2.30
2018-10-01          A   1.51
2018-10-02          A   2.08
2018-10-03          A   1.82
2018-09-30          B   0.96
2018-10-01          B   6.52
2018-10-02          B   9.21
2018-10-03          B  17.43
2018-09-30          C   6.89
2018-10-01          C   6.10
2018-10-02          C   5.53
2018-10-03          C   1.91
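
As a minimal sketch of why the two-level columns are useful, the snippet below rebuilds a shortened version of Dataframe A (only four of the question's rows; the variable names df_b, sub and pred are illustrative, not from the original answer) and shows how the sub-columns can be selected together or one at a time:

```python
import pandas as pd

# Shortened copy of Dataframe A from the question (four rows only).
df = pd.DataFrame({
    "dataDate": pd.to_datetime(
        ["2018-09-30", "2018-10-01", "2018-09-30", "2018-10-01"]
    ),
    "name": ["A", "A", "B", "B"],
    "prediction": [2.30, 1.51, 0.96, 6.52],
})

# Same steps as above: index by date, then relabel the columns on two levels.
df_b = df.set_index("dataDate")
df_b.columns = pd.MultiIndex.from_product([["prediction"], ["Group", "pred"]])

# Selecting the top-level label returns both sub-columns as a DataFrame.
sub = df_b["prediction"]
print(sub.columns.tolist())   # ['Group', 'pred']

# A single sub-column is addressed with a (level 0, level 1) tuple.
pred = df_b[("prediction", "pred")]
print(pred.tolist())          # [2.3, 1.51, 0.96, 6.52]
```

Note that this only relabels the existing columns; the values themselves are unchanged.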

For converting back and forth, I suggest keeping the original column names and only adding a new top level, Group:

# df here is the original Dataframe A, with dataDate still a regular column
df1 = df.set_index('dataDate')
df1.columns = pd.MultiIndex.from_product([['Group'], df1.columns])
print(df1)
           Group           
            name prediction
dataDate                   
2018-09-30     A       2.30
2018-10-01     A       1.51
2018-10-02     A       2.08
2018-10-03     A       1.82
2018-09-30     B       0.96
2018-10-01     B       6.52
2018-10-02     B       9.21
2018-10-03     B      17.43
2018-09-30     C       6.89
2018-10-01     C       6.10
2018-10-02     C       5.53
2018-10-03     C       1.91

To convert back, select the top level Group, which returns the original columns:

df = df1['Group'].copy()
print (df)
           name  prediction
dataDate                   
2018-09-30    A        2.30
2018-10-01    A        1.51
2018-10-02    A        2.08
2018-10-03    A        1.82
2018-09-30    B        0.96
2018-10-01    B        6.52
2018-10-02    B        9.21
2018-10-03    B       17.43
2018-09-30    C        6.89
2018-10-01    C        6.10
2018-10-02    C        5.53
2018-10-03    C        1.91
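
The round trip can also be written end to end. The following is a sketch under stated assumptions rather than part of the original answer: it uses a shortened Dataframe A, hypothetical variable names (df_a, df_b, restored), and DataFrame.droplevel plus reset_index as one possible way to get dataDate back as a regular column:

```python
import pandas as pd

# Shortened copy of Dataframe A from the question (four rows only).
df_a = pd.DataFrame({
    "dataDate": ["2018-09-30", "2018-10-01", "2018-09-30", "2018-10-01"],
    "name": ["A", "A", "B", "B"],
    "prediction": [2.30, 1.51, 0.96, 6.52],
})

# A -> B: move dataDate to the index, then add "Group" as a new top level.
df_b = df_a.set_index("dataDate")
df_b.columns = pd.MultiIndex.from_product([["Group"], df_b.columns])

# B -> A: drop the top column level and turn the index back into a column.
restored = df_b.droplevel(0, axis=1).reset_index()

# Check that the round trip reproduces the starting frame.
print(restored.equals(df_a))
```

Selecting df_b['Group'] as shown above gives the same columns as droplevel here, because there is only one top-level label; droplevel would also work if there were several.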

