
python pandas: rename single column label in multi-index dataframe

I have a df that looks like this:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.random((4,4)))
df.columns = pd.MultiIndex.from_product([['1','2'],['A','B']])
print(df)
          1                   2          
          A         B         A         B
0  0.030626  0.494912  0.364742  0.320088
1  0.178368  0.857469  0.628677  0.705226
2  0.886296  0.833130  0.495135  0.246427
3  0.391352  0.128498  0.162211  0.011254

How can I rename the columns '1' and '2' to 'One' and 'Two'?

I thought df.rename() would help, but it doesn't. I have no idea how to do this.

That is indeed something missing in rename (ideally it should let you specify the level).
Another way is by setting the levels of the columns index, but then you need to know all values for that level:

In [41]: df.columns.levels[0]
Out[41]: Index([u'1', u'2'], dtype='object')

In [43]: df.columns = df.columns.set_levels(['one', 'two'], level=0)

In [44]: df
Out[44]:
        one                 two
          A         B         A         B
0  0.899686  0.466577  0.867268  0.064329
1  0.162480  0.455039  0.736870  0.759595
2  0.620960  0.922119  0.060141  0.669997
3  0.871107  0.043799  0.080080  0.577421

In [45]: df.columns.levels[0]
Out[45]: Index([u'one', u'two'], dtype='object')
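One caveat with set_levels, sketched here under the assumption that the level values are plain strings as in the question: the new labels are matched by position against df.columns.levels[level], which is usually stored sorted and does not have to follow the order in which the columns appear. Building the replacement list from a mapping (the `mapping` dict and the reversed column order below are only for illustration) avoids mixing up the order.

```python
import numpy as np
import pandas as pd

# Columns deliberately listed as '2' then '1' (illustrative assumption).
df = pd.DataFrame(np.random.random((4, 4)))
df.columns = pd.MultiIndex.from_product([['2', '1'], ['A', 'B']])

# Level values as stored in the index; this is not necessarily the column order.
print(df.columns.levels[0].tolist())

# Pair every stored level value with its replacement, in the stored order.
mapping = {'1': 'one', '2': 'two'}
new_level = [mapping[label] for label in df.columns.levels[0]]
df.columns = df.columns.set_levels(new_level, level=0)

# Top-level label of each column after renaming.
print(df.columns.get_level_values(0).tolist())
```

Because the list is derived from the stored level values, '2' ends up as 'two' and '1' as 'one' regardless of how the level is ordered internally.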

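Use set_levels:

>>> # inplace=True only works on older pandas versions (the argument was removed in pandas 2.0);
>>> # on newer versions, assign the result back: df.columns = df.columns.set_levels(...)
>>> df.columns.set_levels(['one','two'], level=0, inplace=True)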
>>> print(df)
        one                 two          
          A         B         A         B
0  0.731851  0.489611  0.636441  0.774818
1  0.996034  0.298914  0.377097  0.404644
2  0.217106  0.808459  0.588594  0.009408
3  0.851270  0.799914  0.328863  0.009914
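# Older pandas only (inplace was later removed from set_levels):
df.columns.set_levels(['one', 'two'], level=0, inplace=True)

# Older pandas only (rename_axis used to accept a label mapping; newer versions use rename for that):
df.rename_axis({'1':'one', '2':'two'}, axis='columns', inplace=True)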

As of pandas 0.22.0 (and probably much earlier), you can specify the level:

df = df.rename(columns={'1': 'one', '2': 'two'}, level=0)

or, alternatively (new notation since pandas 0.21.0):

df = df.rename({'1': 'one', '2': 'two'}, axis='columns', level=0)

But actually, it works even when omitting the level:

df = df.rename(columns={'1': 'one', '2': 'two'})

In that case, all column levels are checked for occurrences to be renamed.
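The difference only matters when the same label occurs in more than one level. A small sketch, assuming a made-up frame in which 'A' appears in both column levels (the names `only_top` and `all_levels` are just for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical frame where the label 'A' occurs in both column levels.
df = pd.DataFrame(np.random.random((2, 4)))
df.columns = pd.MultiIndex.from_product([['A', 'B'], ['A', 'B']])

# With level=0, only the outer level is searched for 'A'.
only_top = df.rename(columns={'A': 'x'}, level=0)

# Without level, every level is searched for 'A'.
all_levels = df.rename(columns={'A': 'x'})

print(list(only_top.columns))
# [('x', 'A'), ('x', 'B'), ('B', 'A'), ('B', 'B')]
print(list(all_levels.columns))
# [('x', 'x'), ('x', 'B'), ('B', 'x'), ('B', 'B')]
```

So passing level is the safer choice whenever the inner and outer labels can overlap.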

This is a good question. Combining the answers above, you can write a function:

def rename_col(df, columns, level=0):
    """Rename column labels using the dict `columns`; modifies df and returns it."""

    def rename_apply(x):
        # Labels without an entry in the mapping are kept unchanged.
        return columns.get(x, x)

    # pd.core.index.MultiIndex no longer exists in current pandas; use pd.MultiIndex.
    if isinstance(df.columns, pd.MultiIndex):
        new_level = [rename_apply(x) for x in df.columns.levels[level]]
        df.columns = df.columns.set_levels(new_level, level=level)
    else:
        df.columns = [rename_apply(x) for x in df.columns]

    return df

It worked for me.

Ideally, functionality like this should be built into the official rename function in the future, so you don't need to write a hack like this.
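For what it's worth, on current pandas versions the built-in rename seems to cover this wish already. A minimal sketch, assuming two made-up frames (`flat` with plain columns, `multi` with MultiIndex columns) and an illustrative `mapping` dict:

```python
import numpy as np
import pandas as pd

# Hypothetical example frames: one with plain columns, one with MultiIndex columns.
flat = pd.DataFrame(np.random.random((2, 2)), columns=['1', '2'])
multi = pd.DataFrame(np.random.random((2, 4)))
multi.columns = pd.MultiIndex.from_product([['1', '2'], ['A', 'B']])

mapping = {'1': 'one', '2': 'two'}

# Plain columns: no level argument needed.
flat_renamed = flat.rename(columns=mapping)

# MultiIndex columns: restrict the mapping to the outer level.
multi_renamed = multi.rename(columns=mapping, level=0)

# A callable is accepted as well, here applied to the inner level only.
lowered = multi.rename(columns=str.lower, level=1)

print(list(flat_renamed.columns))
print(list(multi_renamed.columns))
print(list(lowered.columns))
```

Unlike the helper above, rename returns a new DataFrame and leaves the original untouched unless inplace=True is passed.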
