python pandas: rename single column label in multi-index dataframe

I have a df that looks like this:

import numpy as np
import pandas as pd

# 4x4 frame of random values with two-level column labels
df = pd.DataFrame(np.random.random((4,4)))
df.columns = pd.MultiIndex.from_product([['1','2'],['A','B']])
print(df)
          1                   2          
          A         B         A         B
0  0.030626  0.494912  0.364742  0.320088
1  0.178368  0.857469  0.628677  0.705226
2  0.886296  0.833130  0.495135  0.246427
3  0.391352  0.128498  0.162211  0.011254

How can I rename columns '1' and '2' to 'One' and 'Two'?

I thought df.rename() would help, but it doesn't. I have no idea how to do this.

That is indeed something missing in rename (ideally it should let you specify the level).
Another way is to set the levels of the columns index, but then you need to know all values for that level:

In [41]: df.columns.levels[0]
Out[41]: Index([u'1', u'2'], dtype='object')

In [43]: df.columns = df.columns.set_levels(['one', 'two'], level=0)

In [44]: df
Out[44]:
        one                 two
          A         B         A         B
0  0.899686  0.466577  0.867268  0.064329
1  0.162480  0.455039  0.736870  0.759595
2  0.620960  0.922119  0.060141  0.669997
3  0.871107  0.043799  0.080080  0.577421

In [45]: df.columns.levels[0]
Out[45]: Index([u'one', u'two'], dtype='object')
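
If you would rather not type out every value of the level, the new labels can be derived from the existing ones. This is only a minimal sketch: the `mapping` dict is a hypothetical example, and any label it does not mention is kept unchanged.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.random((4, 4)))
df.columns = pd.MultiIndex.from_product([['1', '2'], ['A', 'B']])

# Hypothetical mapping: only '1' is renamed, '2' is left as it is
mapping = {'1': 'one'}

# Build the new level-0 labels from the existing ones,
# falling back to the old label when it is not in the mapping
new_labels = [mapping.get(label, label) for label in df.columns.levels[0]]
df.columns = df.columns.set_levels(new_labels, level=0)

print(df.columns.tolist())
# [('one', 'A'), ('one', 'B'), ('2', 'A'), ('2', 'B')]
```

The result is assigned back to df.columns because set_levels returns a new index.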

Use set_levels:

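>>> # Note: inplace=True only works on older pandas (it was removed in 2.0);
>>> # on newer versions, assign the result of set_levels back to df.columns
>>> df.columns.set_levels(['one','two'], 0, inplace=True)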
>>> print(df)
        one                 two          
          A         B         A         B
0  0.731851  0.489611  0.636441  0.774818
1  0.996034  0.298914  0.377097  0.404644
2  0.217106  0.808459  0.588594  0.009408
3  0.851270  0.799914  0.328863  0.009914

df.columns.set_levels(['one', 'two'], level=0, inplace=True)

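On older pandas versions, rename_axis accepted a mapping (this use was deprecated in pandas 0.21 in favor of rename):

df.rename_axis({'1':'one', '2':'two'}, axis='columns', inplace=True)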

As of pandas 0.22.0 (and probably much earlier), you can specify the level:

df = df.rename(columns={'1': 'one', '2': 'two'}, level=0)

or, alternatively (new notation since pandas 0.21.0):

df = df.rename({'1': 'one', '2': 'two'}, axis='columns', level=0)

But actually, it works even when omitting the level:

df = df.rename(columns={'1': 'one', '2': 'two'})

In that case, every column level is searched for labels to rename.
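
The difference only shows up when the same label occurs in more than one level. A small sketch with a made-up frame (the labels 'A', 'B', 'C' and the replacement 'X' are illustrative, not from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical frame where the label 'A' appears in both column levels
cols = pd.MultiIndex.from_product([['A', 'B'], ['A', 'C']])
df = pd.DataFrame(np.zeros((2, 4)), columns=cols)

# Without level: 'A' is renamed wherever it occurs
all_levels = df.rename(columns={'A': 'X'})

# With level=0: only the top level is touched
top_only = df.rename(columns={'A': 'X'}, level=0)

print(all_levels.columns.tolist())
# [('X', 'X'), ('X', 'C'), ('B', 'X'), ('B', 'C')]
print(top_only.columns.tolist())
# [('X', 'A'), ('X', 'C'), ('B', 'A'), ('B', 'C')]
```

So passing level is the safer choice whenever a label might be shared between levels.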

This is a good question. Combining the answers above, you can write a function:

import pandas as pd

def rename_col(df, columns, level=0):
    # columns: dict mapping old labels to new ones; unmapped labels are kept.
    # Note that df is modified in place and also returned.
    if isinstance(df.columns, pd.MultiIndex):
        # Rename only within the chosen level of the MultiIndex
        new_labels = [columns.get(x, x) for x in df.columns.levels[level]]
        df.columns = df.columns.set_levels(new_labels, level=level)
    else:
        # Flat columns: rename directly
        df.columns = [columns.get(x, x) for x in df.columns]

    return df

It worked for me.

Ideally, functionality like this should be integrated into the official rename function in the future, so you don't need to write a hack like this.
