Renaming columns in a pandas pivot table

Question

How do I rename columns that have multiple levels after a pandas pivot operation?

Here is some code to generate test data:

import pandas as pd
df = pd.DataFrame({
    'c0': ['A','A','B','C'],
    'c01': ['A','A1','B','C'],
    'c02': ['b','b','d','c'],
    'v1': [1, 3,4,5],
    'v2': [1, 3,4,5]})

print(df)

This gives a test DataFrame:

   c0 c01 c02  v1  v2
0  A   A   b   1   1
1  A  A1   b   3   3
2  B   B   d   4   4
3  C   C   c   5   5

Applying the pivot:

df2 = pd.pivot_table(df, index=["c0"], columns=["c01","c02"], values=["v1","v2"])
df2 = df2.reset_index()

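This gives a DataFrame whose columns form a three-level MultiIndex.

How can I rename the columns by joining the levels? The format should be <c01 value>_<c02 value>_<v1>.

For example, the first column should look like "A_b_v1".

The order in which the levels are joined is not important to me.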

Answer 1

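If you want to merge the MultiIndex into a single string index and don't care about the order of the index levels, you can simply map the join function over the columns and assign the resulting list back: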

df2.columns = list(map("_".join, df2.columns))
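A minimal sketch of what this plain join produces on the question's data, assuming it is applied after reset_index() as in the question. In that case the former index column carries empty strings in its lower levels, so its joined name picks up trailing underscores:

```python
import pandas as pd

df = pd.DataFrame({
    'c0': ['A', 'A', 'B', 'C'],
    'c01': ['A', 'A1', 'B', 'C'],
    'c02': ['b', 'b', 'd', 'c'],
    'v1': [1, 3, 4, 5],
    'v2': [1, 3, 4, 5]})

df2 = pd.pivot_table(df, index=["c0"], columns=["c01", "c02"], values=["v1", "v2"])
df2 = df2.reset_index()

# Each column label is a tuple such as ('v1', 'A', 'b'); after reset_index()
# the former index column is labelled ('c0', '', '').
flat_names = list(map("_".join, df2.columns))
df2.columns = flat_names

print(flat_names)  # includes 'c0__' alongside names like 'v1_A_b'
```

Renaming before calling reset_index(), or stripping the underscores afterwards, avoids the 'c0__' name.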

For your question, you can loop over the columns, where each element is a tuple, unpack the tuple, and join the parts in the order you want:

df2 = pd.pivot_table(df, index=["c0"], columns=["c01","c02"], values=["v1","v2"])

# Use the list comprehension to make a list of new column names and assign it back
# to the DataFrame columns attribute.
df2.columns = ["_".join((j,k,i)) for i,j,k in df2.columns]
df2 = df2.reset_index()
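As a self-contained check of the ordered join, here is a sketch on the question's test data. The loop variable names (val, c01, c02) are chosen here for readability and are not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({
    'c0': ['A', 'A', 'B', 'C'],
    'c01': ['A', 'A1', 'B', 'C'],
    'c02': ['b', 'b', 'd', 'c'],
    'v1': [1, 3, 4, 5],
    'v2': [1, 3, 4, 5]})

df2 = pd.pivot_table(df, index=["c0"], columns=["c01", "c02"], values=["v1", "v2"])

# The levels arrive as (value name, c01, c02); emit them as c01_c02_value.
df2.columns = ["_".join((c01, c02, val)) for val, c01, c02 in df2.columns]

# Rename first, then move the index back into a regular column named 'c0'.
df2 = df2.reset_index()

print(df2.columns.tolist())
```

Because the rename happens before reset_index(), the index column keeps the plain name 'c0'.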

Answer 2

If some column names are not strings, you can map the column names to strings and then join them.

df2.columns = ['_'.join(map(str, c)).strip('_') for c in df2]
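To illustrate, here is a small sketch in which one column level holds integers. The integer values in 'c02' are an assumption made for this example and are not part of the question's data. A plain "_".join would raise a TypeError on these labels, while converting each part with str works, and strip('_') cleans up the name of the former index column:

```python
import pandas as pd

# Hypothetical variant of the test data: 'c02' holds integers instead of strings.
df = pd.DataFrame({
    'c0': ['A', 'A', 'B', 'C'],
    'c01': ['A', 'A1', 'B', 'C'],
    'c02': [1, 1, 2, 3],
    'v1': [1, 3, 4, 5],
    'v2': [1, 3, 4, 5]})

df2 = pd.pivot_table(df, index=["c0"], columns=["c01", "c02"], values=["v1", "v2"])
df2 = df2.reset_index()

# Iterating over a DataFrame yields its column labels, here tuples such as
# ('v1', 'A', 1) and ('c0', '', ''). str() handles the integer part, and
# strip('_') turns 'c0__' back into 'c0'.
df2.columns = ['_'.join(map(str, c)).strip('_') for c in df2]

print(df2.columns.tolist())
```

Note that strip('_') would also remove underscores that legitimately start or end a label.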

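If you want to chain the renaming onto pivot_table so that it fits in a pipeline, you can do so with pipe and set_axis.

In addition, you can reorder the column levels with reorder_levels, for example to get <c01 value>_<c02 value>_<v1> instead of <v1>_<c01 value>_<c02 value>: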

df2 = (
    df.pivot_table(index=["c0"], columns=["c01","c02"], values=["v1","v2"])
    .reorder_levels([1,2,0], axis=1)                # makes "v1","v2" the last level
    .pipe(lambda x: x.set_axis(
        map('_'.join, x)                            # if all column names are strings
        #('_'.join(map(str, c)) for c in x)         # if some column names are not strings
        , axis=1)
    )                                               # rename columns
    .reset_index()
)

Answer 3

Hi, how did you apply the general solution? [' '.join(str(s).strip() for s in col if s) for col in data.columns] I tried to set df.columns = [' '.join(str(s).strip() for s in col if s) for col in data.columns], but it does not work. Thanks
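For reference, a minimal sketch of the quoted comprehension applied to the question's df2. It assumes the same DataFrame appears on both sides of the assignment; the attempt above assigns to df.columns while reading data.columns, which may or may not be the cause of the problem. The `if s` filter skips empty level values, so the former index column stays 'c0':

```python
import pandas as pd

df = pd.DataFrame({
    'c0': ['A', 'A', 'B', 'C'],
    'c01': ['A', 'A1', 'B', 'C'],
    'c02': ['b', 'b', 'd', 'c'],
    'v1': [1, 3, 4, 5],
    'v2': [1, 3, 4, 5]})

df2 = pd.pivot_table(df, index=["c0"], columns=["c01", "c02"], values=["v1", "v2"])
df2 = df2.reset_index()

# Join the non-empty parts of each column tuple with a space.
# Read from and assign to the same DataFrame (df2).
df2.columns = [' '.join(str(s).strip() for s in col if s) for col in df2.columns]

print(df2.columns.tolist())
```

One caveat of `if s`: it also drops falsy labels such as 0, so a column level containing the integer 0 would lose that part of its name.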

Answer 1: score 12, accepted, 2017-02-07 20:15:21

Answer 2: score 0, 2022-08-12 07:17:07

Answer 3: score -1, 2022-01-11 01:59:09