How to combine columns in a pandas pivot table?

Question

I have a pivot table with 3 levels of columns. For each mean/std pair, I want to combine the two values into a string f"{x.mean}({x.std})", replacing the mean and std columns with a new mean_std_str column.

Here's a printout of the DataFrame:

rescore_func   asp                 chemscore  ... goldscore   plp                
tag           best      first           best  ...     first  best      first     
              mean  std  mean  std      mean  ...       std  mean  std  mean  std
dock_func                                     ...                                
asp           65.2  0.7  34.5  2.4      64.0  ...       0.0  64.4  0.7  37.9  0.7
chemscore     59.1  2.0  29.5  2.0      58.0  ...       1.7  58.7  0.7  40.9  2.3
goldscore     68.9  1.7  34.8  4.3      69.7  ...       1.3  68.9  1.3  46.2  0.7
plp           69.3  1.1  35.2  2.0      69.7  ...       2.0  68.9  2.4  39.4  2.9

[4 rows x 16 columns]

Desired output:

rescore_func   asp                             
tag            best       first           ...  
               mean_std   mean_std        ...
dock_func                                 ...                                 
asp            65.2(0.7)  34.5(2.4)       ...
chemscore      59.1(2.0)  29.5(2.0)       ... 
goldscore      68.9(1.7)  34.8(4.3)       ...
plp            69.3(1.1)  35.2(2.0)       ...

[4 rows x 8 columns]

So far I have:

df = df.melt(ignore_index=False).reset_index()
df = df.rename(columns=str).rename(columns={'None':'descr'})

which gives:

    dock_func rescore_func    tag descr  value
0         asp          asp   best  mean   65.2
1   chemscore          asp   best  mean   59.1
2   goldscore          asp   best  mean   68.9
3         plp          asp   best  mean   69.3
4         asp          asp   best   std    0.7
..        ...          ...    ...   ...    ...
59        plp          plp  first  mean   39.4
60        asp          plp  first   std    0.7
61  chemscore          plp  first   std    2.3
62  goldscore          plp  first   std    0.7
63        plp          plp  first   std    2.9

[64 rows x 5 columns]

I am stuck on how to group the means and stds together before re-pivoting the data...
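For readers following the melt route, here is a minimal sketch of one way it could be finished: pivot descr back into columns so each row holds a mean next to its std, build the string, then unstack to the wide layout. The small frame below is made-up data shaped like the melted output, and the names long, wide and result are illustrative only.

```python
import pandas as pd

# Made-up long-format data in the same shape as the melted frame
# (assumption: one "mean" and one "std" row per dock_func/rescore_func/tag).
long = pd.DataFrame(
    {
        "dock_func": ["asp", "asp", "plp", "plp"],
        "rescore_func": ["asp"] * 4,
        "tag": ["best"] * 4,
        "descr": ["mean", "std", "mean", "std"],
        "value": [65.2, 0.7, 69.3, 1.1],
    }
)

# Put mean and std side by side for every (dock_func, rescore_func, tag).
wide = long.pivot(
    index=["dock_func", "rescore_func", "tag"], columns="descr", values="value"
)

# Build the combined string column.
wide["mean_std"] = wide["mean"].astype(str) + "(" + wide["std"].astype(str) + ")"

# Move rescore_func and tag back into the columns; dock_func stays as the index.
result = wide["mean_std"].unstack(["rescore_func", "tag"])
```

Passing a list to the index argument of pivot needs a reasonably recent pandas (1.1 or later, as far as I know).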

Answer 1

DataFrame.reorder_levels makes this easy: move the mean/std level to the front of the column MultiIndex, then select each statistic by its top-level key.
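As a small illustration of what reorder_levels does to the columns, here is a sketch on a hypothetical two-level toy frame (the names toy, swapped, means and stds are made up for this example):

```python
import pandas as pd

# Hypothetical toy frame: columns are (group, stat) pairs.
cols = pd.MultiIndex.from_product([["a", "b"], ["mean", "std"]])
toy = pd.DataFrame([[1.0, 0.1, 2.0, 0.2]], index=["row"], columns=cols)

# Move the stat level (position 1) to the front: columns become (stat, group).
swapped = toy.reorder_levels([1, 0], axis=1)

# Selecting a top-level key drops that level and returns the matching sub-frame.
means = swapped["mean"]  # columns: a, b
stds = swapped["std"]    # columns: a, b
```

Because means and stds end up with identical column labels, elementwise operations between them line up pair by pair.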

Here is some sample data:

import numpy as np
import pandas as pd


index = pd.Index(["asp", "chemscore", "goldscore", "plp"], name="dock_func")
columns = pd.MultiIndex.from_product(
    [index, pd.Index(["best", "first"], name="tag"), ("mean", "std")]
)

df = pd.DataFrame(
    np.random.random(size=(4, 16)),
    index=index,
    columns=columns,
).round(1)

df looks like this (the values are random, so yours will differ):

dock_func  asp                 chemscore                 goldscore                  plp
tag       best      first           best      first           best      first      best      first
          mean  std  mean  std      mean  std  mean  std      mean  std  mean  std mean  std  mean  std
dock_func
asp        0.5  0.6   0.4  0.2       0.7  0.7   0.8  0.1       0.2  0.5   0.6  0.7  0.5  0.2   0.2  0.7
chemscore  0.0  0.7   0.9  0.2       0.3  0.3   0.4  0.8       0.3  0.4   0.2  0.8  0.5  0.5   0.4  0.2
goldscore  0.5  0.7   0.8  0.0       0.2  0.8   0.1  0.2       0.6  0.1   0.4  0.2  0.8  0.2   0.8  0.3
plp        1.0  0.6   0.6  0.8       0.8  0.6   0.3  1.0       0.7  0.2   0.8  0.2  0.2  0.2   0.7  0.2

Then just run the following:

# Bring the mean/std level to the front and convert every value to a string.
df = df.reorder_levels([2, 0, 1], axis=1).astype(str)
# Both sub-frames share the same remaining column levels, so the strings concatenate pairwise.
df = df["mean"] + "(" + df["std"] + ")"

and df is:

dock_func       asp           chemscore           goldscore                 plp
tag            best     first      best     first      best     first      best     first
dock_func
asp        0.5(0.6)  0.4(0.2)  0.7(0.7)  0.8(0.1)  0.2(0.5)  0.6(0.7)  0.5(0.2)  0.2(0.7)
chemscore  0.0(0.7)  0.9(0.2)  0.3(0.3)  0.4(0.8)  0.3(0.4)  0.2(0.8)  0.5(0.5)  0.4(0.2)
goldscore  0.5(0.7)  0.8(0.0)  0.2(0.8)  0.1(0.2)  0.6(0.1)  0.4(0.2)  0.8(0.2)  0.8(0.3)
plp        1.0(0.6)  0.6(0.8)  0.8(0.6)  0.3(1.0)  0.7(0.2)  0.8(0.2)  0.2(0.2)  0.7(0.2)
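The result above has two column levels, whereas the desired output in the question keeps a third mean_std level. Here is a minimal sketch of one way to add it back. It assumes a seeded random frame in place of the real data. The seed, the helper fmt, the use of DataFrame.xs in place of reorder_levels, and the explicit one-decimal formatting are illustrative choices and not part of the original answer.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
names = ["asp", "chemscore", "goldscore", "plp"]
columns = pd.MultiIndex.from_product(
    [
        pd.Index(names, name="rescore_func"),
        pd.Index(["best", "first"], name="tag"),
        ["mean", "std"],
    ]
)
df = pd.DataFrame(
    rng.random((4, 16)).round(1),
    index=pd.Index(names, name="dock_func"),
    columns=columns,
)


def fmt(frame):
    """Hypothetical helper: render every cell with exactly one decimal place."""
    return frame.apply(lambda col: col.map("{:.1f}".format))


# Take the mean and std cross-sections; the innermost level (position 2) is dropped.
mean = df.xs("mean", axis=1, level=2)
std = df.xs("std", axis=1, level=2)

# Same column labels on both sides, so the strings concatenate pairwise.
combined = fmt(mean) + "(" + fmt(std) + ")"

# Re-attach a level called "mean_std" and move it to the innermost position.
combined = pd.concat({"mean_std": combined}, axis=1).reorder_levels(
    [1, 2, 0], axis=1
)
```

Formatting explicitly rather than calling astype(str) keeps the number of decimals fixed even if the underlying floats are not already rounded.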

Answer 1: 1 accepted, 2021-06-03 09:28:40
