How to combine columns in pandas pivot table?
I have a pivot table with 3 levels of columns. For each unique mean and std pair, I want to combine them into a str
f"{x.mean}({x.std})"
replacing the mean and std columns with the new mean_std_str column.
Here's a print of the dataframe:
rescore_func asp chemscore ... goldscore plp
tag best first best ... first best first
mean std mean std mean ... std mean std mean std
dock_func ...
asp 65.2 0.7 34.5 2.4 64.0 ... 0.0 64.4 0.7 37.9 0.7
chemscore 59.1 2.0 29.5 2.0 58.0 ... 1.7 58.7 0.7 40.9 2.3
goldscore 68.9 1.7 34.8 4.3 69.7 ... 1.3 68.9 1.3 46.2 0.7
plp 69.3 1.1 35.2 2.0 69.7 ... 2.0 68.9 2.4 39.4 2.9
[4 rows x 16 columns]
Desired output:
rescore_func  asp
tag           best       first      ...
              mean_std   mean_std   ...
dock_func                           ...
asp           65.2(0.7)  34.5(2.4)  ...
chemscore     59.1(2.0)  29.5(2.0)  ...
goldscore     68.9(1.7)  34.8(4.3)  ...
plp           69.3(1.1)  35.2(2.0)  ...
[4 rows x 16 columns]
So far I have:
df = df.melt(ignore_index=False).reset_index()
df = df.rename(columns=str).rename(columns={'None':'descr'})
which gives:
    dock_func  rescore_func  tag    descr  value
0   asp        asp           best   mean   65.2
1   chemscore  asp           best   mean   59.1
2   goldscore  asp           best   mean   68.9
3   plp        asp           best   mean   69.3
4   asp        asp           best   std    0.7
..  ...        ...           ...    ...    ...
59  plp        plp           first  mean   39.4
60  asp        plp           first  std    0.7
61  chemscore  plp           first  std    2.3
62  goldscore  plp           first  std    0.7
63  plp        plp           first  std    2.9
[64 rows x 5 columns]
I am stuck on how to group the means and stds together before re-pivoting the data...
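One possible way to finish the melt-based route, as a minimal sketch: pivot the long frame so that mean and std sit side by side, build the string, then unstack back to wide form. The small frame below is made up for illustration, and it assumes the third column level is named descr up front (in the question it is unnamed and renamed after melting).

```python
import numpy as np
import pandas as pd

# Illustrative data only: 2 docking functions, 2 rescoring functions, 2 tags.
index = pd.Index(["asp", "plp"], name="dock_func")
columns = pd.MultiIndex.from_product(
    [["asp", "plp"], ["best", "first"], ["mean", "std"]],
    names=["rescore_func", "tag", "descr"],  # "descr" is an assumed level name
)
df = pd.DataFrame(np.arange(16).reshape(2, 8) / 10, index=index, columns=columns)

# Long format, as in the question.
long_df = df.melt(ignore_index=False).reset_index()

# Spread "descr" into columns so each row holds one (mean, std) pair.
pairs = long_df.pivot(
    index=["dock_func", "rescore_func", "tag"], columns="descr", values="value"
)

# Build the "mean(std)" string with one decimal place.
mean_std = (
    pairs["mean"].map("{:.1f}".format) + "(" + pairs["std"].map("{:.1f}".format) + ")"
)

# Move rescore_func and tag back into the columns.
result = mean_std.unstack(["rescore_func", "tag"])
print(result)
```

Passing a list to the index argument of pivot needs a reasonably recent pandas (1.1 or later, as far as I know).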
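DataFrame.reorder_levels will make it easy for you: move the mean/std level to the top of the column MultiIndex, so that df["mean"] and df["std"] each select a frame that keeps only the other two levels.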
Here is some sample data:
import numpy as np
import pandas as pd
index = pd.Index(["asp", "chemscore", "goldscore", "plp"], name="dock_func")
columns = pd.MultiIndex.from_product(
    [index, pd.Index(["best", "fisrt"], name="tag"), ("mean", "std")]
)
df = pd.DataFrame(
    np.random.random(size=(4, 16)),
    index=index,
    columns=columns,
).round(1)
df looks like:
dock_func  asp                 chemscore           goldscore           plp
tag        best      fisrt     best      fisrt     best      fisrt     best      fisrt
           mean std  mean std  mean std  mean std  mean std  mean std  mean std  mean std
dock_func
asp        0.5  0.6  0.4  0.2  0.7  0.7  0.8  0.1  0.2  0.5  0.6  0.7  0.5  0.2  0.2  0.7
chemscore  0.0  0.7  0.9  0.2  0.3  0.3  0.4  0.8  0.3  0.4  0.2  0.8  0.5  0.5  0.4  0.2
goldscore  0.5  0.7  0.8  0.0  0.2  0.8  0.1  0.2  0.6  0.1  0.4  0.2  0.8  0.2  0.8  0.3
plp        1.0  0.6  0.6  0.8  0.8  0.6  0.3  1.0  0.7  0.2  0.8  0.2  0.2  0.2  0.7  0.2
Then just run the following:
df = df.reorder_levels([2, 0, 1], axis=1).astype(str)
df = df["mean"] + "(" + df["std"] + ")"
and df is now:
dock_func  asp                 chemscore           goldscore           plp
tag        best      fisrt     best      fisrt     best      fisrt     best      fisrt
dock_func
asp        0.5(0.6)  0.4(0.2)  0.7(0.7)  0.8(0.1)  0.2(0.5)  0.6(0.7)  0.5(0.2)  0.2(0.7)
chemscore  0.0(0.7)  0.9(0.2)  0.3(0.3)  0.4(0.8)  0.3(0.4)  0.2(0.8)  0.5(0.5)  0.4(0.2)
goldscore  0.5(0.7)  0.8(0.0)  0.2(0.8)  0.1(0.2)  0.6(0.1)  0.4(0.2)  0.8(0.2)  0.8(0.3)
plp        1.0(0.6)  0.6(0.8)  0.8(0.6)  0.3(1.0)  0.7(0.2)  0.8(0.2)  0.2(0.2)  0.7(0.2)
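If the values are not already rounded, or you want to keep a third column level as in the desired output, here is a rough variant of the same idea. It selects each statistic with DataFrame.xs instead of reordering levels, formats to one decimal place, and then adds a level back. The sample frame, the seed, and the label mean_std are my own assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative frame with the same 3-level column layout (values are random).
names = ["asp", "chemscore", "goldscore", "plp"]
index = pd.Index(names, name="dock_func")
columns = pd.MultiIndex.from_product(
    [names, ["best", "first"], ["mean", "std"]],
    names=["rescore_func", "tag", None],
)
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((4, 16)) * 100, index=index, columns=columns)

# Pick each statistic from the last column level; xs drops that level.
mean = df.xs("mean", axis=1, level=2)
std = df.xs("std", axis=1, level=2)

# Format every value to one decimal place, column by column.
def one_decimal(col):
    return col.map("{:.1f}".format)

combined = mean.apply(one_decimal) + "(" + std.apply(one_decimal) + ")"

# Add a third level back; "mean_std" is an assumed label.
combined.columns = pd.MultiIndex.from_tuples(
    [(*col, "mean_std") for col in combined.columns],
    names=[*combined.columns.names, None],
)
print(combined)
```

Formatting explicitly keeps trailing zeros such as 2.0 and does not depend on the data having been rounded beforehand.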