Pandas - Pivot table while merging with two columns
I have a table with two categorical columns and several numerical ones. I want to pivot the table on a given index (the client, in this example) while merging the two categorical columns into the names of the numerical ones. Is there a way to do this without a loop?
This is an example table:
Client | Project | Case | Spending | value_mean | value_std |
---|---|---|---|---|---|
MisterA | Project_15 | 4 | 100122 | 9655.566667 | 12.70499327 |
MisterA | Project_15 | 2 | 4015 | 9653 | 17.66352173 |
MisterA | Project_15 | 7 | 8953 | 94.4 | 2.065591118 |
MisterA | Project_16 | 1 | 3922 | 65519.4 | 0.894427191 |
MisterA | Project_16 | 6 | 8953 | 21093.5 | 17816.03006 |
MisterA | Project_16 | 7 | 8953 | 30665.3 | 30643.27374 |
MisterA | Project_16 | 2 | 4015 | 65517.8 | 1.788854382 |
MisterA | Project_16 | 4 | 100122 | 65518.86667 | 1.153402392 |
MisterA | Project_16 | 5 | 3109 | 65519 | 1.632993162 |
MisterA | Project_18 | 4 | 100122 | 78.84444444 | 16.89884719 |
MisterA | Project_18 | 5 | 3109 | 5820 | 6735.594059 |
MisterB | Project_15 | 7 | 9063 | 94.6 | 1.646545205 |
MisterB | Project_15 | 2 | 4015 | 9636 | 14.38749457 |
MisterB | Project_15 | 6 | 8968 | 93.6 | 1.264911064 |
MisterB | Project_16 | 5 | 4016 | 65519 | 1.414213562 |
MisterB | Project_16 | 6 | 8968 | 22375.3 | 16701.95844 |
MisterB | Project_16 | 7 | 9063 | 36482.5 | 31091.74401 |
MisterB | Project_16 | 4 | 98966 | 65518.78 | 1.133333333 |
MisterB | Project_18 | 1 | 2906 | 79.5 | 1.914854216 |
MisterB | Project_18 | 5 | 4016 | 6257 | 6399.977109 |
MisterB | Project_18 | 6 | 8968 | 13304.3 | 52.38330947 |
MisterB | Project_18 | 2 | 4015 | 78.8 | 1.095445115 |
I want something like this:
Client | Project_15_Case_4_Spending | Project_15_Case_4_value_mean | Project_15_Case_4_value_std | Project_15_Case_2_Spending | Project_15_Case_2_value_mean | Project_15_Case_2_value_std | ... |
---|---|---|---|---|---|---|---|
MisterA | 100122 | 9655.566667 | 12.70499327 | 4015 | 9653 | 17.66352173 | ... |
MisterB | 9063 | 94.6 | 1.646545205 | 4015 | 9636 | 14.38749457 | ... |
Thank you for your help.
One way is to use `pivot_table` and then rename the columns. You can pass `fill_value=0` if required.
df1 = df.pivot_table(index='Client', columns=['Project', 'Case'], values=[
'Spending', 'value_mean', 'value_std'])
df1.columns = [f'{j}_Case_{k}_{i}' for i, j, k in df1.columns]
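As a self-contained illustration of the `pivot_table` approach, here is a minimal sketch on four rows copied from the question's table (a subset chosen for brevity, not the full data). The name `wide` is hypothetical, and `fill_value=0` is included only to show where the option goes.

```python
import pandas as pd

# Four rows copied from the question's table (a subset, for brevity).
df = pd.DataFrame({
    "Client": ["MisterA", "MisterA", "MisterB", "MisterB"],
    "Project": ["Project_15", "Project_15", "Project_15", "Project_16"],
    "Case": [4, 2, 2, 5],
    "Spending": [100122, 4015, 4015, 4016],
    "value_mean": [9655.566667, 9653.0, 9636.0, 65519.0],
    "value_std": [12.70499327, 17.66352173, 14.38749457, 1.414213562],
})

# One row per Client; Project and Case move into the column labels.
# fill_value=0 replaces missing Client/Project/Case combinations with 0.
wide = df.pivot_table(
    index="Client",
    columns=["Project", "Case"],
    values=["Spending", "value_mean", "value_std"],
    fill_value=0,
)

# Each column label is a (metric, Project, Case) tuple; flatten it to one string.
wide.columns = [
    f"{project}_Case_{case}_{metric}" for metric, project, case in wide.columns
]
wide = wide.reset_index()
print(wide)
```

Two things to keep in mind: `pivot_table` aggregates with the mean by default, so duplicate Client/Project/Case rows would be averaged rather than raising an error, and it sorts the column labels, so the resulting column order does not follow the row order of the input.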
Use `.stack()` + `.unstack()`: this keeps the columns in the order in which the rows appear in the original table, with the three value columns grouped together for each Project/Case pair, as in the expected output:
df2 = df.set_index(['Client', 'Project', 'Case']).stack().unstack([1,2]).unstack()
df2.columns = df2.columns.map(lambda x: f'{x[0]}_{x[1]}_{x[2]}')
df2 = df2.reset_index()
Result:
print(df2)
Client Project_15_4_Spending Project_15_4_value_mean Project_15_4_value_std Project_15_2_Spending Project_15_2_value_mean Project_15_2_value_std Project_15_7_Spending Project_15_7_value_mean Project_15_7_value_std Project_16_1_Spending Project_16_1_value_mean Project_16_1_value_std Project_16_6_Spending Project_16_6_value_mean Project_16_6_value_std Project_16_7_Spending Project_16_7_value_mean Project_16_7_value_std Project_16_2_Spending Project_16_2_value_mean Project_16_2_value_std Project_16_4_Spending Project_16_4_value_mean Project_16_4_value_std Project_16_5_Spending Project_16_5_value_mean Project_16_5_value_std Project_18_4_Spending Project_18_4_value_mean Project_18_4_value_std Project_18_5_Spending Project_18_5_value_mean Project_18_5_value_std Project_15_6_Spending Project_15_6_value_mean Project_15_6_value_std Project_18_1_Spending Project_18_1_value_mean Project_18_1_value_std Project_18_6_Spending Project_18_6_value_mean Project_18_6_value_std Project_18_2_Spending Project_18_2_value_mean Project_18_2_value_std
0 MisterA 100122.0 9655.566667 12.704993 4015.0 9653.0 17.663522 8953.0 94.4 2.065591 3922.0 65519.4 0.894427 8953.0 21093.5 17816.03006 8953.0 30665.3 30643.27374 4015.0 65517.8 1.788854 100122.0 65518.86667 1.153402 3109.0 65519.0 1.632993 100122.0 78.844444 16.898847 3109.0 5820.0 6735.594059 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 MisterB NaN NaN NaN 4015.0 9636.0 14.387495 9063.0 94.6 1.646545 NaN NaN NaN 8968.0 22375.3 16701.95844 9063.0 36482.5 31091.74401 NaN NaN NaN 98966.0 65518.78000 1.133333 4016.0 65519.0 1.414214 NaN NaN NaN 4016.0 6257.0 6399.977109 8968.0 93.6 1.264911 2906.0 79.5 1.914854 8968.0 13304.3 52.383309 4015.0 78.8 1.095445
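If the column order matters and you would rather not depend on how `stack`/`unstack` happen to order the labels, one alternative is to build the order explicitly. The following is only a sketch, not the method used in the answer above: it runs on four rows copied from the question, the names `metrics`, `pairs`, `ordered` and `wide` are hypothetical, and it assumes each Client/Project/Case combination occurs at most once. It also uses the `_Case_` naming requested in the question.

```python
import pandas as pd

# Four rows copied from the question's table (a subset, for brevity).
df = pd.DataFrame({
    "Client": ["MisterA", "MisterA", "MisterB", "MisterB"],
    "Project": ["Project_15", "Project_15", "Project_15", "Project_15"],
    "Case": [4, 2, 2, 6],
    "Spending": [100122, 4015, 4015, 8968],
    "value_mean": [9655.566667, 9653.0, 9636.0, 93.6],
    "value_std": [12.70499327, 17.66352173, 14.38749457, 1.264911064],
})

metrics = ["Spending", "value_mean", "value_std"]

# Unique (Project, Case) pairs in order of first appearance in the rows.
pairs = df[["Project", "Case"]].drop_duplicates().itertuples(index=False, name=None)

# Target order: for each pair, the three metrics in their original order.
ordered = pd.MultiIndex.from_tuples(
    [(project, case, metric) for project, case in pairs for metric in metrics]
)

# Unstacking yields (metric, Project, Case) column labels; move the metric
# to the innermost level, then reindex into the explicit order built above.
wide = (
    df.set_index(["Client", "Project", "Case"])[metrics]
    .unstack(["Project", "Case"])
    .reorder_levels([1, 2, 0], axis=1)
    .reindex(columns=ordered)
)
wide.columns = [f"{project}_Case_{case}_{metric}" for project, case, metric in wide.columns]
wide = wide.reset_index()
print(wide)
```

Combinations a client does not have (for example MisterB with Project_15, Case 4) come out as `NaN`, the same as in the result shown above.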