Flattening a cross table in a pandas DataFrame with a multi-level index

Question

I have an Excel file which looks like this:

+-------+-------+-------+-------+-------+-------+
|       | Cat1  | Cat1  | Cat1  | Cat1  | Cat1  |
+-------+-------+-------+-------+-------+-------+
|       | Type1 | Type1 | Type1 | Type1 | Type2 |
+-------+-------+-------+-------+-------+-------+
|       | 2018  | 2018  | 2018  | 2018  | 2018  |
+-------+-------+-------+-------+-------+-------+
| Name  | 1Q    | 2Q    | 3Q    | 4Q    | 1Q    |
+-------+-------+-------+-------+-------+-------+
| Name1 | 1     | 5     | 3     | 5     | 4     |
+-------+-------+-------+-------+-------+-------+
| Name2 | 3     | 23    | 4     | 2     | 4     |
+-------+-------+-------+-------+-------+-------+
| Name3 | 4     | 3     | 5     | 3     | 44    |
+-------+-------+-------+-------+-------+-------+
| Name4 | 3     | 6     | 5     | 4     | 2     |
+-------+-------+-------+-------+-------+-------+

...and so on.

I want to reshape it so that it looks like this:

+-------+------+-------+------+---------+-------+
| Name  | Cat  | Type  | Year | Quarter | Value |
+-------+------+-------+------+---------+-------+
| Name1 | Cat1 | Type1 | 2018 | 1Q      | 1     |
+-------+------+-------+------+---------+-------+
| Name1 | Cat1 | Type1 | 2018 | 2Q      | 5     |
+-------+------+-------+------+---------+-------+
| Name1 | Cat1 | Type1 | 2018 | 3Q      | 3     |
+-------+------+-------+------+---------+-------+
| Name1 | Cat1 | Type1 | 2018 | 4Q      | 5     |
+-------+------+-------+------+---------+-------+
| Name1 | Cat1 | Type2 | 2018 | 1Q      | 4     |
+-------+------+-------+------+---------+-------+

I've loaded it into a pandas DataFrame and am unsure how to proceed. Is it melt, stack, unstack, MultiIndex...?

Answer 1

Use stack:

print(df.columns)
MultiIndex(levels=[['Cat1'], ['Type1', 'Type2'], ['2018'], ['1Q', '2Q', '3Q', '4Q']],
           labels=[[0, 0, 0, 0, 0], [0, 0, 0, 0, 1], [0, 0, 0, 0, 0], [0, 1, 2, 3, 0]])


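# Move all four column levels into the row index, then turn the index into columns
df = df.stack([0, 1, 2, 3]).reset_index()
# Name the flattened columns
df.columns = ['Name', 'Cat', 'Type', 'Year', 'Quarter', 'Value']
print(df)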
     Name   Cat   Type  Year Quarter  Value
0   Name1  Cat1  Type1  2018      1Q    1.0
1   Name1  Cat1  Type1  2018      2Q    5.0
2   Name1  Cat1  Type1  2018      3Q    3.0
3   Name1  Cat1  Type1  2018      4Q    5.0
4   Name1  Cat1  Type2  2018      1Q    4.0
5   Name2  Cat1  Type1  2018      1Q    3.0
6   Name2  Cat1  Type1  2018      2Q   23.0
7   Name2  Cat1  Type1  2018      3Q    4.0
8   Name2  Cat1  Type1  2018      4Q    2.0
9   Name2  Cat1  Type2  2018      1Q    4.0
10  Name3  Cat1  Type1  2018      1Q    4.0
11  Name3  Cat1  Type1  2018      2Q    3.0
12  Name3  Cat1  Type1  2018      3Q    5.0
13  Name3  Cat1  Type1  2018      4Q    3.0
14  Name3  Cat1  Type2  2018      1Q   44.0
15  Name4  Cat1  Type1  2018      1Q    3.0
16  Name4  Cat1  Type1  2018      2Q    6.0
17  Name4  Cat1  Type1  2018      3Q    5.0
18  Name4  Cat1  Type1  2018      4Q    4.0
19  Name4  Cat1  Type2  2018      1Q    2.0
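
For anyone who wants to try this without the Excel file, here is a minimal sketch that rebuilds the sample table in memory and flattens it. The level names (Cat, Type, Year, Quarter), the index name (Name), and the variable names `wide`, `stacked` and `melted` are my own assumptions, not part of the question. Naming the levels up front means `reset_index` produces the wanted headers directly. Since the question also asks about melt, the sketch includes a `melt` variant that should give the same rows. Row order and the dtype of Value (int or float) can vary between pandas versions, so both results are sorted before comparing.

```python
import pandas as pd

# Rebuild the sample table from the question.
# The level names and the index name are assumptions made for this sketch.
columns = pd.MultiIndex.from_arrays(
    [
        ["Cat1"] * 5,
        ["Type1"] * 4 + ["Type2"],
        ["2018"] * 5,
        ["1Q", "2Q", "3Q", "4Q", "1Q"],
    ],
    names=["Cat", "Type", "Year", "Quarter"],
)
wide = pd.DataFrame(
    [[1, 5, 3, 5, 4], [3, 23, 4, 2, 4], [4, 3, 5, 3, 44], [3, 6, 5, 4, 2]],
    index=pd.Index(["Name1", "Name2", "Name3", "Name4"], name="Name"),
    columns=columns,
)

keys = ["Name", "Cat", "Type", "Year", "Quarter"]

# Stacking every column level leaves a Series indexed by Name plus the four
# former column levels; reset_index(name=...) turns it into a flat DataFrame.
stacked = (
    wide.stack(list(wide.columns.names))
    .reset_index(name="Value")
    .sort_values(keys, ignore_index=True)
)

# melt() reaches the same long format directly from the column labels.
# ignore_index=False keeps the Name index so it can be reset into a column.
melted = (
    wide.melt(ignore_index=False, value_name="Value")
    .reset_index()
    .sort_values(keys, ignore_index=True)
)

print(stacked)
```

With named levels, the manual `df.columns = [...]` step from the answer is not needed. Recent pandas releases may emit a FutureWarning for the older `stack` implementation; the result has the same 20 rows either way.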

Answer 1
0 2017-10-19 12:27:30
