Pandas pivot table to dataframe

Question

I have a pivot table (pt) that looks like this:

+---------+------------+-------+----+
|         | ZY         | z     |  y |  
+---------+------------+-------+----+
| period_s| ZONE       |       |    | 
+---------+------------+-------+----+
| 201901  | A          | 14    | 34 |
|         | B          | 232   | 9  |
|         | C          | 12    | 2  |
+---------+------------+-------+----+
| 201902  | A          | 196   | 70 |
|         | K          | 10    | 1  |
|         | D          | 313   | 99 |
+---------+------------+-------+----+

It was built from a dataframe (df) with the following code:

pt=df.pivot_table(index=['period_s','ZONE'], columns='ZY', values='ID', aggfunc="count")

where the ZY field has two classes, z and y.

I tried using

df = table.reset_index()

and also

df.columns = df.columns.droplevel(0) #remove amount
df.columns.name = None               #remove categories
df = df.reset_index()

as suggested in "Transform pandas pivot table to regular dataframe" and in "Convert pivot tables to dataframe".

I would like to get a dataframe like this:

+---------+-------+------------+----------+
| period_s| ZONE  |    z       | y        |
+---------+-------+------------+----------+
|  201901 |     A | 14         |       34 |
|  201901 |     B | 232        |        9 |
|  201901 |     C | 12         |        2 |
|  201902 |     A | 196        |       70 |
|  201902 |     K | 10         |        1 |
|  201902 |     D | 313        |       99 |
+---------+-------+------------+----------+

Answer 1

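This is a bit late, but I think getting rid of pt.columns.name (i.e. "ZY") and resetting the index will return the expected output. This can be written as a method chain: set_axis() or rename_axis() gets rid of columns.name, and reset_index() turns period_s and ZONE into columns.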

pt.set_axis(pt.columns.tolist(), axis=1).reset_index()
#pt.rename_axis(None, axis=1).reset_index()

A more direct way is to call reset_index() and then explicitly remove columns.name.

pt.reset_index(inplace=True)
pt.columns.name = None
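
To see why clearing columns.name is all that is needed here, it may help to inspect the pivot table's axes. The sketch below uses a small made-up frame (the values are hypothetical, not the asker's data). Because values is a single column name, the columns should be a flat Index named "ZY", so there is no extra level for droplevel(0) to remove.

```python
import pandas as pd

# Hypothetical miniature data, only for illustration.
df = pd.DataFrame({
    "period_s": [201901, 201901, 201901, 201902],
    "ZONE": ["A", "A", "B", "A"],
    "ZY": ["z", "y", "z", "y"],
    "ID": [1, 2, 3, 4],
})
pt = df.pivot_table(index=["period_s", "ZONE"], columns="ZY",
                    values="ID", aggfunc="count")

# The columns are a single-level Index whose name is "ZY".
print(pt.columns.nlevels)    # 1
print(pt.columns.name)       # ZY
# The row index is a two-level MultiIndex.
print(list(pt.index.names))  # ['period_s', 'ZONE']

# Clear the columns name, then move the index levels into columns.
flat = pt.rename_axis(None, axis=1).reset_index()
print(flat.columns.tolist())  # ['period_s', 'ZONE', 'y', 'z']
print(flat.columns.name)      # None
```

Combinations that never occur in the data (for example zone B with class y above) show up as NaN in the count table.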

A reproducible example:

import numpy as np
import pandas as pd
df = pd.DataFrame({'period_s': np.random.choice([201901, 201902], size=100),
                   'ZONE': np.random.choice([*'ABC'], size=100),
                   'ZY': np.random.choice([*'zy'], size=100),
                   'ID': np.arange(100)})
pt = df.pivot_table(index=['period_s','ZONE'], columns='ZY', values='ID', aggfunc="count")

# output
pt.set_axis(pt.columns.tolist(), axis=1).reset_index()
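
As a rough check, the three variants mentioned in this answer can be compared on similar random data. This is only a sketch: the seed and the use of np.random.default_rng are my own choices for repeatability, not part of the original example. It also shows one way to get the z, y column order from the question, since pivot_table sorts the column labels alphabetically.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary seed, assumed for repeatability
df = pd.DataFrame({
    "period_s": rng.choice([201901, 201902], size=100),
    "ZONE": rng.choice(list("ABC"), size=100),
    "ZY": rng.choice(list("zy"), size=100),
    "ID": np.arange(100),
})
pt = df.pivot_table(index=["period_s", "ZONE"], columns="ZY",
                    values="ID", aggfunc="count")

# Variant 1: replace the column labels with a plain list (drops the name).
a = pt.set_axis(pt.columns.tolist(), axis=1).reset_index()
# Variant 2: clear the columns name directly.
b = pt.rename_axis(None, axis=1).reset_index()
# Variant 3: reset the index first, then clear the name on the copy.
c = pt.reset_index()
c.columns.name = None

# All three hold the same labels and values.
assert a.equals(b) and b.equals(c)

# Select the columns explicitly to get the order asked for in the question.
out = a[["period_s", "ZONE", "z", "y"]]
print(out.head())
```

Variant 3 here works on a copy rather than using inplace=True, so pt itself stays a pivot table and can be reused.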
