Pandas: transpose, group and summarize columns

Question

I have a pandas DataFrame which looks like this:

| Id | Filter 1 | Filter 2 | Filter 3 |
|----|----------|----------|----------|
| 25 | 0        | 1        | 1        |
| 25 | 1        | 0        | 1        |
| 25 | 0        | 0        | 1        |
| 30 | 1        | 0        | 1        |
| 31 | 1        | 0        | 1        |
| 31 | 0        | 1        | 0        |
| 31 | 0        | 0        | 1        |

I need to transpose this table, add a "Name" column holding the name of each filter, and sum the values of each filter column per Id. The result table should look like this:

| Id | Name     | Sum  |
|----|----------|------|
| 25 | Filter 1 | 1    |
| 25 | Filter 2 | 1    |
| 25 | Filter 3 | 3    |
| 30 | Filter 1 | 1    |
| 30 | Filter 2 | 0    |
| 30 | Filter 3 | 1    |
| 31 | Filter 1 | 1    |
| 31 | Filter 2 | 1    |
| 31 | Filter 3 | 2    |

The only solution I have come up with so far is to use apply on the data grouped by the Id column, but this method is too slow for my case: the dataset can have more than 40 columns and 50,000 rows. How can I do this with native pandas methods (e.g. pivot, transpose, groupby)?
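
For reference, the sample table above can be reproduced with the short sketch below. The variable name `df` is an assumption here, chosen because the answers refer to the frame by that name.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Id":       [25, 25, 25, 30, 31, 31, 31],
    "Filter 1": [0, 1, 0, 1, 1, 0, 0],
    "Filter 2": [1, 0, 0, 0, 0, 1, 0],
    "Filter 3": [1, 1, 1, 1, 1, 0, 1],
})

print(df)
```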

Answer 1

Use:

df_new = (
    df.melt('Id', var_name='Name', value_name='Sum')
      .groupby(['Id', 'Name'])['Sum'].sum()
      .reset_index()
)
print(df_new)

   Id      Name  Sum
0  25  Filter 1    1
1  25  Filter 2    1
2  25  Filter 3    3
3  30  Filter 1    1
4  30  Filter 2    0
5  30  Filter 3    1
6  31  Filter 1    1
7  31  Filter 2    1
8  31  Filter 3    2
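
To make the two stages of this answer easier to follow, here is a minimal, self-contained sketch that splits the chain into a reshape step and an aggregation step. The intermediate name `long_df` is only illustrative, and the sample data is copied from the question.

```python
import pandas as pd

df = pd.DataFrame({
    "Id":       [25, 25, 25, 30, 31, 31, 31],
    "Filter 1": [0, 1, 0, 1, 1, 0, 0],
    "Filter 2": [1, 0, 0, 0, 0, 1, 0],
    "Filter 3": [1, 1, 1, 1, 1, 0, 1],
})

# Step 1: wide -> long. Every filter column becomes rows of (Id, Name, Sum).
long_df = df.melt(id_vars="Id", var_name="Name", value_name="Sum")

# Step 2: add up the flags for each (Id, Name) pair and keep flat columns.
df_new = long_df.groupby(["Id", "Name"], as_index=False)["Sum"].sum()

print(df_new)
```

Note that groupby sorts the `Name` keys as strings, so with ten or more filters "Filter 10" would be listed before "Filter 2".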

Answer 2

stack, then groupby

df.set_index('Id').stack().groupby(level=[0,1]).sum().reset_index()
   Id   level_1  0
0  25  Filter 1  1
1  25  Filter 2  1
2  25  Filter 3  3
3  30  Filter 1  1
4  30  Filter 2  0
5  30  Filter 3  1
6  31  Filter 1  1
7  31  Filter 2  1
8  31  Filter 3  2

Short version

df.groupby('Id').sum().stack()  # df.set_index('Id').sum(level=0).stack() only works before pandas 2.0, which removed the level argument
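
The short version returns a Series with an unnamed second index level. One possible way to get the `Id` / `Name` / `Sum` layout requested in the question is sketched below; the column names `Name` and `Sum` are taken from the question, and the variable `df_short` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Id":       [25, 25, 25, 30, 31, 31, 31],
    "Filter 1": [0, 1, 0, 1, 1, 0, 0],
    "Filter 2": [1, 0, 0, 0, 0, 1, 0],
    "Filter 3": [1, 1, 1, 1, 1, 0, 1],
})

df_short = (
    df.groupby("Id").sum()            # one row per Id, one column per filter
      .rename_axis(columns="Name")    # name the column axis so the stacked level is labelled
      .stack()                        # Series indexed by (Id, Name)
      .reset_index(name="Sum")        # back to flat columns Id, Name, Sum
)

print(df_short)
```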

Answer 3

Using filter and melt

df.filter(like='Filter').groupby(df.Id).sum().T.reset_index().melt(id_vars='index')

    index       Id  value
0   Filter 1    25  1
1   Filter 2    25  1
2   Filter 3    25  3
3   Filter 1    30  1
4   Filter 2    30  0
5   Filter 3    30  1
6   Filter 1    31  1
7   Filter 2    31  1
8   Filter 3    31  2
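
The output above uses the default column names `index` and `value`. As a hedged variation on the same idea (not part of the original answer), the sketch below names the columns explicitly and reorders them to match the layout asked for in the question; `out` is an illustrative variable name.

```python
import pandas as pd

df = pd.DataFrame({
    "Id":       [25, 25, 25, 30, 31, 31, 31],
    "Filter 1": [0, 1, 0, 1, 1, 0, 0],
    "Filter 2": [1, 0, 0, 0, 0, 1, 0],
    "Filter 3": [1, 1, 1, 1, 1, 0, 1],
})

out = (
    df.filter(like="Filter")        # keep only the filter columns
      .groupby(df["Id"]).sum()      # sum each filter per Id
      .T                            # filters become rows, Ids become columns
      .rename_axis("Name")          # label the filter index before resetting it
      .reset_index()
      .melt(id_vars="Name", var_name="Id", value_name="Sum")
      [["Id", "Name", "Sum"]]       # reorder to the requested layout
)

print(out)
```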

3 solutions

Solution 1
2 Accepted 2019-05-12 16:38:08

Solution 2
1 2019-05-12 16:38:48

Solution 3
0 2019-05-12 16:42:40
