How to apply an aggregate function to all columns of a pivot table in Pandas

Question

I have a pivot table that counts the monthly occurrences of a phenomenon. Here is the simplified sample data, followed by the pivot:

+--------+------------+------------+
| ad_id  | entreprise | date       |
+--------+------------+------------+
| 172788 | A          | 2020-01-28 |
| 172931 | A          | 2020-01-26 |
| 172793 | B          | 2020-01-26 |
| 172768 | C          | 2020-01-19 |
| 173219 | C          | 2020-01-14 |
| 173213 | D          | 2020-01-13 |
+--------+------------+------------+

My pivot_table code is the following:

my_pivot_table = pd.pivot_table(df[(df['date'] >= some_date) & (df['date'] <= some_other_date)],
                                values=['ad_id'], index=['entreprise'],
                                columns=['year', 'month'], aggfunc=['count'])
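
For anyone wanting to reproduce this, here is a minimal sketch built on the sample rows above. It assumes that the `year` and `month` columns are derived from `date` (the sample data does not show them) and uses made-up values for `some_date` and `some_other_date`. It also passes `values` and `aggfunc` as plain strings rather than lists, which, as far as I can tell, keeps the column MultiIndex to just the two levels (year, month) instead of adding extra `count` and `ad_id` levels on top.

```python
import pandas as pd

# Sample rows from the question
df = pd.DataFrame({
    "ad_id": [172788, 172931, 172793, 172768, 173219, 173213],
    "entreprise": ["A", "A", "B", "C", "C", "D"],
    "date": ["2020-01-28", "2020-01-26", "2020-01-26",
             "2020-01-19", "2020-01-14", "2020-01-13"],
})
df["date"] = pd.to_datetime(df["date"])

# Assumption: year and month are derived from the date column
df["year"] = df["date"].dt.year
df["month"] = df["date"].dt.month

# Hypothetical date bounds, chosen only for illustration
some_date = pd.Timestamp("2020-01-01")
some_other_date = pd.Timestamp("2020-01-31")

# Keep rows whose date falls inside the (inclusive) range
in_range = df["date"].between(some_date, some_other_date)

# Count ads per entreprise for each (year, month) pair
my_pivot_table = pd.pivot_table(
    df[in_range],
    values="ad_id",
    index="entreprise",
    columns=["year", "month"],
    aggfunc="count",
)
print(my_pivot_table)
```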

The resulting table looks like this:

+-------------+---------+----------+-----+----------+
|             |  2018   |          |     |          |
+-------------+---------+----------+-----+----------+
| entreprise  | january | february | ... | december |
| A           | 12      | 10       | ... | 8        |
| B           | 24      | 12       | ... | 3        |
| ...         | ...     | ...      | ... | ...      |
| D           | 31      | 18       | ... | 24       |
+-------------+---------+----------+-----+----------+

Now I would like to add a column that gives the monthly average, and to perform other operations such as comparing last month's count to the monthly average of, say, the last 12 months.

I tried fiddling with the aggfunc parameter of pivot_table, and also tried adding an average column to the original dataframe, but without success.

Thanks in advance!

Answer 1

Because pivot_table returns a table with MultiIndex columns, you can average over one level of the columns (df below stands for the pivoted table):

df1 = df.mean(axis=1, level=0)
df1.columns = pd.MultiIndex.from_product([df1.columns, ['mean']])

Or:

df2 = df.mean(axis=1, level=1)
df2.columns = pd.MultiIndex.from_product([['all_years'], df2.columns])
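
As far as I know, the `level` argument of `DataFrame.mean` was deprecated in pandas 1.3 and removed in 2.0, so the two snippets above may fail on a recent install. The sketch below shows one possible equivalent that transposes, groups the rows by level, and transposes back. The counts are made up purely to illustrate the shape (two years, three months), and the names `counts`, `per_year`, `per_month`, `with_means`, `last_month`, `trailing_mean` and `ratio` are hypothetical. The last part is one way to approach the "last month versus trailing 12 months" comparison from the question, assuming the columns are in chronological order.

```python
import pandas as pd

# Made-up counts, only to illustrate the shape: columns are (year, month)
columns = pd.MultiIndex.from_product(
    [[2018, 2019], [1, 2, 3]], names=["year", "month"]
)
counts = pd.DataFrame(
    [[12, 10, 8, 6, 9, 12],
     [24, 12, 3, 9, 15, 6]],
    index=pd.Index(["A", "B"], name="entreprise"),
    columns=columns,
)

# Mean per year (first column level): transpose, group rows, transpose back
per_year = counts.T.groupby(level=0).mean().T
per_year.columns = pd.MultiIndex.from_product([per_year.columns, ["mean"]])

# Mean per month across all years (second column level)
per_month = counts.T.groupby(level=1).mean().T
per_month.columns = pd.MultiIndex.from_product([["all_years"], per_month.columns])

# Append the yearly means as extra columns next to the counts
with_means = pd.concat([counts, per_year], axis=1)

# Compare the most recent month with the mean of the trailing 12 columns
# (only 6 columns exist here, so the slice simply takes all of them)
last_month = counts.iloc[:, -1]
trailing_mean = counts.iloc[:, -12:].mean(axis=1)
ratio = last_month / trailing_mean

print(with_means)
print(ratio)
```

A ratio above 1 means the latest month is above the trailing average for that entreprise. Working on the transposed table avoids grouping along `axis=1`, which newer pandas versions also discourage.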

Solution 1
3 · Accepted · 2020-01-16 14:17:09
