Pandas Pivot Table List of Aggfunc
Pandas Pivot Table Dictionary of Agg function
I am trying to calculate three aggregate functions while pivoting.
This is the code:
n_page = (pd.pivot_table(Main_DF,
                         values='SPC_RAW_VALUE',
                         index=['ALIAS', 'SPC_PRODUCT', 'LABLE', 'RAW_PARAMETER_NAME'],
                         columns=['LOT_VIRTUAL_LINE'],
                         aggfunc={'N': 'count', 'Mean': np.mean, 'Sigma': np.std})
          .reset_index()
)
The error I am getting is:
KeyError: 'Mean'
How can I calculate those three functions?
The claim in the accepted answer by @Happy001 that aggfunc can't take a dict is false. We can actually pass a dict to aggfunc.
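As a minimal sketch of what this means for the question's code: the dict keys are looked up as column names, so they have to name the column being aggregated (here SPC_RAW_VALUE) rather than output labels such as 'Mean'. The toy DataFrame below is invented for illustration and uses only one index column; only the column names come from the question.

```python
import pandas as pd

# Hypothetical stand-in for Main_DF; the values are made up for illustration.
Main_DF = pd.DataFrame({
    "ALIAS": ["a1", "a1", "a1", "a1", "a2", "a2", "a2", "a2"],
    "LOT_VIRTUAL_LINE": ["L1", "L1", "L2", "L2", "L1", "L1", "L2", "L2"],
    "SPC_RAW_VALUE": [1.0, 3.0, 5.0, 7.0, 2.0, 4.0, 6.0, 10.0],
})

# The dict key is the value column; the list holds the functions to apply to it.
n_page = pd.pivot_table(
    Main_DF,
    values="SPC_RAW_VALUE",
    index=["ALIAS"],
    columns=["LOT_VIRTUAL_LINE"],
    aggfunc={"SPC_RAW_VALUE": ["count", "mean", "std"]},
)
print(n_page)
```

The result has hierarchical columns with the function name on top and the LOT_VIRTUAL_LINE value below it, so n_page["mean"] selects the table of means.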
A really handy feature is the ability to pass a dictionary to aggfunc so you can perform different functions on each of the values you select. For example:
import pandas as pd
import numpy as np

df = pd.read_excel('sales-funnel.xlsx')  # load the xlsx file
table = pd.pivot_table(df,
                       index=['Manager', 'Status'],
                       columns=['Product'],
                       values=['Quantity', 'Price'],
                       aggfunc={'Quantity': len, 'Price': [np.sum, np.mean]},
                       fill_value=0)
table
In the above code, I am passing a dictionary to aggfunc, performing the len operation on Quantity and the mean and sum operations on Price.
The output was attached to the original post as an image.
The example is taken from "pivot table explained".
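Since sales-funnel.xlsx is not available here, the following is a rough, self-contained variant of the same call. The rows are invented and only the column names match the example; string names ('sum', 'mean') are used in place of the NumPy functions, which is an equivalent spelling.

```python
import pandas as pd

# Invented rows standing in for sales-funnel.xlsx (assumption, not the real data).
df = pd.DataFrame({
    "Manager": ["Ann", "Ann", "Ann", "Bob", "Bob"],
    "Status": ["won", "won", "pending", "won", "won"],
    "Product": ["CPU", "CPU", "Software", "CPU", "Software"],
    "Quantity": [1, 2, 1, 3, 1],
    "Price": [100, 200, 50, 300, 40],
})

# A different set of functions per value column, keyed by column name.
table = pd.pivot_table(
    df,
    index=["Manager", "Status"],
    columns=["Product"],
    values=["Quantity", "Price"],
    aggfunc={"Quantity": len, "Price": ["sum", "mean"]},
    fill_value=0,
)
print(table)
```

Because values is a list here, the columns have three levels: value column, function name, and Product. Combinations that never occur, such as Ann / pending / CPU, are filled with 0.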
The aggfunc argument of pivot_table takes a function or a list of functions, but not a dict:
aggfunc : function, default numpy.mean, or list of functions. If list of functions passed, the resulting pivot table will have hierarchical columns whose top level are the function names (inferred from the function objects themselves)
So try:
n_page = (pd.pivot_table(Main_DF,
                         values='SPC_RAW_VALUE',
                         index=['ALIAS', 'SPC_PRODUCT', 'LABLE', 'RAW_PARAMETER_NAME'],
                         columns=['LOT_VIRTUAL_LINE'],
                         aggfunc=[len, np.mean, np.std])
          .reset_index()
)
You may want to rename the hierarchical columns afterwards.
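One possible way to do that renaming, sketched on made-up data with a single index column: relabel the top level to the names the question wanted (N, Mean, Sigma), then optionally flatten the two levels into plain strings. The underscore-joined naming scheme is just an illustrative choice.

```python
import pandas as pd

# Hypothetical stand-in for Main_DF; the values are made up for illustration.
Main_DF = pd.DataFrame({
    "ALIAS": ["a1", "a1", "a1", "a1", "a2", "a2", "a2", "a2"],
    "LOT_VIRTUAL_LINE": ["L1", "L1", "L2", "L2", "L1", "L1", "L2", "L2"],
    "SPC_RAW_VALUE": [1.0, 3.0, 5.0, 7.0, 2.0, 4.0, 6.0, 10.0],
})

n_page = pd.pivot_table(
    Main_DF,
    values="SPC_RAW_VALUE",
    index="ALIAS",
    columns="LOT_VIRTUAL_LINE",
    aggfunc=["count", "mean", "std"],
)

# Relabel the top (function-name) level of the column MultiIndex.
n_page = n_page.rename(columns={"count": "N", "mean": "Mean", "std": "Sigma"}, level=0)

# Optionally flatten (statistic, line) pairs into single names like "Mean_L1".
n_page.columns = [f"{stat}_{line}" for stat, line in n_page.columns]
n_page = n_page.reset_index()
print(n_page)
```

Flattening before reset_index keeps ALIAS as an ordinary column alongside names such as N_L1 and Sigma_L2.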
Try using groupby:
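df = (Main_DF
      .groupby(['ALIAS', 'SPC_PRODUCT', 'LABLE', 'RAW_PARAMETER_NAME'], as_index=False)
      # Named aggregation: renaming through a dict was removed in pandas 1.0
      .agg(N=('LOT_VIRTUAL_LINE', 'count'),
           Mean=('LOT_VIRTUAL_LINE', 'mean'),
           Sigma=('LOT_VIRTUAL_LINE', 'std'))
)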
Setting as_index=False just leaves these as columns in your dataframe, so you don't have to reset the index afterwards.
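If the goal is the same layout as the pivot in the question, a hedged variant of the groupby approach is to aggregate SPC_RAW_VALUE with LOT_VIRTUAL_LINE as one of the grouping keys and then reshape. The data below is invented, only one of the four index columns is used, and treating SPC_RAW_VALUE as the column to aggregate is an assumption based on the question's values= argument.

```python
import pandas as pd

# Hypothetical stand-in for Main_DF; the values are made up for illustration.
Main_DF = pd.DataFrame({
    "ALIAS": ["a1", "a1", "a1", "a1", "a2", "a2", "a2", "a2"],
    "LOT_VIRTUAL_LINE": ["L1", "L1", "L2", "L2", "L1", "L1", "L2", "L2"],
    "SPC_RAW_VALUE": [1.0, 3.0, 5.0, 7.0, 2.0, 4.0, 6.0, 10.0],
})

# Long format: one row per (ALIAS, LOT_VIRTUAL_LINE) with the three statistics.
stats = (
    Main_DF
    .groupby(["ALIAS", "LOT_VIRTUAL_LINE"], as_index=False)
    .agg(N=("SPC_RAW_VALUE", "count"),
         Mean=("SPC_RAW_VALUE", "mean"),
         Sigma=("SPC_RAW_VALUE", "std"))
)

# Wide format: spread LOT_VIRTUAL_LINE across the columns, as pivot_table would.
wide = stats.pivot(index="ALIAS", columns="LOT_VIRTUAL_LINE",
                   values=["N", "Mean", "Sigma"])
print(wide)
```

The long stats frame is often easier to filter and plot, while wide mirrors the pivot_table output with the statistic name as the top column level.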