
Pandas Pivot Table List of Aggfunc

Pandas Pivot Table Dictionary of Agg function

I am trying to calculate 3 aggregate functions while pivoting:

  1. Count
  2. Mean
  3. StDev

This is the code:

n_page = (pd.pivot_table(Main_DF, 
                         values='SPC_RAW_VALUE',  
                         index=['ALIAS', 'SPC_PRODUCT', 'LABLE', 'RAW_PARAMETER_NAME'], 
                         columns=['LOT_VIRTUAL_LINE'],
                         aggfunc={'N': 'count', 'Mean': np.mean, 'Sigma': np.std})
          .reset_index()
         )

The error I am getting is: KeyError: 'Mean'

How can I calculate those 3 functions?

The claim in the accepted answer by @Happy001 that aggfunc cannot take a dict is false. We can actually pass a dict to aggfunc.

A really handy feature is the ability to pass a dictionary to aggfunc so you can perform different functions on each of the values you select. For example:

import pandas as pd
import numpy as np

df = pd.read_excel('sales-funnel.xlsx')  #loading xlsx file

table = pd.pivot_table(df, index=['Manager', 'Status'], columns=['Product'], values=['Quantity','Price'],
           aggfunc={'Quantity':len,'Price':[np.sum, np.mean]},fill_value=0)
table

In the above code, I am passing a dictionary to aggfunc, performing the len operation on Quantity and the sum and mean operations on Price.

Here is the output (attached as an image in the original post).

The example is taken from "pivot table explained".
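For the question's case, the KeyError most likely comes from the dict keys: pivot_table looks them up as column names, so 'Mean' is treated as a column that does not exist. Here is a minimal sketch of the dict form applied to the question's value column. The data and the name main_df are made up for illustration; only the column names come from the question, and the index is cut down to one column to keep it short.

```python
import pandas as pd

# Hypothetical stand-in for Main_DF (values invented for illustration).
main_df = pd.DataFrame({
    "ALIAS": ["a", "a", "a", "b", "b", "b"],
    "LOT_VIRTUAL_LINE": ["L1", "L1", "L2", "L1", "L2", "L2"],
    "SPC_RAW_VALUE": [1.0, 3.0, 5.0, 2.0, 4.0, 8.0],
})

# The dict key names the column being aggregated, not a label for the result.
table = pd.pivot_table(
    main_df,
    values="SPC_RAW_VALUE",
    index="ALIAS",
    columns="LOT_VIRTUAL_LINE",
    aggfunc={"SPC_RAW_VALUE": ["count", "mean", "std"]},
)

# Columns are now (function name, LOT_VIRTUAL_LINE value) pairs.
print(table)
```

The labels N, Mean and Sigma would then be applied by renaming the resulting column level, not through the dict keys.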

The aggfunc argument of pivot_table takes a function or a list of functions, but not a dict:

aggfunc : function, default numpy.mean, or list of functions. If list of functions passed, the resulting pivot table will have hierarchical columns whose top level are the function names (inferred from the function objects themselves)

So try

n_page = (pd.pivot_table(Main_DF, 
                         values='SPC_RAW_VALUE',  
                         index=['ALIAS', 'SPC_PRODUCT', 'LABLE', 'RAW_PARAMETER_NAME'], 
                         columns=['LOT_VIRTUAL_LINE'],
                         aggfunc=[len, np.mean, np.std])
          .reset_index()
         )

You may want to rename the hierarchical columns afterwards.
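A rough sketch of that renaming step, assuming a small made-up stand-in for Main_DF (the data and the names main_df and flat are hypothetical; the column names and the labels N, Mean, Sigma come from the question). String function names are used here instead of len, np.mean and np.std; note that 'count' skips missing values while len does not.

```python
import pandas as pd

# Hypothetical stand-in for Main_DF (values invented for illustration).
main_df = pd.DataFrame({
    "ALIAS": ["a", "a", "a", "b", "b", "b"],
    "LOT_VIRTUAL_LINE": ["L1", "L1", "L2", "L1", "L2", "L2"],
    "SPC_RAW_VALUE": [1.0, 3.0, 5.0, 2.0, 4.0, 8.0],
})

table = pd.pivot_table(
    main_df,
    values="SPC_RAW_VALUE",
    index="ALIAS",
    columns="LOT_VIRTUAL_LINE",
    aggfunc=["count", "mean", "std"],
)

# The top column level holds the function names; map them to the wanted labels.
table = table.rename(columns={"count": "N", "mean": "Mean", "std": "Sigma"}, level=0)

# Optionally flatten the two column levels into single names such as "Mean_L1".
flat = table.copy()
flat.columns = [f"{stat}_{line}" for stat, line in flat.columns]
flat = flat.reset_index()
print(flat)
```

Flattening before reset_index avoids ending up with a column MultiIndex whose second level is empty for the index columns.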

Try using groupby:

df = (Main_DF
      .groupby(['ALIAS', 'SPC_PRODUCT', 'LABLE', 'RAW_PARAMETER_NAME'], as_index=False)
      .LOT_VIRTUAL_LINE
      .agg({'N': 'count', 'Mean': np.mean, 'Sigma': np.std})
     )

Setting as_index=False just leaves these as columns in your dataframe so you don't have to reset the index afterwards.
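Note that passing a dict of new names to agg on a single selected column was deprecated and then removed in pandas 1.0, so on current versions the same idea is usually written with named aggregation. A sketch under a few assumptions: the data and the names main_df and summary are made up, the statistics are computed on SPC_RAW_VALUE (the question's value column), and LOT_VIRTUAL_LINE is added as a grouping key to get a long-form result.

```python
import pandas as pd

# Hypothetical stand-in for Main_DF (values invented for illustration).
main_df = pd.DataFrame({
    "ALIAS": ["a", "a", "a", "b", "b", "b"],
    "LOT_VIRTUAL_LINE": ["L1", "L1", "L2", "L1", "L2", "L2"],
    "SPC_RAW_VALUE": [1.0, 3.0, 5.0, 2.0, 4.0, 8.0],
})

# Named aggregation: output_name=(input_column, function).
summary = (
    main_df
    .groupby(["ALIAS", "LOT_VIRTUAL_LINE"], as_index=False)
    .agg(
        N=("SPC_RAW_VALUE", "count"),
        Mean=("SPC_RAW_VALUE", "mean"),
        Sigma=("SPC_RAW_VALUE", "std"),
    )
)
print(summary)
```

This gives one row per group with the columns N, Mean and Sigma; Sigma is NaN for groups with a single observation.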
