pandas.pivot_table : How to name functions for aggregation

Question

I am trying to pivot a pandas DataFrame using several aggregate functions, some of which are lambdas. Each aggregated column needs a distinct name, otherwise the results of several lambda functions cannot be told apart. I tried a few ideas I found online, but none worked. Here is a minimal example:

import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [1, 1, 2, 3], 'col2': [4, 4, 5, 6], 'col3': [7, 10, 8, 9]})

pivoted_df = df.pivot_table(index=['col1', 'col2'], values='col3',
                            aggfunc=[('lam1', lambda x: np.percentile(x, 50)),
                                     ('lam2', lambda x: np.percentile(x, 75))]).reset_index()

The error is

AttributeError: 'SeriesGroupBy' object has no attribute 'lam1'

I also tried passing a dictionary, but that results in an error as well. Can someone help? Thanks!

Answer 1

Name the functions explicitly:

def lam1(x):
    return np.percentile(x, 50)

def lam2(x):
    return np.percentile(x, 75)

pivoted_df = df.pivot_table(index = ['col1', 'col2'], values  = 'col3',
                            aggfunc=[lam1, lam2]).reset_index()

Your aggregation series will then be appropriately named:

print(pivoted_df)

   col1  col2  lam1  lam2
0     1     4   8.5  9.25
1     2     5   8.0  8.00
2     3     6   9.0  9.00

The docs for pd.pivot_table explain why:

aggfunc : function, list of functions, dict, default numpy.mean

If list of functions passed, the resulting pivot table will have hierarchical columns whose top level are the function names (inferred from the function objects themselves). If dict is passed, the key is column to aggregate and value is function or list of functions.
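As a rough illustration of "inferred from the function objects themselves": a function defined with def carries its own name in __name__, while every lambda reports '<lambda>'. The sketch below assumes that pivot_table reads that attribute, which is consistent with the docs quoted above, and shows that assigning __name__ on a lambda is one possible alternative to writing a def. The variable names anonymous_name, table and top_level are only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [1, 1, 2, 3], 'col2': [4, 4, 5, 6], 'col3': [7, 10, 8, 9]})

# A def statement gives the function object the name 'lam1'.
def lam1(x):
    return np.percentile(x, 50)

# A lambda is anonymous: its __name__ is '<lambda>' until we overwrite it.
lam2 = lambda x: np.percentile(x, 75)
anonymous_name = lam2.__name__   # '<lambda>'
lam2.__name__ = 'lam2'           # assumption: pivot_table picks this attribute up

table = df.pivot_table(index=['col1', 'col2'], values='col3', aggfunc=[lam1, lam2])

# The top level of the resulting columns holds the function names.
top_level = list(table.columns.get_level_values(0))
print(top_level)
```

Writing a def is usually the clearer choice; overwriting __name__ is mostly useful when the lambdas are generated in a loop.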

Answer 2

I suggest using DataFrameGroupBy.agg here, passing a list of (name, function) tuples:

f1 = lambda x: np.percentile(x, 50)
f2 = lambda x: np.percentile(x, 75)

pivoted_df = (df.groupby(['col1', 'col2'])['col3']
                .agg([('lam1', f1), ('lam2', f2)])
                .reset_index())
print(pivoted_df)
   col1  col2  lam1  lam2
0     1     4   8.5  9.25
1     2     5   8.0  8.00
2     3     6   9.0  9.00
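A related option, sketched here under the assumption of a reasonably recent pandas (named aggregation arrived in 0.25, and using several lambdas on the same column is safest on 1.0 or later), is to pass keyword arguments to agg. Each keyword becomes the output column name and maps to a (column, function) pair. The name named_df is only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [1, 1, 2, 3], 'col2': [4, 4, 5, 6], 'col3': [7, 10, 8, 9]})

# Named aggregation: keyword = output column name,
# value = (input column, aggregation function).
named_df = (df.groupby(['col1', 'col2'])
              .agg(lam1=('col3', lambda x: np.percentile(x, 50)),
                   lam2=('col3', lambda x: np.percentile(x, 75)))
              .reset_index())
print(named_df)
```

This keeps the lambdas inline while still giving every output column an explicit name, and the result has flat column labels.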

2 answers

solution1
2 ACCEPTED 2018-10-18 09:08:20

solution2
2 2018-10-18 09:08:40
