
pandas pivot_table percentile / quantile

Is it possible to use percentile or quantile as the aggfunc in a pandas pivot table? I've tried both numpy.percentile and pandas quantile without success.

Dummy data:

In [135]: df = pd.DataFrame([['a',2,3],
                             ['a',5,6],
                             ['a',7,8], 
                             ['b',9,10], 
                             ['b',11,12], 
                             ['b',13,14]], columns=list('abc'))

np.percentile seems to work just fine when wrapped in a lambda:

In [140]: df.pivot_table(columns='a', aggfunc=lambda x: np.percentile(x, 50))
Out[140]: 
a  a   b
b  5  11
c  6  12
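For comparison, here is a minimal self-contained sketch of the same call with the imports included, next to the equivalent form using the Series.quantile method. The variable names via_numpy and via_pandas are illustrative only and do not appear in the original post.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [["a", 2, 3], ["a", 5, 6], ["a", 7, 8],
     ["b", 9, 10], ["b", 11, 12], ["b", 13, 14]],
    columns=list("abc"),
)

# NumPy takes a percentile on a 0-100 scale
via_numpy = df.pivot_table(columns="a", aggfunc=lambda x: np.percentile(x, 50))

# pandas takes a quantile on a 0-1 scale; x is one column of one group (a Series)
via_pandas = df.pivot_table(columns="a", aggfunc=lambda x: x.quantile(0.5))

print(via_numpy)
print(via_pandas)
```

Note the different scales: np.percentile(x, 50) and x.quantile(0.5) both ask for the median.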

The lambda solution works, but it produces column names such as "<lambda_0>", which need to be renamed later.

Instead of using a lambda (i.e. an unnamed function), we can define our own named functions. Each one should operate on a Series of values.

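import numpy as np
import pandas as pd

df = pd.DataFrame([['a',2,3],
                   ['a',5,6],
                   ['a',7,8],
                   ['b',9,10],
                   ['b',11,12],
                   ['b',13,14]], columns=list('abc'))

def quantile_25(growth_vals: pd.Series):
    # 25th percentile of one column within one group
    return growth_vals.quantile(.25)

def quantile_75(growth_vals: pd.Series):
    # 75th percentile of one column within one group
    return growth_vals.quantile(.75)

# Pass named functions so the results are labelled with the function names
df.pivot_table(columns='a', aggfunc=[quantile_25, np.median, quantile_75])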

The resulting column names will correspond to the function names.
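If many quantiles are needed, one possible shortcut is a small factory that builds the named functions. This is only a sketch and not part of the original answer: make_quantile is a hypothetical helper, and it assumes pandas labels each result by the function's __name__ attribute, as the hand-written functions above suggest.

```python
import pandas as pd

df = pd.DataFrame(
    [["a", 2, 3], ["a", 5, 6], ["a", 7, 8],
     ["b", 9, 10], ["b", 11, 12], ["b", 13, 14]],
    columns=list("abc"),
)


def make_quantile(q):
    """Hypothetical helper: build a named function returning the q-quantile of a Series."""
    def quantile(values: pd.Series):
        return values.quantile(q)

    # Rename the closure so the label is descriptive (assumes pandas reads __name__)
    quantile.__name__ = f"quantile_{round(q * 100)}"
    return quantile


aggfuncs = [make_quantile(q) for q in (0.25, 0.5, 0.75)]
result = df.pivot_table(columns="a", aggfunc=aggfuncs)
print(result)
```

The top level of the resulting columns should then carry the generated names (quantile_25, quantile_50, quantile_75), so no renaming step is needed afterwards.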


 