Is there a way to output df.min, df.max and df.mean in Pandas.groupby on a certain column at once?

So I need to group rows by 'fh_status' column, and then perform min, mean and max of 'gini' for each group (there will be three). I came up with this code:

m = (df2.groupby(['fh_status']).max().iloc[:, 2]) #iloc2 corresponds to gini column
n = (df2.groupby(['fh_status']).min().iloc[:, 2])
e = (df2.groupby(['fh_status']).mean().iloc[:, 2])
nl = '\n'
print(f' mean: {e} {nl} maximum: {m} {nl} minimum:{n}')

output:

mean: fh_status
free           38.170175
not free       39.750000
partly free    43.931250
Name: gini, dtype: float64 
 maximum: fh_status
free           10.0
not free        5.0
partly free     9.0
Name: polity09, dtype: float64 
 minimum:fh_status
free            6.0
not free      -10.0
partly free    -6.0
Name: polity09, dtype: float64

Putting these three methods on one line didn't work (AFAIK it prints only the result of the last call), so I ended up with three variables, which is a bit clumsy. The output seems right, but I'm pretty sure there is a way to optimise this and reduce the amount of code. Or isn't there?

Yes, you can use .agg(..) and pass a list of operations:

df2.groupby('fh_status')['gini'].agg(['min', 'max', 'mean'])

This produces a DataFrame whose columns are the aggregates ( min , max , mean ) and whose rows are the groups (the distinct values of the column you passed to .groupby(..) ).
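
As a minimal sketch of the approach in this answer, the snippet below builds a small made-up DataFrame and aggregates 'gini' per 'fh_status' group in one call. Only the column names 'fh_status', 'gini' and 'polity09' come from the question; the 'country' column and all values are assumptions for illustration. Selecting 'gini' by name also avoids the positional .iloc[:, 2] lookup, which in the question's output returned polity09 rather than gini for the maximum and minimum.

```python
import pandas as pd

# Hypothetical sample data: the values and the 'country' column are invented;
# only the names 'fh_status', 'polity09' and 'gini' appear in the question.
df2 = pd.DataFrame({
    "country": ["A", "B", "C", "D", "E", "F"],
    "fh_status": ["free", "free", "not free", "not free",
                  "partly free", "partly free"],
    "polity09": [10, 6, -10, 5, -6, 9],
    "gini": [30.0, 40.0, 35.0, 45.0, 50.0, 38.0],
})

# Pick the column by name, then apply several aggregations at once.
# The result has one row per group and one column per aggregate.
summary = df2.groupby("fh_status")["gini"].agg(["min", "max", "mean"])
print(summary)
```

A single value can then be read with a label lookup such as summary.loc["free", "mean"].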
