
Is there a way to use a custom function in a pandas aggregation?

I want to apply a custom function to a DataFrame when aggregating. Example DataFrame:

   index City  Age
0      1    A   50
1      2    A   24
2      3    B   65
3      4    A   40
4      5    B   68
5      6    B   48

Function to apply:

def count_people_above_60(age):
    ...  # I don't know whether age is passed as a Series or a list, which matters for the operations that follow
    return count_people_above_60

I expect to do something like:

df.groupby(['City']).agg({"Age": ["mean", count_people_above_60]})

Expected output:

City  Mean   People_Above_60
A     38     0
B     60.33  2

If performance matters, create a new column that holds the comparison result converted to integers, so the count can be computed with the built-in sum aggregation:

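# Flag ages above 60 as 1/0, then sum the flags per city.
# The result is stored in out so the original df stays available.
out = (df.assign(new=df['Age'].gt(60).astype(int))
         .groupby(['City'])
         .agg(Mean=('Age', 'mean'), People_Above_60=('new', 'sum')))
print(out)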
           Mean  People_Above_60
City                            
A     38.000000                0
B     60.333333                2
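
For anyone who wants to run this end to end, here is a minimal self-contained sketch that rebuilds the sample data from the question and applies the helper-column approach. The names `is_above_60` and `out` are my own choices for illustration, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "index": [1, 2, 3, 4, 5, 6],
    "City": ["A", "A", "B", "A", "B", "B"],
    "Age": [50, 24, 65, 40, 68, 48],
})

out = (
    # 1 where Age > 60, otherwise 0 (hypothetical helper column name)
    df.assign(is_above_60=df["Age"].gt(60).astype(int))
      .groupby("City")
      # Named aggregation: output_column=(input_column, aggregation)
      .agg(Mean=("Age", "mean"), People_Above_60=("is_above_60", "sum"))
)
print(out)
```

Because both `mean` and `sum` are built-in aggregations here, no Python function is called per group.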

Your function can be fixed by comparing the values and summing the result, but a custom Python function is slow when there are many groups or the DataFrame is large:

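def count_people_above_60(age):
    # age is the group's Age column as a Series; summing the boolean mask counts the True values
    return (age > 60).sum()

# The result is stored in out so the original df stays available.
out = (df.groupby(['City'])
         .agg(Mean=('Age', 'mean'),
              People_Above_60=('Age', count_people_above_60)))
print(out)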
           Mean  People_Above_60
City                            
A     38.000000                0
B     60.333333                2
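
On the question of what gets passed to the custom function: as far as I can tell, with this kind of named aggregation pandas hands the function each group's column as a Series. The sketch below records the argument type to check that, and compares the result with a vectorized count that needs no helper column. `seen_types`, `custom` and `vectorized` are names I made up for the illustration.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "City": ["A", "A", "B", "A", "B", "B"],
    "Age": [50, 24, 65, 40, 68, 48],
})

seen_types = []  # hypothetical helper: records what pandas passes in


def count_people_above_60(age):
    seen_types.append(type(age).__name__)
    # Comparing gives a boolean mask; summing it counts the True values
    return (age > 60).sum()


# Custom function, called once per group
custom = df.groupby("City").agg(People_Above_60=("Age", count_people_above_60))

# Vectorized alternative: group the boolean mask by the City column
vectorized = df["Age"].gt(60).groupby(df["City"]).sum()

print(set(seen_types))
print(custom)
print(vectorized)
```

Both variants give 0 for city A and 2 for city B; the vectorized one avoids calling a Python function per group.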

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address.

 