Is there a way to use a custom function in a pandas aggregation?

Question

I want to apply a custom function when aggregating a DataFrame, e.g. this DataFrame:

   index City  Age
0      1    A   50
1      2    A   24
2      3    B   65
3      4    A   40
4      5    B   68
5      6    B   48

Function to apply

def count_people_above_60(age):
    ...  # I don't know whether age is passed as a Series or a list, or how to operate on it
    return count_people_above_60

I expect to do something like:

df.groupby(['City']).agg({"Age": ["mean", count_people_above_60]})

Expected output

City   Mean  People_Above_60
A     38.00                0
B     60.33                2

Answer 1

If performance is important, create a new column holding the comparison result converted to integers, so that the count becomes a plain sum aggregation:

# Keep the result in a new name so the original df stays available
out = (df.assign(new=df['Age'].gt(60).astype(int))
         .groupby(['City'])
         .agg(Mean=("Age", "mean"), People_Above_60=('new', "sum")))
print(out)
           Mean  People_Above_60
City                            
A     38.000000                0
B     60.333333                2
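
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The frame is rebuilt from the question's data, and the names sample, above_60 and result are only illustrative, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
sample = pd.DataFrame({
    "index": [1, 2, 3, 4, 5, 6],
    "City": ["A", "A", "B", "A", "B", "B"],
    "Age": [50, 24, 65, 40, 68, 48],
})

# Vectorised comparison: 1 where Age > 60, otherwise 0.
above_60 = sample["Age"].gt(60).astype(int)

# Summing the 0/1 helper column per city gives the count of people above 60.
result = (
    sample.assign(above_60=above_60)
          .groupby("City")
          .agg(Mean=("Age", "mean"), People_Above_60=("above_60", "sum"))
)
print(result)
```

The comparison runs once over the whole column, and the per-group work is left to the built-in mean and sum aggregations.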

Your approach also works once the function compares the values and sums the result, but it is slow when there are many groups or the DataFrame is large:

def count_people_above_60(age):
    return (age > 60).sum()

# Keep the result in a new name so the original df stays available
out = (df.groupby(['City']).agg(Mean=("Age", "mean"),
                                People_Above_60=('Age', count_people_above_60)))
print(out)
           Mean  People_Above_60
City                            
A     38.000000                0
B     60.333333                2
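
On the question's side comment about whether age arrives as a Series or a list: as far as I know, agg hands the custom function each group's values as a pandas Series (it may call the function more than once per group), so any Series method can be used inside it. A small sketch to check this on the question's data; seen_types is just a hypothetical helper list for the demonstration.

```python
import pandas as pd

# Sample data copied from the question.
sample = pd.DataFrame({
    "City": ["A", "A", "B", "A", "B", "B"],
    "Age": [50, 24, 65, 40, 68, 48],
})

seen_types = []  # hypothetical helper: records the type passed to the function


def count_people_above_60(age):
    # Record what kind of object pandas passes in for each group.
    seen_types.append(type(age).__name__)
    # Boolean comparison, then sum counts the True values.
    return (age > 60).sum()


result = sample.groupby("City").agg(
    Mean=("Age", "mean"),
    People_Above_60=("Age", count_people_above_60),
)
print(set(seen_types))
print(result)
```

Because the function is a plain Python callable invoked per group, this is the path that gets slow with many groups, which is why the helper-column variant is preferable for large data.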

1 answers

solution1
2 ACCEPTED 2020-03-24 09:58:26