I want to apply a custom function to a DataFrame, e.g. this DataFrame:
   index City  Age
0      1    A   50
1      2    A   24
2      3    B   65
3      4    A   40
4      5    B   68
5      6    B   48
Function to apply:
def count_people_above_60(age):
    # ... I don't know whether age can be passed as a Series or a list so that I can perform operations on it later
    return count_people_above_60
I am expecting to do something like:
df.groupby(['City']).agg({"Age": ["mean", count_people_above_60]})
Expected output:
City  Mean   People_Above_60
A     38     0
B     60.33  2
If performance is important, create a new column holding the result of the comparison converted to integers, so that the count can be computed with the sum aggregation:
# Store the result under a new name so the original df stays available.
out = (df.assign(new=df['Age'].gt(60).astype(int))
         .groupby(['City'])
         .agg(Mean=("Age", "mean"), People_Above_60=('new', "sum")))
print(out)
           Mean  People_Above_60
City
A     38.000000                0
B     60.333333                2
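To make the helper-column idea easier to follow, here is a minimal sketch that breaks it into separate steps. It assumes the DataFrame from the question is rebuilt by hand without the extra index column, and the names is_above_60, flags and counts are only illustrative:

```python
import pandas as pd

# Sample data from the question, rebuilt by hand (assumption: only City and Age matter).
df = pd.DataFrame({
    "City": ["A", "A", "B", "A", "B", "B"],
    "Age": [50, 24, 65, 40, 68, 48],
})

# Step 1: the element-wise comparison gives a boolean Series.
is_above_60 = df["Age"].gt(60)

# Step 2: cast to integers so that True becomes 1 and False becomes 0.
flags = is_above_60.astype(int)

# Step 3: summing the 0/1 flags per city counts the matching rows.
counts = flags.groupby(df["City"]).sum()
print(counts)
```

The comparison and the cast each run once over the whole column, so the per-group work is reduced to a built-in sum.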
Your solution can work if the function compares the values and sums the result, but it is slow if there are many groups or the DataFrame is large:
def count_people_above_60(age):
    # age is the Age column of one group; count the values above 60
    return (age > 60).sum()

out = (df.groupby(['City']).agg(Mean=("Age", "mean"),
                                People_Above_60=('Age', count_people_above_60)))
print(out)
           Mean  People_Above_60
City
A     38.000000                0
B     60.333333                2
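On the question of what the custom function receives: as far as I know, with this kind of named aggregation pandas calls the function once per group (possibly more often in some versions) and passes that group's Age values as a Series, not a list. A small sketch to check this, assuming the sample data is rebuilt by hand; the received_types list is a hypothetical helper added only for inspection:

```python
import pandas as pd

# Sample data from the question, rebuilt by hand.
df = pd.DataFrame({
    "City": ["A", "A", "B", "A", "B", "B"],
    "Age": [50, 24, 65, 40, 68, 48],
})

received_types = []  # hypothetical helper: records what the function is called with

def count_people_above_60(age):
    # Remember the type of the argument, then count the values above 60.
    received_types.append(type(age).__name__)
    return (age > 60).sum()

result = df.groupby("City").agg(
    Mean=("Age", "mean"),
    People_Above_60=("Age", count_people_above_60),
)

print(set(received_types))
print(result)
```

Because the argument is a Series, vectorized operations such as the comparison and .sum() work inside the function.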