
Python pandas groupby aggregation

I have a DataFrame df with columns (age, height). I want to see how the mean height changes with age, so I group df by age and try to build a new DataFrame new_df with columns (age, mean_height). My code is below:

groups = df.groupby('age')
new_df = groups.agg({'height': np.mean,
                     # 'age': ???  (HOW to add age?)
                     })

but I don't know how to get age into new_df. Any advice would be appreciated.

After grouping, age is the index of the aggregated DataFrame:

In [95]: df = DataFrame({'age':[10,10,20,20,20], 'height':[140,150,145, 190,200]})

In [96]: df
Out[96]: 
   age  height
0   10     140
1   10     150
2   20     145
3   20     190
4   20     200

In [97]: groups = df.groupby('age')

In [98]: groups.agg({'height':np.mean})
Out[98]: 
         height
age            
10   145.000000
20   178.333333

df.groupby('age').mean() gives the same result. If you want age as a column rather than the index, add a call to reset_index().
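A minimal sketch of the reset_index() route, reusing the sample data from the session above. The names mean_by_age and mean_height are only illustrative, chosen to match the (age, mean_height) layout asked for in the question:

```python
import pandas as pd

df = pd.DataFrame({'age': [10, 10, 20, 20, 20],
                   'height': [140, 150, 145, 190, 200]})

# Grouping by 'age' puts the group keys into the index of the result.
mean_by_age = df.groupby('age')['height'].mean()

# reset_index() moves 'age' back into an ordinary column; `name`
# labels the column holding the aggregated values.
new_df = mean_by_age.reset_index(name='mean_height')
print(new_df)
```

Selecting the 'height' column before calling mean() yields a Series, which is why reset_index() can name the value column in the same step.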

Alternatively, pass as_index=False to groupby:

groups = df.groupby('age', as_index=False)
groups.agg({'height': np.mean})
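For comparison, here is a self-contained sketch of the as_index=False variant on the same sample data. It passes the string 'mean' instead of np.mean, since recent pandas releases may emit a warning when NumPy functions are handed to agg. The rename to mean_height is an illustrative addition, not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({'age': [10, 10, 20, 20, 20],
                   'height': [140, 150, 145, 190, 200]})

# as_index=False keeps 'age' as a regular column in the aggregated frame.
new_df = df.groupby('age', as_index=False).agg({'height': 'mean'})

# Optional: rename the aggregated column to make its meaning explicit.
new_df = new_df.rename(columns={'height': 'mean_height'})
print(new_df)
```

Both approaches should end up with the same two-column frame; as_index=False simply avoids the separate reset_index() call.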
