I have a DataFrame df with columns (age, height). I want to see how the mean of height changes with age, so I group df by age and try to build a new DataFrame new_df with columns (age, mean_height). My code so far:
groups = df.groupby('age')
new_df = groups.agg({'height' : np.mean,
'age' : # HOW to add age?})
However, I don't know how to add age to new_df. Any advice would be appreciated.
Age is the index of the aggregated dataframe:
In [95]: df = DataFrame({'age':[10,10,20,20,20], 'height':[140,150,145, 190,200]})
In [96]: df
Out[96]:
   age  height
0   10     140
1   10     150
2   20     145
3   20     190
4   20     200
In [97]: groups = df.groupby('age')
In [98]: groups.agg({'height':np.mean})
Out[98]:
         height
age
10   145.000000
20   178.333333
df.groupby('age').mean() would achieve the same result. If you want age as a column rather than the index, add a call to reset_index().
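As a minimal sketch of the reset_index() approach, using the same sample data as above: the column name mean_height comes from the question, and the rename step is just one way to get that name, not something the answer itself prescribes.

```python
import pandas as pd

df = pd.DataFrame({'age': [10, 10, 20, 20, 20],
                   'height': [140, 150, 145, 190, 200]})

# Mean height per age; at this point 'age' is the index of the result.
mean_height = df.groupby('age')['height'].mean()

# Name the values 'mean_height', then move the 'age' index
# back into a regular column with reset_index().
new_df = mean_height.rename('mean_height').reset_index()
print(new_df)
```

The result is a DataFrame with two ordinary columns, age and mean_height, and a default integer index.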
As an alternative, you can call groupby with as_index=False:
groups = df.groupby('age', as_index=False)
groups.agg({'height': np.mean})
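A self-contained sketch of the as_index=False variant, again with the sample data from the first answer. The string 'mean' is used here in place of np.mean, which should give the same result and avoids the NumPy import; the final rename to mean_height is an assumption based on the column name asked for in the question.

```python
import pandas as pd

df = pd.DataFrame({'age': [10, 10, 20, 20, 20],
                   'height': [140, 150, 145, 190, 200]})

# as_index=False keeps 'age' as a regular column in the aggregated result.
new_df = df.groupby('age', as_index=False).agg({'height': 'mean'})

# Optional: rename the aggregated column to match the question's naming.
new_df = new_df.rename(columns={'height': 'mean_height'})
print(new_df)
```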