I need to express each row in my data as a percentage of a whole. The catch is that the percentage must be computed within the parent group produced by a groupby call. My DataFrame currently looks like this:
category  Segment    Pageviews
Sitting   Age 25-34  2268
          Age 35-44  2942
          Age 45-53  2209
          Age 55+    3317
Standing  Age 25-34  2193
          Age 35-44  1664
          Age 45-53  1874
          Age 55+    1647
Kneeling  Age 25-34  680
          Age 35-44  494
          Age 45-53  876
          Age 55+    1489
What I am hoping to achieve is a percentage for each age range within Sitting, Standing, and Kneeling respectively, i.e.:
category  Segment    Pageviews  Percentage
Sitting   Age 25-34  2268       21%
          Age 35-44  2942       27%
          Age 45-53  2209       20%
          Age 55+    3317       31%
Standing  Age 25-34  2193       ...
          Age 35-44  1664       ...
          Age 45-53  1874       ...
          Age 55+    1647
Kneeling  Age 25-34  680
          Age 35-44  494
          Age 45-53  876
          Age 55+    1489
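You can use groupby with transform, which returns a result aligned with the original index so it can be assigned straight back as a column:
>>> df['Percentage'] = df.groupby('category')['Pageviews']\
...     .transform(lambda g: 100*g / g.sum())
>>> df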
    category   Segment  Pageviews  Percentage
0    Sitting  Age25-34       2268   21.125186
1    Sitting  Age35-44       2942   27.403130
2    Sitting  Age45-53       2209   20.575633
3    Sitting    Age55+       3317   30.896051
4   Standing  Age25-34       2193   29.723502
5   Standing  Age35-44       1664   22.553538
6   Standing  Age45-53       1874   25.399837
7   Standing    Age55+       1647   22.323123
8   Kneeling  Age25-34        680   19.214467
9   Kneeling  Age35-44        494   13.958745
10  Kneeling  Age45-53        876   24.752755
11  Kneeling    Age55+       1489   42.074032
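For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It assumes `category` and `Segment` are ordinary columns (if they live in the index after your groupby, call `reset_index()` first). The `Label` column is a hypothetical name added only to illustrate the "21%" style formatting from the question, and it rounds to the nearest whole percent, which may differ slightly from truncated values.

```python
import pandas as pd

# Rebuild the data from the question as a flat DataFrame.
df = pd.DataFrame({
    "category": ["Sitting"] * 4 + ["Standing"] * 4 + ["Kneeling"] * 4,
    "Segment": ["Age 25-34", "Age 35-44", "Age 45-53", "Age 55+"] * 3,
    "Pageviews": [2268, 2942, 2209, 3317,
                  2193, 1664, 1874, 1647,
                  680, 494, 876, 1489],
})

# transform("sum") broadcasts each group's total back onto its rows,
# so the result has the same index as df.
group_total = df.groupby("category")["Pageviews"].transform("sum")
df["Percentage"] = 100 * df["Pageviews"] / group_total

# Optional display column (hypothetical name): rounded whole-number percent.
df["Label"] = df["Percentage"].round().astype(int).astype(str) + "%"

print(df)
```

Dividing by the broadcast group total avoids a Python-level lambda, which tends to matter on larger frames; the lambda form with `transform` should give the same numbers.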