I have a DataFrame df, shown below.
year and Continent are the index levels; hydro_total is a column.
I want to add a column holding each continent's percentage contribution to the total for the given year.
For example, for the year 1971 Africa contributes 2.04% and America contributes 48.56%, and similarly for Asia, Europe and Oceania. This repeats for each year.
Here is the data:
{'hydro_total': {(1971, 'Africa'): 1861980.0,
(1971, 'America'): 44127920.0,
(1971, 'Asia'): 14514450.0,
(1971, 'Europe'): 28232850.0,
(1971, 'Oceania'): 2126000.0,
(1972, 'Africa'): 2300750.0,
(1972, 'America'): 47242190.0,
(1972, 'Asia'): 14970150.0,
(1972, 'Europe'): 29427610.0,
(1972, 'Oceania'): 2225000.0}}
If I understand you correctly:
# Group by the first index level (year) and divide each value by its group's sum
df['contribution'] = df.groupby(level=0)['hydro_total'] \
    .transform(lambda g: g / g.sum()) * 100
Result:
                hydro_total  contribution
1971 Africa       1861980.0      2.049212
     America     44127920.0     48.565228
     Asia        14514450.0     15.973959
     Europe      28232850.0     31.071820
     Oceania      2126000.0      2.339781
1972 Africa       2300750.0      2.392485
     America     47242190.0     49.125821
     Asia        14970150.0     15.567037
     Europe      29427610.0     30.600942
     Oceania      2225000.0      2.313715
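For anyone who wants to reproduce this end to end, here is a minimal, self-contained sketch. It rebuilds the frame from the dictionary in the question and computes the same percentages without a lambda, by dividing the column by a per-year total obtained with transform('sum'). The variable names data and yearly_total are my own and not from the question, and naming the index levels year and Continent is an assumption based on the question's description.

```python
import pandas as pd

# Data copied from the question: keys are (year, continent) tuples.
data = {'hydro_total': {(1971, 'Africa'): 1861980.0,
                        (1971, 'America'): 44127920.0,
                        (1971, 'Asia'): 14514450.0,
                        (1971, 'Europe'): 28232850.0,
                        (1971, 'Oceania'): 2126000.0,
                        (1972, 'Africa'): 2300750.0,
                        (1972, 'America'): 47242190.0,
                        (1972, 'Asia'): 14970150.0,
                        (1972, 'Europe'): 29427610.0,
                        (1972, 'Oceania'): 2225000.0}}

# A Series built from tuple keys gets a two-level MultiIndex.
df = pd.Series(data['hydro_total'], name='hydro_total').to_frame()
# Assumed level names, taken from the question's description.
df = df.rename_axis(['year', 'Continent'])

# Total per year, broadcast back to every row of that year.
yearly_total = df.groupby(level='year')['hydro_total'].transform('sum')

# Each row's share of its year's total, as a percentage.
df['contribution'] = df['hydro_total'] / yearly_total * 100

print(df)
```

Passing the string 'sum' to transform lets pandas use its built-in aggregation rather than calling a Python function per group, which tends to matter more as the number of groups grows. The percentages within each year should add up to 100.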