Say I have the following dataframe:
>>> import pandas as pd
>>> df = pd.DataFrame([[150, 90, 60], [200, 190, 10], [400, 150, 250]], columns=['Total', 'Group1', 'Group2'])
>>> df
   Total  Group1  Group2
0    150      90      60
1    200     190      10
2    400     150     250
As you can see, Group1 and Group2 sum up to the Total (think age categories in Census data). I want to calculate each group's share of the Total as a percentage.
Right now I'm doing this as follows:
>>> df2 = df.copy()
>>> for Group in ['Group1', 'Group2']:
...     df2[Group] = df[Group] / df['Total'] * 100
...
>>> df2
   Total  Group1  Group2
0    150    60.0    40.0
1    200    95.0     5.0
2    400    37.5    62.5
However, I'm sure there is a way to do this without the for loop, perhaps using applymap or map. Can someone show me a more efficient way to do this calculation?
You can just divide as follows:
>>> df.div(df.Total.values, axis=0)
   Total  Group1  Group2
0      1   0.600   0.400
1      1   0.950   0.050
2      1   0.375   0.625
I wouldn't recommend mixing values and percentages, but if you really want to, you can reassign Total:
df2 = df.div(df.Total.values, axis=0)
df2['Total'] = df.Total
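If the goal is to reproduce the question's output exactly (Total kept as counts, the groups as percentages out of 100), one possible loop-free sketch is below. The name `group_cols` is just an illustrative variable, not something from the original post.

```python
import pandas as pd

df = pd.DataFrame(
    [[150, 90, 60], [200, 190, 10], [400, 150, 250]],
    columns=['Total', 'Group1', 'Group2'],
)

# Hypothetical helper list: the columns to convert to percentages.
group_cols = ['Group1', 'Group2']

df2 = df.copy()
# Divide each group column row-wise by Total, then scale to percent.
df2[group_cols] = df[group_cols].div(df['Total'], axis=0) * 100
print(df2)
```

This leaves Total untouched, so there is no need to reassign it afterwards.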
Alternatively, drop the Total column before dividing:
>>> print(df.drop('Total', axis=1).divide(df.Total, axis=0))
   Group1  Group2
0   0.600   0.400
1   0.950   0.050
2   0.375   0.625
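Both answers pass axis=0, one with `df.Total.values` and one with the `df.Total` Series. As a rough sketch of why that argument matters (variable names here are only for illustration): with axis=0 the divisor is matched against the rows, whereas the default tries to match the Series index against the column labels, which do not overlap in this frame.

```python
import pandas as pd

df = pd.DataFrame(
    [[150, 90, 60], [200, 190, 10], [400, 150, 250]],
    columns=['Total', 'Group1', 'Group2'],
)

# Row-wise division: a Series and a plain array give the same result here,
# because the Series index matches the frame's row index.
by_series = df.div(df['Total'], axis=0)
by_array = df.div(df['Total'].values, axis=0)
print(by_series.equals(by_array))

# Without axis=0, the Series index (0, 1, 2) is aligned with the column
# labels ('Total', 'Group1', 'Group2'); nothing matches, so every cell is NaN.
misaligned = df.div(df['Total'])
print(misaligned.isna().all().all())
```

The `.values` form skips index alignment entirely, so it is only safe when the array is already in the same row order as the frame.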