I have the following dataframe:
import pandas as pd
df = pd.DataFrame(columns=['Name','Status','Profit','Promotion','Product','Visits'])
df['Name'] = ['Andy','Andy','Brad','Brad','Cynthia','Cynthia']
df['Status'] =['Old','New','Old','New','Old','New']
df['Profit'] = [140,60,110,90,20,100]
df['Promotion'] = [25,30,40,10,22,36]
df['Product'] = [8,6,18,10,7,12]
df['Visits'] = [11,4,7,3,12,5]
df['Month'] = 'Jan'
I would like to work out the percentage of total for the columns 'Profit','Promotion' and 'Product' by 'Name' in order to achieve the following dataframe:
df['Profit'] = [70,30,55,45,17,83]
df['Promotion'] = [45,55,80,20,38,62]
df['Product'] = [57,43,64,36,37,63]
df
I have attempted to group by 'Name', 'Status' and 'Month', and tried something similar to the solution given in the question "Pandas percentage of total with groupby", but I can't seem to get my desired output.
Use GroupBy.transform to get the sum per Name as a frame aligned with the original rows, then divide the original columns by it, multiply by 100 and finally round:
cols = ['Profit','Promotion','Product']
print (df.groupby('Name')[cols].transform('sum'))
   Profit  Promotion  Product
0     200         55       14
1     200         55       14
2     200         50       28
3     200         50       28
4     120         58       19
5     120         58       19
df[cols] = df[cols].div(df.groupby('Name')[cols].transform('sum')).mul(100).round()
print (df)
      Name Status  Profit  Promotion  Product  Visits Month
0     Andy    Old    70.0       45.0     57.0      11   Jan
1     Andy    New    30.0       55.0     43.0       4   Jan
2     Brad    Old    55.0       80.0     64.0       7   Jan
3     Brad    New    45.0       20.0     36.0       3   Jan
4  Cynthia    Old    17.0       38.0     37.0      12   Jan
5  Cynthia    New    83.0       62.0     63.0       5   Jan
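For reference, the same approach can be written as one self-contained script. This is only a sketch under a few assumptions that are not in the answer above: the frame is built from a dict rather than column by column, the grouping uses both 'Name' and 'Month' (which only matters once there is more than one month), the result is written to a copy named pct (a name chosen here for illustration), and the rounded values are cast to integers to match the desired output in the question.

```python
import pandas as pd

# Same data as in the question, built in one step
df = pd.DataFrame({
    'Name': ['Andy', 'Andy', 'Brad', 'Brad', 'Cynthia', 'Cynthia'],
    'Status': ['Old', 'New', 'Old', 'New', 'Old', 'New'],
    'Profit': [140, 60, 110, 90, 20, 100],
    'Promotion': [25, 30, 40, 10, 22, 36],
    'Product': [8, 6, 18, 10, 7, 12],
    'Visits': [11, 4, 7, 3, 12, 5],
    'Month': 'Jan',
})

cols = ['Profit', 'Promotion', 'Product']

# transform('sum') returns the group total for every original row,
# so the result has the same index and shape as df[cols]
totals = df.groupby(['Name', 'Month'])[cols].transform('sum')

# Work on a copy (assumption: the original values should be kept)
pct = df.copy()
pct[cols] = df[cols].div(totals).mul(100).round().astype(int)

print(pct)
```

Casting with astype(int) is safe here only because no group total is zero; a zero total would produce NaN or inf and the cast would raise an error.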