I would like to add two columns to a pandas df to show daily totals and weights. Existing df:
Date | Name | Value |
---|---|---|
2022-2-1 | Apple | 5 |
2022-2-1 | Pear | 3 |
2022-2-2 | Apple | 4 |
2022-2-2 | Pear | 4 |
2022-2-2 | Banana | 2 |
2022-2-3 | Apple | 5 |
2022-2-3 | Banana | 3 |
2022-2-4 | Apple | 4 |
2022-2-4 | Banana | 2 |
2022-2-4 | Papaya | 9 |
Desired output:
Date | Name | Value | Daily Total | Percentage Weight |
---|---|---|---|---|
2022-2-1 | Apple | 5 | 8 | 62.5% |
2022-2-1 | Pear | 3 | 8 | 37.5% |
2022-2-2 | Apple | 4 | 10 | 40% |
2022-2-2 | Pear | 4 | 10 | 40% |
2022-2-2 | Banana | 2 | 10 | 20% |
2022-2-3 | Apple | 5 | 8 | 62.5% |
2022-2-3 | Banana | 3 | 8 | 37.5% |
2022-2-4 | Apple | 4 | 15 | 26.7% |
2022-2-4 | Banana | 2 | 15 | 13.3% |
2022-2-4 | Papaya | 9 | 15 | 60% |
Appreciate any ideas/suggestions.
Use groupby
+ transform('sum')
:
df['Daily Total'] = df.groupby('Date')['Value'].transform('sum')
df['Percentage Weight'] = (df['Value'] / df['Daily Total'] * 100).round(1).map('{}%'.format)
Output:
>>> df
Date Name Value Daily Total Percentage Weight
0 2022-2-1 Apple 5 8 62.5%
1 2022-2-1 Pear 3 8 37.5%
2 2022-2-2 Apple 4 10 40.0%
3 2022-2-2 Pear 4 10 40.0%
4 2022-2-2 Banana 2 10 20.0%
5 2022-2-3 Apple 5 8 62.5%
6 2022-2-3 Banana 3 8 37.5%
7 2022-2-4 Apple 4 15 26.7%
8 2022-2-4 Banana 2 15 13.3%
9 2022-2-4 Papaya 9 15 60.0%
One-liner:
df = df.assign(DailyTotal=df.groupby('Date')['Value'].transform('sum')).pipe(lambda x: x.assign(PercentageWeight=df['Value'].div(df['Daily Total']).mul(100).round(1).map('{}%'.format)))
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.