Python Pandas - how to add columns of filtered sum and calculate percentage weight

I would like to add two columns to a pandas DataFrame: the daily total of Value, and each row's percentage weight within its day. Existing df:

Date Name Value
2022-2-1 Apple 5
2022-2-1 Pear 3
2022-2-2 Apple 4
2022-2-2 Pear 4
2022-2-2 Banana 2
2022-2-3 Apple 5
2022-2-3 Banana 3
2022-2-4 Apple 4
2022-2-4 Banana 2
2022-2-4 Papaya 9

Desired output:

Date Name Value Daily Total Percentage Weight
2022-2-1 Apple 5 8 62.5%
2022-2-1 Pear 3 8 37.5%
2022-2-2 Apple 4 10 40%
2022-2-2 Pear 4 10 40%
2022-2-2 Banana 2 10 20%
2022-2-3 Apple 5 8 62.5%
2022-2-3 Banana 3 8 37.5%
2022-2-4 Apple 4 15 26.7%
2022-2-4 Banana 2 15 13.3%
2022-2-4 Papaya 9 15 60%

Appreciate any ideas/suggestions.

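Use groupby with transform('sum'), which returns each date's total aligned to the original rows, so it can be assigned directly as a new column: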

df['Daily Total'] = df.groupby('Date')['Value'].transform('sum')
df['Percentage Weight'] = (df['Value'] / df['Daily Total'] * 100).round(1).map('{}%'.format)

Output:

>>> df
       Date    Name  Value  Daily Total Percentage Weight
0  2022-2-1   Apple      5            8             62.5%
1  2022-2-1    Pear      3            8             37.5%
2  2022-2-2   Apple      4           10             40.0%
3  2022-2-2    Pear      4           10             40.0%
4  2022-2-2  Banana      2           10             20.0%
5  2022-2-3   Apple      5            8             62.5%
6  2022-2-3  Banana      3            8             37.5%
7  2022-2-4   Apple      4           15             26.7%
8  2022-2-4  Banana      2           15             13.3%
9  2022-2-4  Papaya      9           15             60.0%
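
For anyone who wants to reproduce this from scratch, here is a minimal self-contained sketch. It assumes the data is typed in by hand from the question (the original does not show how df was created) and that Date is kept as a plain string, which is enough for grouping:

```python
import pandas as pd

# Sample data copied from the question; Date is left as a string here.
df = pd.DataFrame({
    "Date": ["2022-2-1", "2022-2-1", "2022-2-2", "2022-2-2", "2022-2-2",
             "2022-2-3", "2022-2-3", "2022-2-4", "2022-2-4", "2022-2-4"],
    "Name": ["Apple", "Pear", "Apple", "Pear", "Banana",
             "Apple", "Banana", "Apple", "Banana", "Papaya"],
    "Value": [5, 3, 4, 4, 2, 5, 3, 4, 2, 9],
})

# transform('sum') keeps the original index, so every row receives
# the total of its own Date group.
df["Daily Total"] = df.groupby("Date")["Value"].transform("sum")

# Share of the daily total, rounded to one decimal and rendered as text.
df["Percentage Weight"] = (
    (df["Value"] / df["Daily Total"] * 100).round(1).map("{}%".format)
)

print(df)
```

Note that Percentage Weight ends up as a string column, which is fine for display but not for further arithmetic.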

One-liner:

df = df.assign(**{'Daily Total': df.groupby('Date')['Value'].transform('sum')}).pipe(lambda x: x.assign(**{'Percentage Weight': x['Value'].div(x['Daily Total']).mul(100).round(1).map('{}%'.format)}))
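
If the weights are needed for further calculation, one possible variation (not part of the original answer) is to keep them as numeric fractions and format them only for display. In this sketch the column names Weight and Weight (display) are made up for illustration, and the frame is a shortened copy of the question's data:

```python
import pandas as pd

# Shortened sample: the first two dates from the question.
df = pd.DataFrame({
    "Date": ["2022-2-1", "2022-2-1", "2022-2-2", "2022-2-2", "2022-2-2"],
    "Name": ["Apple", "Pear", "Apple", "Pear", "Banana"],
    "Value": [5, 3, 4, 4, 2],
})

# Numeric fraction of the daily total (0.625 rather than "62.5%").
# "Weight" is a hypothetical column name.
df["Weight"] = df["Value"] / df.groupby("Date")["Value"].transform("sum")

# Sanity check: the fractions within each day should add up to 1.
per_day = df.groupby("Date")["Weight"].sum()

# Format as a percentage string only when presenting the result.
df["Weight (display)"] = df["Weight"].map("{:.1%}".format)

print(df)
print(per_day)
```

The `.1%` format specifier multiplies by 100 and appends the percent sign itself, so no separate mul(100) or round step is needed.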
