I'm new to Python and I'm trying to calculate the percentage difference between the Units values of the same item on two different dates, then save the result in a new column (My_Calculation_Result). The value should only be present on the rows with the latest date.
((Units[where date is 2020-02-01] - Units[where date is 2020-01-25] ) / Units[where date is 2020-01-25]) * 100%
My initial CSV file structure:
Date, ID, Name, Units,
2020-02-01, 123, Guitar, 200,
2020-02-01, 456, Drums, 150,
2020-02-01, 789, Piano, 340,
2020-01-25, 123, Guitar, 980,
2020-01-25, 456, Drums, 3,
2020-01-25, 789, Piano, 300,
Desired output in CSV: in the output file, the calculation results should be added only to the rows with the latest date.
Date, ID, Name, Units, My_Calculation_Result
2020-02-01, 123, Guitar, 200, -79.59%
2020-02-01, 456, Drums, 150, 4900.00%
2020-02-01, 789, Piano, 340, 13.33%
2020-01-25, 123, Guitar, 980,
2020-01-25, 456, Drums, 3,
2020-01-25, 789, Piano, 300,
Thank you for any help with this in advance!
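IIUC, and assuming the rows for each ID are ordered from newest to oldest as in your sample, you can compare each row with the next row in its ID group. The result is a fraction, so multiply it by 100 to get a percentage: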
df['My_Cal_Result'] = df.groupby(['ID']).Units.pct_change(-1)
Output:
Date ID Name Units My_Cal_Result
0 2020-02-01 123 Guitar 200 -0.795918
1 2020-02-01 456 Drums 150 49.000000
2 2020-02-01 789 Piano 340 0.133333
3 2020-01-25 123 Guitar 980 NaN
4 2020-01-25 456 Drums 3 NaN
5 2020-01-25 789 Piano 300 NaN
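If you also need the percent formatting and the value restricted to the latest date, here is a minimal sketch of one way to do it. It builds the sample data inline instead of reading your file, sorts explicitly so the result does not depend on the row order of the CSV, and uses the column name My_Calculation_Result from the question. The `latest` mask, the `pct` variable and the two-decimal format are my own assumptions about what you want, not something pandas requires.

```python
import pandas as pd

# Sample data copied from the question; in practice this would come from pd.read_csv.
df = pd.DataFrame({
    "Date": ["2020-02-01", "2020-02-01", "2020-02-01",
             "2020-01-25", "2020-01-25", "2020-01-25"],
    "ID": [123, 456, 789, 123, 456, 789],
    "Name": ["Guitar", "Drums", "Piano", "Guitar", "Drums", "Piano"],
    "Units": [200, 150, 340, 980, 3, 300],
})

# Parse the dates and sort newest-first within each ID, so that
# pct_change(-1) compares each row with the next (older) row of the same ID.
df["Date"] = pd.to_datetime(df["Date"])
df = df.sort_values(["ID", "Date"], ascending=[True, False])

# Fractional change relative to the older row, scaled to a percentage.
pct = df.groupby("ID")["Units"].pct_change(-1) * 100

# Keep the value only on rows carrying the latest date in the whole file.
latest = df["Date"] == df["Date"].max()
pct = pct.where(latest)

# Format as text with two decimals; rows without a value get an empty string.
df["My_Calculation_Result"] = pct.map(
    lambda v: f"{v:.2f}%" if pd.notna(v) else ""
)

# Restore the original row order and render the CSV text.
df = df.sort_index()
csv_text = df.to_csv(index=False)
print(csv_text)
```

To write a file instead of printing, pass a path to `to_csv`, for example `df.to_csv("output.csv", index=False)` (the file name here is just a placeholder).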