
Calculate the percentage change between values in the same column for rows with the same ID, and save the result in a new column only for the row with the latest date

I'm new to Python and am trying to calculate the percentage difference between two values in the Units column for two different dates, then save the result in a new column (My_Calculation_Result). The value should appear only on the rows with the latest date.

((Units[where date is 2020-02-01] - Units[where date is 2020-01-25] ) / Units[where date is 2020-01-25]) * 100%
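As a quick check of the formula above, here is a minimal sketch that applies it to the sample values from the question. The function name `pct_diff` is hypothetical and used only for illustration.

```python
def pct_diff(new_units, old_units):
    """Percentage change from old_units to new_units."""
    return (new_units - old_units) / old_units * 100


# (ID, Units on 2020-02-01, Units on 2020-01-25), taken from the sample CSV
samples = [(123, 200, 980), (456, 150, 3), (789, 340, 300)]

for item_id, latest, earlier in samples:
    print(item_id, f"{pct_diff(latest, earlier):.2f}%")
```

Note that the denominator is the earlier value, so a jump from 3 to 150 units is a change of several thousand percent.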

My initial CSV file structure:

Date,       ID,  Name,   Units, 
2020-02-01, 123, Guitar,  200,            
2020-02-01, 456, Drums,   150,            
2020-02-01, 789, Piano,   340,            
2020-01-25, 123, Guitar,  980,            
2020-01-25, 456, Drums,    3,             
2020-01-25, 789, Piano,   300,            

Desired output in CSV: in the output file, the calculation results should be added only to the rows with the latest date.

Date,       ID,  Name,   Units,  My_Calculation_Result
2020-02-01, 123, Guitar,  200,            -79.59%
2020-02-01, 456, Drums,   150,           4900.00%
2020-02-01, 789, Piano,   340,             13.33%
2020-01-25, 123, Guitar,  980,            
2020-01-25, 456, Drums,    3,             
2020-01-25, 789, Piano,   300, 

Thank you for any help with this in advance!

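If I understand correctly, you can group by ID and call pct_change(-1), which compares each row with the next row in the same group. This relies on the rows being ordered newest-first, as in your sample, and it returns a fraction rather than a percentage (multiply by 100 to convert):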

df['My_Cal_Result'] = df.groupby(['ID']).Units.pct_change(-1)

Output:

         Date   ID    Name  Units  My_Cal_Result
0  2020-02-01  123  Guitar    200      -0.795918
1  2020-02-01  456   Drums    150      49.000000
2  2020-02-01  789   Piano    340       0.133333
3  2020-01-25  123  Guitar    980            NaN
4  2020-01-25  456   Drums      3            NaN
5  2020-01-25  789   Piano    300            NaN
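To get from this result to the CSV layout asked for in the question, one possible sketch follows. It assumes the input has been cleaned of the trailing commas and padding shown in the question, sorts explicitly so the result does not depend on the row order of the file, and formats the values as percentage strings. The variable names (`csv_text`, `ordered`, `change`, `result_csv`) are illustrative only.

```python
import io

import pandas as pd

# Sample data from the question, with trailing commas and padding removed (assumption).
csv_text = """Date,ID,Name,Units
2020-02-01,123,Guitar,200
2020-02-01,456,Drums,150
2020-02-01,789,Piano,340
2020-01-25,123,Guitar,980
2020-01-25,456,Drums,3
2020-01-25,789,Piano,300
"""

df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])

# Sort newest-first within each ID so pct_change(-1) compares each row
# with the next (older) row of the same ID.
ordered = df.sort_values(["ID", "Date"], ascending=[True, False])

# Fractional change relative to the older row; the oldest row per ID is NaN.
change = ordered.groupby("ID")["Units"].pct_change(periods=-1)

# Convert to a percentage string and leave rows without a result empty.
# Assignment aligns on the index, so df keeps its original row order.
df["My_Calculation_Result"] = (change * 100).map(
    lambda v: "" if pd.isna(v) else f"{v:.2f}%"
)

result_csv = df.to_csv(index=False, date_format="%Y-%m-%d")
print(result_csv)
```

With real data, replacing `io.StringIO(csv_text)` with a file path and passing a path to `to_csv` should be enough. If an ID has more than two dates, every row except the oldest receives a value, so an extra filter on the latest date would be needed to match the requirement exactly.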
