I have the below dataframe called df:
Id | Stage1 | Stage2 | Stage3 |
---|---|---|---|
1 | 2022-02-01 | 2020-04-03 | 2022-06-07 |
2 | 2023-06-07 | 2020-03-01 | 2020-09-03 |
3 | 2023-02-04 | 2023-06-07 | 2022-06-07 |
4 | 2022-05-08 | 2023-09-01 | 2023-09-01 |
I need to find the max date for each Id and the stage it belongs to. So for orders 1, 2, 3, and 4, the stages I need are Stage3, Stage1, Stage2, and Stage3 respectively. I understand that using
df.filter(like="Stage").idxmax(axis=1)
finds the first occurrence of the max date in a row and gives me its column name. However, for order 4, Stage2 and Stage3 have the same date. I need Stage3 as my answer, because Stage3 is the latest stage of the order. How can I do this?
Reverse the order of the columns so that, when dates are tied, the latest stage is the first maximal value found:
df.filter(like="Stage").iloc[:, ::-1].idxmax(axis=1)
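To check this against the data in the question, here is a minimal sketch. It assumes the stage columns are stored as strings, so it converts them with pd.to_datetime first. The names stages, first_hit, latest_hit, and result are only for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Id": [1, 2, 3, 4],
    "Stage1": ["2022-02-01", "2023-06-07", "2023-02-04", "2022-05-08"],
    "Stage2": ["2020-04-03", "2020-03-01", "2023-06-07", "2023-09-01"],
    "Stage3": ["2022-06-07", "2020-09-03", "2022-06-07", "2023-09-01"],
})

# filter(like=...) is case-sensitive, so "Stage" must match the column names.
# Assumption: the dates are strings and need converting before comparison.
stages = df.filter(like="Stage").apply(pd.to_datetime)

# idxmax returns the first column holding the row maximum,
# so a tie in the original order resolves to the earlier stage.
first_hit = stages.idxmax(axis=1)

# Reversing the columns makes the latest stage come first, so it wins ties.
latest_hit = stages.iloc[:, ::-1].idxmax(axis=1)

result = pd.DataFrame({
    "Id": df["Id"],
    "MaxDate": stages.max(axis=1),
    "MaxStage": latest_hit,
})
print(result)
```

For order 4, first_hit gives Stage2 while latest_hit gives Stage3. The other rows are the same in both, since they have no ties.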