Pandas add column using groupby dataframe by sorting date column

Question

I have the following dataframe:

ID	Date
1	5/4/2021 8:17
1	5/25/2021 6:20
1	5/2/2021 22:15
2	7/12/2021 2:20
2	7/4/2021 21:28
2
2

For repeating IDs, I want to sort the dates from oldest to latest and then add a new column that holds an incrementing index for each ID, based on the date. If a row has no date, its index should simply be 1. This is how I want the new dataframe to look:

ID	Date	Index
1	5/2/2021 22:15	1
1	5/4/2021 8:17	2
1	5/25/2021 6:20	3
2	7/4/2021 21:28	1
2	7/12/2021 2:20	2
2		1
2		1

Answer 1

Convert the column with to_datetime and sort with DataFrame.sort_values first, then use GroupBy.cumcount with numpy.where to set 1 where Date is missing:

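import numpy as np
import pandas as pd
# Parse the strings as datetimes; missing values become NaT
df['Date'] = pd.to_datetime(df['Date'])
# Sort by ID, then by date (NaT rows go last within each ID)
df = df.sort_values(['ID','Date'], ignore_index=True)

# Number the rows within each ID from 1; rows without a date get 1
df['Index'] = np.where(df['Date'].notna(), df.groupby('ID').cumcount().add(1), 1)
print(df)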
   ID                Date  Index
0   1 2021-05-02 22:15:00      1
1   1 2021-05-04 08:17:00      2
2   1 2021-05-25 06:20:00      3
3   2 2021-07-04 21:28:00      1
4   2 2021-07-12 02:20:00      2
5   2                 NaT      1
6   2                 NaT      1
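
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The helper name add_date_index and its parameters are made up for illustration, and the explicit month/day/year format string is an assumption based on the sample data in the question; adjust it if your dates are written differently.

```python
import numpy as np
import pandas as pd


def add_date_index(df, id_col="ID", date_col="Date"):
    """Return a sorted copy of df with a 1-based per-ID counter in 'Index'.

    Hypothetical helper, not part of pandas.
    """
    out = df.copy()
    # Assumed format: month/day/year hour:minute, as in the question's sample.
    # Missing or unparseable values become NaT instead of raising.
    out[date_col] = pd.to_datetime(
        out[date_col], format="%m/%d/%Y %H:%M", errors="coerce"
    )
    # NaT rows are placed last within each ID (na_position="last" is the default).
    out = out.sort_values([id_col, date_col], ignore_index=True)
    # cumcount numbers rows within each group from 0, so add 1.
    counter = out.groupby(id_col).cumcount().add(1)
    # Keep the counter where a date exists; rows without a date get 1.
    out["Index"] = np.where(out[date_col].notna(), counter, 1)
    return out


# Sample data from the question, with the two empty dates as None.
df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2, 2, 2],
    "Date": ["5/4/2021 8:17", "5/25/2021 6:20", "5/2/2021 22:15",
             "7/12/2021 2:20", "7/4/2021 21:28", None, None],
})
result = add_date_index(df)
print(result)
```

Because rows with a missing date sort to the end of their group, they never push the counter of the dated rows out of sequence; they are only overwritten with 1 afterwards.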
