Count occurrences in last 30 days with Pandas Dataframe

Question

I have a pandas Dataframe with an ID column and a date column (YYYY-MM-DD),

ID	Date
001	2022-01-01
001	2022-01-04
001	2022-02-07
002	2022-01-02
002	2022-01-03
002	2022-01-28

There may be gaps in the date field, as shown. I would like to have a new column, "occurrences_last_month" where it counts the number of occurrences for each ID in the last month (30 days).

The idea would be to add a new column with the following output,

ID	Date	Ocurrences_last_month
001	2022-01-01	0
001	2022-01-04	1
001	2022-02-07	0
002	2022-01-02	0
002	2022-01-03	1
002	2022-01-28	2

For example, in the case of ID 001:

January 1st: substracts 1 month to December 2nd, so 0 occurrences
January 2nd: goes from December 3rd to January 1st, so 1 occurrence
February 7th: goes from January 8th, so 0 occurrences

I tried to use the datetime.timedelta to compute a new column "date_previous_month", but I am unable to compute what I need from there, I tried using count() but I failed to obtain what I need.

Answer 1

First idea is per grouops use Rolling.count with remove first level created by ID :

df = df.set_index('Date')
df['Ocurrences_last_month'] = (df.groupby('ID')
                                 .rolling('30D')
                                 .count().sub(1).droplevel(0).astype(int))
print (df)
            ID  Ocurrences_last_month
Date                                 
2022-01-01   1                      0
2022-01-04   1                      1
2022-02-07   1                      0
2022-01-02   2                      0
2022-01-03   2                      1
2022-01-28   2                      2

EDIT: If possible duplciated values create Series and assign to original DataFrame by DataFrame.join :

s = df.groupby('ID').rolling('30D', on='Date')['Date'].count().sub(1).astype(int)

df = df.join(s.rename('Ocurrences_last_month'), on=['ID','Date'])
print (df)
   ID       Date  Ocurrences_last_month
0   1 2022-01-01                      0
1   1 2022-01-04                      1
2   1 2022-02-07                      0
3   2 2022-01-02                      0
4   2 2022-01-03                      1
5   2 2022-01-28                      2

Alternative solution from comments:

df = df.merge(s.rename('Ocurrences_last_month'), on=['ID','Date'])

Count occurrences in last 30 days with Pandas Dataframe

Question

1 answers

solution1
4 ACCPTED 2022-09-09 10:19:21

Count occurrences in last 30 days with Pandas Dataframe

Question

1 answers

solution1 4 ACCPTED 2022-09-09 10:19:21

solution1
4 ACCPTED 2022-09-09 10:19:21