Sum of sales for last 30 days per user with Python

Question

I'm trying to add a new DataFrame column named 'sales_30d_lag' that holds, per user_id, the total sales over the 30 days leading up to that user's last purchase date. I know how to compute a 30-day lag from a single date (see my code below), but that doesn't solve the problem because it uses one fixed date for every user.

user_id	purchase_date	product	sales
1	1/1/21	A	1
2	1/1/21	A	1

max_date = max(df['purchase_date'])
df['30d_lag']= pd.to_datetime(df['max_date']) - pd.to_timedelta(30)

I have also tried a different approach, but that doesn't seem to work either. Any ideas on how to get this column?

start_date = pd.to_datetime(df['max_date'])
end_date = start_date - pd.to_timedelta(30)
df_30d_lag = df[df['purchase_date'].between(start_date, end_date)].groupby('user_id').agg({'sales':'sum'}).rename(columns={'sales':'sales_30d_lag'}).reset_index()

Answer 1

You could use a combination of the isin and pd.date_range functions.

Here's an example:

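import pandas as pd

# Normalize to midnight so isin can match whole days
df['datetime_col'] = pd.to_datetime(df['datetime_col']).dt.normalize()

# pd.to_timedelta(30) alone means 30 nanoseconds, so state the unit explicitly
end_date = df['datetime_col'].max()
start_date = end_date - pd.to_timedelta(30, unit='D')

# date_range needs start <= end, otherwise it returns an empty range
df_30d = df[df['datetime_col'].isin(pd.date_range(start_date, end_date, freq='D'))]

# Once the filtering is complete you can use your normal groupby function
df_30d.groupby('user_id').agg({'sales':'sum'})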

NOTE: For this to work, datetime_col must be of datetime dtype (convert it with pd.to_datetime if it isn't already).
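
The filter above uses one global date for all users, while the question asks for a window that ends at each user's own last purchase. Here is a minimal sketch of one way that could be done, using groupby(...).transform('max') to get the last purchase date per user. The sample data is made up for illustration, and the column names are assumed from the question's table.

```python
import pandas as pd

# Hypothetical sample data, not taken from the question
df = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2],
    "purchase_date": ["1/1/21", "2/15/21", "3/1/21", "1/1/21", "1/20/21"],
    "sales": [1, 2, 3, 1, 6],
})
df["purchase_date"] = pd.to_datetime(df["purchase_date"], format="%m/%d/%y")

# Last purchase date of each user, broadcast back to every row of that user
last_purchase = df.groupby("user_id")["purchase_date"].transform("max")

# Keep rows dated no more than 30 days before that user's last purchase
in_window = df["purchase_date"] >= last_purchase - pd.Timedelta(days=30)

# Sum the sales inside each user's window
sales_30d = (
    df[in_window]
    .groupby("user_id")["sales"]
    .sum()
    .rename("sales_30d_lag")
    .reset_index()
)

# Attach the per-user total as a new column on the original frame
df = df.merge(sales_30d, on="user_id", how="left")
```

Whether the window should include the day exactly 30 days back is a judgment call; this sketch includes it (>=), so adjust the comparison if you need a different boundary.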
