More efficient way to group by and count values in a Pandas dataframe

Question

Is there a more efficient way to do this?

I have sales records imported from a spreadsheet. I start by importing that list into a dataframe. I then need to get the average number of orders per customer by month and year. The spreadsheet does not contain counts, just order and customer IDs, so I have to count each ID, then drop duplicates, and then reset the index. The final dataframe is exported back into a spreadsheet and a SQL database.

The code below works, and I get the desired output, but it seems like it should be more efficient. I am new to pandas and Python, so I'm sure I could do this better.

df_customers = df.filter(
    ['Month', 'Year', 'Order_Date', 'Customer_ID', 'Customer_Name', 'Patient_ID', 'Order_ID'], axis=1)
df_order_count = df.filter(
    ['Month', 'Year'], axis=1)

# Distinct orders and customers per (Month, Year), broadcast back to every row
df_order_count['Order_cnt'] = df_customers.groupby(['Month', 'Year'])['Order_ID'].transform('nunique')
df_order_count['Customer_cnt'] = df_customers.groupby(['Month', 'Year'])['Customer_ID'].transform('nunique')
df_order_count['Avg'] = (df_order_count['Order_cnt'] / df_order_count['Customer_cnt']).astype(float).round(decimals=2)
# Collapse the repeated rows to one row per (Month, Year)
df_order_count = df_order_count.drop_duplicates().reset_index(drop=True)

Answer 1

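Try this. It builds the groupby once and computes the average directly from it, so the intermediate Order_cnt and Customer_cnt columns are not needed. You still need the drop_duplicates().reset_index(drop=True) step afterwards to get one row per month and year, and you can add .round(2) if you want the same rounding as in the question.

g = df.groupby(['Month', 'Year'])
df_order_count['Avg'] = g['Order_ID'].transform('nunique') / g['Customer_ID'].transform('nunique')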
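
If the end goal is one row per month and year, another option is to aggregate instead of transform, which may let you skip the drop_duplicates step entirely. The following is only a sketch under assumptions: the sample data and the helper name monthly_order_summary are made up for illustration, and only the column names come from the question. Note that groupby sorts the result by Month and Year, so the row order can differ from the drop_duplicates version.

```python
import pandas as pd

# Hypothetical sample data standing in for the imported spreadsheet
df = pd.DataFrame({
    'Month': [1, 1, 1, 2, 2],
    'Year': [2022, 2022, 2022, 2022, 2022],
    'Customer_ID': ['C1', 'C1', 'C2', 'C1', 'C3'],
    'Order_ID': ['O1', 'O2', 'O3', 'O4', 'O5'],
})


def monthly_order_summary(df):
    """Return one row per (Month, Year) with distinct counts and their ratio."""
    # Named aggregation: each keyword becomes an output column
    summary = (
        df.groupby(['Month', 'Year'], as_index=False)
          .agg(Order_cnt=('Order_ID', 'nunique'),
               Customer_cnt=('Customer_ID', 'nunique'))
    )
    # Average number of distinct orders per distinct customer
    summary['Avg'] = (summary['Order_cnt'] / summary['Customer_cnt']).round(2)
    return summary


df_order_count = monthly_order_summary(df)
print(df_order_count)
```

Because agg already returns one row per group, the result can go straight to the spreadsheet or SQL export.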

solution1
0 2022-08-19 19:44:36
