How to calculate customized aggregations after group by pandas

Question

I have a DataFrame df like the one below:

ID   COMMODITY_CODE   DELIVERY_TYPE  DAY   Window_start_time     case_qty     deliveries
6042.0      SCGR        Live         1.0    15:00                 15756.75    7.75
6042.0      SCGR        Live         1.0    18:00                 15787.75    5.75
6042.0      SCGR        Live         1.0    21:00                 10989.75    4.75
6042.0      SCGR        Live         2.0    15:00                 21025.25    9.00
6042.0      SCGR        Live         2.0    18:00                 16041.75    5.75

I want the output below: group by ID, COMMODITY_CODE, DELIVERY_TYPE and DAY, then compute dlvry_ratio and case_qty_ratio as each row's share of its group total.

ID      COMMODITY_CODE  DELIVERY_TYPE  DAY  case_qty  deliveries  dlvry_ratio  case_qty_ratio
6042.0  SCGR            Live           1.0  15756.75  7.75        0.42         0.37
6042.0  SCGR            Live           1.0  15787.75  5.75        0.31         0.37
6042.0  SCGR            Live           1.0  10989.75  4.75        0.26         0.25
6042.0  SCGR            Live           2.0  21025.25  9.00        0.61         0.56
6042.0  SCGR            Live           2.0  16041.75  5.75        0.39         0.44

I tried the code below, using lambda functions inside agg to compute these ratios:

df.groupby(['ID','COMMODITY_CODE','DELIVERY_TYPE','DAY'], as_index=False) \
  .agg(delivery_ratio=("deliveries", lambda x: x / x.sum()),
       case_ratio=("case_qty", lambda x: x / x.sum()))

But this didn't work. Any help would be appreciated

Answer 1

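agg expects each function to return a single value per group, so it cannot produce row-level ratios. Use transform instead, which returns a result aligned with the original rows: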

df[['case_ratio', 'delivery_ratio']] = df.groupby(['ID','COMMODITY_CODE','DELIVERY_TYPE','DAY'], 
                                                   as_index=False)[['case_qty', 'deliveries']]\
                                          .transform(lambda x: x/x.sum())

Output:

       ID COMMODITY_CODE DELIVERY_TYPE  DAY Window_start_time  case_qty  deliveries  case_ratio   delivery_ratio
0  6042.0           SCGR          Live  1.0             15:00  15756.75        7.75     0.370449        0.424658
1  6042.0           SCGR          Live  1.0             18:00  15787.75        5.75     0.371177        0.315068
2  6042.0           SCGR          Live  1.0             21:00  10989.75        4.75     0.258374        0.260274
3  6042.0           SCGR          Live  2.0             15:00  21025.25        9.00     0.567223        0.610169
4  6042.0           SCGR          Live  2.0             18:00  16041.75        5.75     0.432777        0.389831
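To make the difference between agg and transform concrete, here is a minimal, self-contained sketch. It rebuilds the sample data from the question (the variable names keys, grouped, totals and totals_per_row are just illustrative) and compares the shapes of the two results:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [6042.0] * 5,
    "COMMODITY_CODE": ["SCGR"] * 5,
    "DELIVERY_TYPE": ["Live"] * 5,
    "DAY": [1.0, 1.0, 1.0, 2.0, 2.0],
    "Window_start_time": ["15:00", "18:00", "21:00", "15:00", "18:00"],
    "case_qty": [15756.75, 15787.75, 10989.75, 21025.25, 16041.75],
    "deliveries": [7.75, 5.75, 4.75, 9.00, 5.75],
})

keys = ["ID", "COMMODITY_CODE", "DELIVERY_TYPE", "DAY"]
grouped = df.groupby(keys)[["case_qty", "deliveries"]]

# agg reduces every group to one row: 2 groups give 2 rows.
totals = grouped.agg("sum")

# transform broadcasts each group's total back onto the original rows: 5 rows.
totals_per_row = grouped.transform("sum")

print(totals.shape)          # (2, 2)
print(totals_per_row.shape)  # (5, 2)
```

Because the transform result keeps the original index, it can be assigned straight back to df or used in a row-wise division.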

Answer 2

Similar to Answer 1, but use the built-in transform('sum') and divide afterwards, which avoids calling a Python lambda for each group:

cols = ['case_qty', 'deliveries']
df = df.join(df[cols].div(df.groupby(['ID','COMMODITY_CODE','DELIVERY_TYPE','DAY'])
                            [cols].transform('sum')
                         )
                     .add_suffix('_ratio')
            )

Output:

       ID COMMODITY_CODE DELIVERY_TYPE  DAY Window_start_time  case_qty  \
0  6042.0           SCGR          Live  1.0             15:00  15756.75   
1  6042.0           SCGR          Live  1.0             18:00  15787.75   
2  6042.0           SCGR          Live  1.0             21:00  10989.75   
3  6042.0           SCGR          Live  2.0             15:00  21025.25   
4  6042.0           SCGR          Live  2.0             18:00  16041.75   

   deliveries  case_qty_ratio  deliveries_ratio  
0        7.75        0.370449          0.424658  
1        5.75        0.371177          0.315068  
2        4.75        0.258374          0.260274  
3        9.00        0.567223          0.610169  
4        5.75        0.432777          0.389831
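If you want to convince yourself the result is right, one possible sanity check (a sketch only, again rebuilding the question's sample data; ratios, result, group_sums and via_lambda are names made up for this example) is to confirm that the ratios in each group add up to 1 and that both answers produce the same numbers:

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [6042.0] * 5,
    "COMMODITY_CODE": ["SCGR"] * 5,
    "DELIVERY_TYPE": ["Live"] * 5,
    "DAY": [1.0, 1.0, 1.0, 2.0, 2.0],
    "Window_start_time": ["15:00", "18:00", "21:00", "15:00", "18:00"],
    "case_qty": [15756.75, 15787.75, 10989.75, 21025.25, 16041.75],
    "deliveries": [7.75, 5.75, 4.75, 9.00, 5.75],
})

keys = ["ID", "COMMODITY_CODE", "DELIVERY_TYPE", "DAY"]
cols = ["case_qty", "deliveries"]

# Divide each value by its group total, then attach the ratios to the frame.
ratios = df[cols].div(df.groupby(keys)[cols].transform("sum")).add_suffix("_ratio")
result = df.join(ratios)

# Check 1: within every group the ratios should sum to 1.
group_sums = result.groupby(keys)[["case_qty_ratio", "deliveries_ratio"]].sum()
assert np.allclose(group_sums.to_numpy(), 1.0)

# Check 2: the lambda-based transform gives the same values.
via_lambda = df.groupby(keys)[cols].transform(lambda x: x / x.sum())
assert np.allclose(via_lambda.to_numpy(), ratios.to_numpy())

print(result[["DAY", "case_qty_ratio", "deliveries_ratio"]].round(6))
```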

solution1
2 2020-10-07 15:41:59

solution2
1 2020-10-07 15:47:44
