
Round off to 2 decimal places within the aggregate of a groupby in pandas (Python)

df

order_date    Month Name   Year   Days  Data
2015-12-20     Dec         2014    1     3
2016-1-21      Jan         2014    2     3
2015-08-20     Aug         2015    1     1 
2016-04-12     Apr         2016    4     1

and so on

Code (finds the min, mean, and median of the Days column and counts the order_date entries per month for each year):

df1 = (df.groupby(["Year", "Month Name"])
     .agg(Min_days=("Days", 'min'),
          Avg_days=("Days", 'mean'),
          Median_days=('Days','median'),
          Count = ('order_date', 'count'))
     .reset_index())

df1

   Year Month Name  Min_days    Avg_days    Median_days     Count
    2015   Jan       9        12.56666666          10         4
    2015   Feb       10       13.67678788          9          3    
   ........................................................
    2016   Jan       12       15.7889990           19          2
    and so on...

Issue at hand:

The mean values in the Avg_days column have more than 5 decimal places. I want to round the means to 2 decimal places. How can I do that within the code?

Just add .round(2) after reset_index(). It rounds all float columns:

df1 = (df.groupby(["Year", "Month Name"])
     .agg(Min_days=("Days", 'min'),
          Avg_days=("Days", 'mean'),
          Median_days=('Days','median'),
          Count = ('order_date', 'count'))
     .reset_index().round(2))
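If only the mean column should be rounded and the other float columns left alone, DataFrame.round also accepts a dict that maps column names to a number of decimals. A minimal sketch, assuming made-up sample data that only reuses the column names from the question:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "order_date": ["2015-01-03", "2015-01-10", "2015-01-17",
                   "2015-02-01", "2015-02-08"],
    "Month Name": ["Jan", "Jan", "Jan", "Feb", "Feb"],
    "Year": [2015, 2015, 2015, 2015, 2015],
    "Days": [1, 2, 2, 3, 4],
})

df1 = (df.groupby(["Year", "Month Name"])
         .agg(Min_days=("Days", "min"),
              Avg_days=("Days", "mean"),
              Median_days=("Days", "median"),
              Count=("order_date", "count"))
         .reset_index()
         # Round only Avg_days; Median_days keeps its full precision.
         .round({"Avg_days": 2}))

print(df1)
```

Columns not named in the dict are left unchanged.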

It is also possible with a custom function:

def round_mean(x):
    return round(x.mean(), 2)

df1 = (df.groupby(["Year", "Month Name"])
     .agg(Min_days=("Days", 'min'),
          Avg_days=("Days", round_mean),
          Median_days=('Days','median'),
          Count = ('order_date', 'count'))
     .reset_index())
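If different precisions are needed in different places, the custom function can be produced by a small factory. This is only a sketch: the helper name rounded_mean and the sample data are hypothetical and not part of the original answer.

```python
import pandas as pd

def rounded_mean(ndigits):
    """Return an aggregation function that rounds the group mean to ndigits (hypothetical helper)."""
    def _rounded_mean(x):
        return round(x.mean(), ndigits)
    return _rounded_mean

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "order_date": ["2015-01-03", "2015-01-10", "2015-01-17",
                   "2015-02-01", "2015-02-08"],
    "Month Name": ["Jan", "Jan", "Jan", "Feb", "Feb"],
    "Year": [2015, 2015, 2015, 2015, 2015],
    "Days": [1, 2, 2, 3, 4],
})

df1 = (df.groupby(["Year", "Month Name"])
         .agg(Min_days=("Days", "min"),
              Avg_days=("Days", rounded_mean(1)),  # mean rounded to 1 decimal
              Count=("order_date", "count"))
         .reset_index())

print(df1)
```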

print (df1)
   Year Month Name  Min_days  Avg_days  Median_days  Count
0  2014        Dec         1         1            1      1
1  2014        Jan         2         2            2      1
2  2015        Aug         1         1            1      1
3  2016        Apr         4         4            4      1

Unfortunately, a lambda function still fails for now:

df1 = (df.groupby(["Year", "Month Name"])
     .agg(Min_days=("Days", 'min'),
          Avg_days=("Days", lambda x: round(x.mean(), 2)),
          Median_days=('Days','median'),
          Count = ('order_date', 'count'))
     .reset_index())

KeyError: "[('Days', '')] not in index"

But it is simpler to round the values afterwards:

df1 = (df.groupby(["Year", "Month Name"])
     .agg(Min_days=("Days", 'min'),
          Avg_days=("Days", 'mean'),
          Median_days=('Days','median'),
          Count = ('order_date', 'count'))
     .reset_index())

df1['Avg_days'] = df1['Avg_days'].round(2) 
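To keep the rounding inside the same chained expression instead of a separate statement, the column can be overwritten with DataFrame.assign. A minimal sketch, again assuming made-up sample data with the question's column names:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "order_date": ["2015-01-03", "2015-01-10", "2015-01-17",
                   "2015-02-01", "2015-02-08"],
    "Month Name": ["Jan", "Jan", "Jan", "Feb", "Feb"],
    "Year": [2015, 2015, 2015, 2015, 2015],
    "Days": [1, 2, 2, 3, 4],
})

df1 = (df.groupby(["Year", "Month Name"])
         .agg(Min_days=("Days", "min"),
              Avg_days=("Days", "mean"),
              Median_days=("Days", "median"),
              Count=("order_date", "count"))
         .reset_index()
         # Replace Avg_days with its rounded version; d is the aggregated frame.
         .assign(Avg_days=lambda d: d["Avg_days"].round(2)))

print(df1)
```

The lambda here is passed to assign, not to agg, so it is applied once to the already aggregated frame.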


 