
Groupby bins on multiple items

My dataframe looks something like this (but with about 100,000 rows of data):

ID,Total,TotalDate,DaysBtwRead,Type,YearlyAvg 
1,1250,6/2/2017,17,AT267,229 
2,1670,2/3/2012,320,PQ43,50

I'm trying to groupby yearly average totals using

df.groupby(pd.cut(df['YearlyAvg'], np.arange(0,1250,50))).count()

so that I can set up a unique Monte Carlo distribution, but I need these grouped by each individual Type as well. The code above only counts each range overall, regardless of Type.

Rather than having an overall aggregate count, I'm trying to set my code up so that the output looks more like the following (with YearlyAvg containing a count of each range)

Index,YearlyAvg
AT267(0, 50], 200  
PQ43(0, 50], 123    
AT267(50, 100], 49   
PQ43(50, 100], 67

Is there an easier way to do this outside of creating a separate dataframe for each Type value?

You can use unstack with stack:

df['bins'] = pd.cut(df['YearlyAvg'], np.arange(0, 1250, 50))
# unstack/stack fills the missing (Type, bin) pairs with 0 and
# produces the MultiIndex you need
df.groupby(['Type', 'bins']).size().unstack(fill_value=0).stack()
Out[1783]: 
Type   bins      
AT267  (0, 50]       0
       (200, 250]    1
PQ43   (0, 50]       1
       (200, 250]    0
dtype: int64
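A minimal, self-contained sketch of the approach above, using a few made-up rows in place of the 100,000-row frame (the data values are illustrative, not from the question):

```python
import numpy as np
import pandas as pd

# Toy data standing in for the real frame; values are illustrative
df = pd.DataFrame({
    'Type': ['AT267', 'AT267', 'PQ43', 'PQ43', 'AT267'],
    'YearlyAvg': [229, 40, 50, 230, 45],
})

# Bin the yearly averages into right-closed intervals (0, 50], (50, 100], ...
df['bins'] = pd.cut(df['YearlyAvg'], np.arange(0, 1250, 50))

# Count rows per (Type, bin); unstack/stack forces every bin to appear
# for every Type, filling the empty combinations with 0
counts = df.groupby(['Type', 'bins']).size().unstack(fill_value=0).stack()
print(counts)
```

The round trip through unstack(fill_value=0) is what guarantees a complete Type-by-bin grid; a plain groupby(...).size() alone would simply omit (Type, bin) pairs that never occur, depending on the pandas version and the observed setting.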
