
Groupby bins on multiple items

My dataframe looks something like this (but with about 100,000 rows of data):

ID,Total,TotalDate,DaysBtwRead,Type,YearlyAvg 
1,1250,6/2/2017,17,AT267,229 
2,1670,2/3/2012,320,PQ43,50

I'm trying to groupby yearly average totals using

df.groupby(pd.cut(df['YearlyAvg'], np.arange(0,1250,50))).count()

so that I can set up a unique Monte Carlo distribution, but I need these grouped by each individual Type as well. The code above only counts each range overall, regardless of Type.

Rather than having an overall aggregate count, I'm trying to set my code up so that the output looks more like the following (with YearlyAvg containing a count of each range)

Index,YearlyAvg
AT267(0, 50], 200  
PQ43(0, 50], 123    
AT267(50, 100], 49   
PQ43(50, 100], 67

Is there an easier way to do this outside of creating a separate dataframe for each Type value?

You can use unstack with stack:

df['bins'] = pd.cut(df['YearlyAvg'], np.arange(0, 1250, 50))
# unstack/stack fills the missing (Type, bin) pairs with 0 and
# produces the MultiIndex you need
df.groupby(['Type', 'bins']).size().unstack(fill_value=0).stack()
Out[1783]: 
Type   bins      
AT267  (0, 50]       0
       (200, 250]    1
PQ43   (0, 50]       1
       (200, 250]    0
dtype: int64
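A minimal, self-contained sketch of the approach above, using a few made-up rows in place of the 100,000-row frame (the data values are illustrative, not from the question):

```python
import numpy as np
import pandas as pd

# Toy data standing in for the real frame; values are illustrative
df = pd.DataFrame({
    'Type': ['AT267', 'AT267', 'PQ43', 'PQ43', 'AT267'],
    'YearlyAvg': [229, 40, 50, 230, 45],
})

# Bin the yearly averages into right-closed intervals (0, 50], (50, 100], ...
df['bins'] = pd.cut(df['YearlyAvg'], np.arange(0, 1250, 50))

# Count rows per (Type, bin); unstack/stack forces every bin to appear
# for every Type, filling the empty combinations with 0
counts = df.groupby(['Type', 'bins']).size().unstack(fill_value=0).stack()
print(counts)
```

The round trip through unstack(fill_value=0) is what guarantees a complete Type-by-bin grid; a plain groupby(...).size() alone would simply omit (Type, bin) pairs that never occur, depending on the pandas version and the observed setting.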
