Pandas qcut with groupby with non-unique values

I'm trying to group a pandas DataFrame and run qcut within each group, to classify the values by quantile. The problem is that some groups have only one value, so qcut complains with ValueError: Bin edges must be unique. Is there a way to simply ignore these cases in the groupby and qcut?

I'm doing something like:

df['quantile'] = df.groupby(['grouping'])['values'].transform(
                 lambda x: pd.qcut(x, 4))

I can do it this way with a two-level grouping:

pd.qcut(df.groupby(['grouping', 'param1']).sum()['value'],
        [0.15, 0.25, 0.5, 0.75, 1.0],
        labels=['0.15', '0.25', '0.5', '0.75'])

But I'm not sure whether the results are the quantiles inside each group for the parameter grouping or for the entire dataframe.
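One way to check this is to compare both variants on a toy frame. The sketch below uses made-up data and plain quartiles (`4`) rather than the custom quantile list above, so treat it as an illustration only. `pd.qcut` applied directly to the aggregated Series computes a single set of bin edges over all rows, whereas grouping again on the `grouping` index level computes edges separately inside each group.

```python
import pandas as pd

# Made-up data: two groups on very different scales.
df = pd.DataFrame({
    "grouping": ["a"] * 4 + ["b"] * 4,
    "param1": ["p", "q", "r", "s"] * 2,
    "value": [1, 2, 3, 4, 10, 20, 30, 40],
})

# Aggregate on both levels; the result is a Series with a two-level index.
totals = df.groupby(["grouping", "param1"])["value"].sum()

# qcut on the whole Series: one set of quartile edges for all rows.
global_bins = pd.qcut(totals, 4, labels=False)

# qcut inside each 'grouping' level: separate quartile edges per group.
per_group_bins = totals.groupby(level="grouping").transform(
    lambda s: pd.qcut(s, 4, labels=False)
)

print(pd.DataFrame({"global": global_bins, "per_group": per_group_bins}))
```

In this toy case every row of group a lands in the two lowest global bins, while the per-group version spreads each group across all four bins.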

Within the qcut call, you could set duplicates='drop'. This should leave a small number of null values in the result of the qcut transformation, which you can then impute to whatever value you wish.
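A minimal sketch of how this could look inside the groupby, using made-up data. The helper `quartile_codes` is hypothetical, and two things in it are assumptions not stated in the answer: `labels=False` is used so that every group returns plain bin numbers instead of group-specific interval categories, and groups with fewer than two distinct values are set to NaN explicitly, so the sketch does not rely on how a particular pandas version treats a single-value group under `duplicates='drop'`.

```python
import numpy as np
import pandas as pd

# Made-up data: group "a" has eight values, group "b" has only one.
df = pd.DataFrame({
    "grouping": ["a"] * 8 + ["b"],
    "values": [1, 2, 3, 4, 5, 6, 7, 8, 42],
})

def quartile_codes(x):
    """Hypothetical helper: quartile number (0-3) within one group."""
    # A group with a single distinct value has no usable bin edges,
    # so mark it as missing instead of calling qcut (assumption).
    if x.nunique() < 2:
        return pd.Series(np.nan, index=x.index)
    # duplicates='drop' merges repeated bin edges instead of raising
    # "Bin edges must be unique"; labels=False returns bin numbers.
    return pd.qcut(x, 4, labels=False, duplicates="drop").astype(float)

df["quantile"] = df.groupby("grouping")["values"].transform(quartile_codes)
print(df)
```

The NaN rows can then be imputed with `fillna` or left as missing, depending on what a one-value group should mean in your analysis.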
