pandas: calculate percentile (quantile) of grouped columns

My DataFrame looks like this:

lang score
en    0.7
fr    0.4
en    0.3
...
it    0.7
fr    0.2
de    0.5
...

I want to get the percentiles (pandas quantile) of the score column grouped by the lang column, so I calculate the mean, median and percentiles as follows:

mean = df.groupby('lang')['score'].mean().sort_values(ascending=False)
median = df.groupby('lang')['score'].median().sort_values(ascending=False)
perc = df.groupby('lang')['score'].quantile(np.linspace(.1, 1, 9, endpoint=False))

While the mean and median are correct, I get NaN for every value in the quantile column:

fr                       0.1                    NaN
                         0.2                    NaN
                         0.3                    NaN
                         0.4                    NaN
                         0.5                    NaN
...                                             ...
en                       0.5                    NaN
                         0.6                    NaN
                         0.7                    NaN
                         0.8                    NaN
                         0.9                    NaN

Where is the error?

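Could it be that you have NaNs in your DataFrame?

Try running this before computing perc. Note that dropna returns a new DataFrame rather than modifying df in place, so the result has to be assigned:

df = df.dropna(subset=['score'])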
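To check whether missing values are really involved, it can help to count them per group before dropping anything. The following is only a minimal sketch on made-up data: the sample rows and the names nan_counts, clean and levels are assumptions for illustration, not taken from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (not from the question); one score is missing.
df = pd.DataFrame({
    "lang": ["en", "fr", "en", "it", "fr", "de", "en"],
    "score": [0.7, 0.4, 0.3, 0.7, 0.2, 0.5, np.nan],
})

# Count missing scores per language to see which groups are affected.
nan_counts = df["score"].isna().groupby(df["lang"]).sum()

# dropna returns a new DataFrame, so keep the result in a variable.
clean = df.dropna(subset=["score"])

# Nine quantile levels from 0.1 up to (but not including) 1.
levels = np.linspace(0.1, 1, 9, endpoint=False)

# One row per (lang, level) pair, indexed by a two-level MultiIndex.
perc = clean.groupby("lang")["score"].quantile(levels)

print(nan_counts)
print(perc)
```

If nan_counts is zero for every language and the quantiles are still NaN, the cause is probably something else, for example the dtype of the score column, which df.dtypes would show.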
