Determining which quantile a value falls into in a Pandas DataFrame

Question

I have a pandas data frame with a few columns. For each column I want to calculate certain percentiles. I then want to replace each value in the data frame with the percentile that observation falls in.

import pandas as pd
import numpy as np
M = np.random.uniform(0, 100, (10, 6))
df = pd.DataFrame(M, columns=['c%i'%i for i in range(6)])

>>> df[:2]
              c0         c1         c2         c3         c4         c5
    0  24.883165   2.299054  11.002427  98.711018  39.042343  50.408190
    1  42.099085  78.028507  25.099002  39.099628  38.687483  15.794404

df.quantile([.1, .5, .9])

                    c0         c1         c2         c3         c4         c5
        0.1  21.418274   7.094343  10.904711  25.014356  15.958873  21.984237
        0.5  41.793102  36.973471  29.031637  64.246471  41.136274  42.408574
        0.9  75.724554  62.274133  86.604768  93.690257  73.757992  89.365606

For example, in row 0, c0 = 24.883. The smallest c0 quantile q_c0 where 24.883 <= q_c0 is the 0.5 quantile (41.793). In my new data frame I would then want to replace 24.883 with 0.5.
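To make the lookup concrete, here is a minimal sketch for that single value, assuming the rule is "take the first quantile cutoff that is at least as large as the value". The cutoffs are copied from the `df.quantile` output above; the variable names are only illustrative.

```python
import numpy as np

# c0 cutoffs copied from the df.quantile([.1, .5, .9]) output above
probs = np.array([0.1, 0.5, 0.9])
c0_cutoffs = np.array([21.418274, 41.793102, 75.724554])

value = 24.883165  # c0 in row 0

# side="left" returns the index of the first cutoff that is >= value
idx = np.searchsorted(c0_cutoffs, value, side="left")

# A value above the 0.9 cutoff has no matching quantile in this list
label = probs[idx] if idx < len(probs) else np.nan
print(label)  # 0.5
```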

Answer 1

How about using qcut():

import pandas as pd
import numpy as np
M = np.random.uniform(0, 100, (10, 6))
df = pd.DataFrame(M, columns=['c%i'%i for i in range(6)])

# Quantile edges; each value is labelled with the upper edge of the bin it falls in
bins = [0.0, 0.1, 0.5, 0.9, 1.0]
df.apply(lambda s: pd.qcut(s, bins, labels=bins[1:]).astype(float))
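Because the data above is random, the result is hard to check by eye. The sketch below runs the same `qcut` call on a small deterministic column (the values 1 to 10, chosen here purely for illustration) and compares it with a plain NumPy lookup. `quantile_label` is a hypothetical helper written for this comparison, not part of pandas.

```python
import numpy as np
import pandas as pd

bins = [0.0, 0.1, 0.5, 0.9, 1.0]

# Deterministic toy column (an assumption for illustration): the values 1..10
s = pd.Series(np.arange(1, 11, dtype=float))

# Same call as in the answer, with the labels passed by keyword
via_qcut = pd.qcut(s, bins, labels=bins[1:]).astype(float)

def quantile_label(col, probs):
    """Hypothetical helper: label each value with the first quantile cutoff >= it."""
    probs = np.asarray(probs, dtype=float)
    cutoffs = col.quantile(probs).to_numpy()
    idx = np.searchsorted(cutoffs, col.to_numpy(), side="left")
    idx = np.minimum(idx, len(probs) - 1)  # guard against rounding at the maximum
    return pd.Series(probs[idx], index=col.index)

via_search = quantile_label(s, bins[1:])

print(via_qcut.tolist())    # [0.1, 0.5, 0.5, 0.5, 0.5, 0.9, 0.9, 0.9, 0.9, 1.0]
print(via_search.tolist())  # same labels
```

Both approaches agree on this column. One difference to keep in mind: `qcut` leaves missing values as NaN, whereas this NumPy sketch would label a NaN with the last quantile, so it would need an extra mask for data with gaps.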

Solution 1
2 2015-03-03 00:43:02
