How can I get all values above a certain percentile in a data frame using pandas?

I can get the 75th percentile using the quantile function in pandas, but how can I get all the values between the 75th and 100th percentiles of each column in a data frame?

I first tried this to get the 75th percentile and the mean of that:

n = df.quantile(0.75)
x = df.mean(n)

Then I tried a for loop, but it did not quite work because I cannot make the loop go over the rows in each column (and do this for all the columns):

n = df.quantile(0.75)

for i in n.index:
    if i >= n:
        print(n)

The expected output is unclear, but I am assuming you want a list/Series of those values.

Let's start with a dummy example:

import numpy as np
import pandas as pd

np.random.seed(0)  # fixed seed so the example is reproducible
df = pd.DataFrame(np.random.randint(0, 100, size=(20, 10)))

And get the quantile(0.75) per column:

df.quantile(0.75)

0    75.50
1    67.50
2    72.75
3    78.25
4    68.25
5    77.00
6    67.50
7    77.75
8    57.50
9    74.50
Name: 0.75, dtype: float64

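We can then keep only the values strictly greater than their column's quantile and stack them into a single Series indexed by column label: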

df.where(df.gt(df.quantile(0.75))).stack().droplevel(0).sort_index()

0    77.0
0    94.0
0    78.0
0    81.0
0    80.0
1    88.0
1    69.0
1    95.0
1    84.0
1    79.0
2    82.0
2    75.0
2    80.0
2    88.0
2    82.0
...
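
To make the one-liner above easier to follow, here is a minimal sketch that splits it into named steps on a small, hypothetical frame (the column names "a" and "b" and the variable names are illustrative, not from the original answer). The extra dropna() call is a precaution added here: older pandas versions drop NaN entries when stacking by default, but that behaviour may differ in newer versions, so treat it as an assumption to check against your installed version.

```python
import pandas as pd

# Small hypothetical frame so each intermediate result is easy to inspect.
df = pd.DataFrame({"a": [1, 2, 3, 4, 5, 6, 7, 8],
                   "b": [80, 70, 60, 50, 40, 30, 20, 10]})

# One threshold per column: a Series indexed by column label.
threshold = df.quantile(0.75)

# Boolean frame; the Series is aligned on the column labels.
mask = df.gt(threshold)

# Values that fail the test are replaced by NaN.
kept = df.where(mask)

# Long format with a (row, column) MultiIndex; dropna() removes the masked
# entries in case stack() keeps them.
stacked = kept.stack().dropna()

# Drop the row level and sort so values from the same column sit together.
result = stacked.droplevel(0).sort_index()

print(threshold)
print(result)
```

The values within each column are not guaranteed to come back in their original row order after sort_index(), so sort them explicitly if the order matters.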

Or as a list per column:

df.where(df.gt(df.quantile(0.75))).stack().groupby(level=1).agg(list)

0    [81.0, 78.0, 94.0, 80.0, 77.0]
1    [88.0, 79.0, 84.0, 69.0, 95.0]
2    [88.0, 82.0, 75.0, 82.0, 80.0]
3    [99.0, 82.0, 99.0, 93.0, 86.0]
4    [72.0, 88.0, 91.0, 99.0, 77.0]
5    [95.0, 98.0, 86.0, 94.0, 83.0]
6    [83.0, 79.0, 69.0, 81.0, 98.0]
7    [87.0, 80.0, 99.0, 94.0, 98.0]
8    [69.0, 76.0, 85.0, 62.0, 70.0]
9    [87.0, 88.0, 79.0, 96.0, 85.0]
dtype: object
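
The question's first attempt also went after a mean. Assuming the goal there was the mean of the values above the 75th percentile in each column (the question does not say so explicitly), a possible sketch is to mask the frame and let mean() skip the NaN entries. The frame and the variable names below are hypothetical.

```python
import pandas as pd

# Hypothetical data, chosen so the results are easy to verify by hand.
df = pd.DataFrame({"a": [1, 2, 3, 4, 5, 6, 7, 8],
                   "b": [80, 70, 60, 50, 40, 30, 20, 10]})

q75 = df.quantile(0.75)

# Mask everything at or below the threshold, then average what is left.
# DataFrame.mean skips NaN by default, so only the kept values contribute.
mean_above = df.where(df.gt(q75)).mean()

# Use ge instead of gt if values equal to the threshold should be included.
mean_at_or_above = df.where(df.ge(q75)).mean()

print(mean_above)
```

Whether to use gt or ge depends on whether a value exactly equal to the quantile should count as "above" it. The answer above uses gt.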
