
Find percentile of all elements in a list

For example, I have a sorted list,

S = [0, 10.2, 345.9, ...]

If S is large (500k+ elements), what is the best way to find which percentile each element belongs to?

My goal is to store the result in a database table that looks like this:

Svalue | Percentile
-------------------
0      |     a
10.2   |     b
345.9  |     c
...    |    ...

Try pandas rank:

import pandas as pd

df = pd.DataFrame()
df["Svalue"] = S
df["Percentile"] = df["Svalue"].rank(pct=True)
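
To illustrate what rank(pct=True) returns, here is a minimal sketch on a hypothetical four-element list that contains a tie. The sample values and the extra PercentileMax column are assumptions added for illustration, not part of the answer above.

```python
import pandas as pd

# Hypothetical sample data containing a tie (two 10.2 values)
S = [0, 10.2, 10.2, 345.9]

df = pd.DataFrame({"Svalue": S})

# Default method="average": tied values share the mean of their ranks,
# and pct=True divides each rank by the number of elements
df["Percentile"] = df["Svalue"].rank(pct=True)

# method="max": each value gets the fraction of elements <= that value
df["PercentileMax"] = df["Svalue"].rank(pct=True, method="max")

print(df)
```

Which tie-handling method is appropriate depends on how "percentile" is defined for the target table; method="max" matches the common "fraction of values less than or equal to this one" reading.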

Solution:

# Import pandas
import pandas as pd

# Store the length of the list: list_length => int
list_length = len(S)

# Collect the index of each element: idx => list
idx = list(range(list_length))

# Divide each index by list_length using a list
# comprehension: percentile_rank => list
percentile_rank = [el / list_length for el in idx]

# Combine the lists into a single DataFrame in the desired format: df => pd.DataFrame
df = pd.DataFrame({"Svalue": S, "Percentile": percentile_rank})

# Show the first 5 rows
df.head()

Data:

# Ensure list is sorted: S => list
S = sorted([0, 10.2, 345.9])

# Print the result: stdout
print(S)
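
For a list of 500k+ elements, the index-over-length computation from the solution above could also be done without Python-level loops. This is a minimal sketch, assuming NumPy is available and S is already sorted; the np.searchsorted variant for duplicate values is an addition and not part of the original solution.

```python
import numpy as np
import pandas as pd

# Same sample data as in the solution; must be sorted
S = sorted([0, 10.2, 345.9])
arr = np.asarray(S, dtype=float)
n = arr.size

# Same result as the list comprehensions: position / length
percentile_rank = np.arange(n) / n

# Variant for data with duplicates: fraction of elements strictly
# below each value, so equal values receive the same percentile
percentile_strict = np.searchsorted(arr, arr, side="left") / n

df = pd.DataFrame({"Svalue": arr, "Percentile": percentile_rank})
print(df)
```

With no duplicates in S the two arrays are identical; they differ only when values repeat.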

