Find percentile of all elements in a list
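For example, I have a sorted list,
S = [0, 10.2, 345.9, ...]
If S is large (500k+ elements), what is the best way to find which percentile each element belongs to?
My goal is to store the result in a database table that looks like this: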
Svalue | Percentile
-------------------
0      | a
10.2   | b
345.9  | c
...    | ...
Try pandas rank:
import pandas as pd
df = pd.DataFrame()
df["Svalue"] = S
df["Percentile"] = df["Svalue"].rank(pct=True)
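The effect of pct=True can be seen on a small example. The following is a minimal sketch, assuming a hypothetical four-element sample with one duplicated value (not from the question). With pct=True, each rank is divided by the number of values; how ties are ranked depends on the method argument, which defaults to "average".

```python
import pandas as pd

# Hypothetical sample with a duplicated value to show how ties are handled.
S = [0, 10.2, 345.9, 345.9]
df = pd.DataFrame({"Svalue": S})

# Default method="average": tied values share the mean of their ranks.
df["pct_average"] = df["Svalue"].rank(pct=True)

# method="max": each value gets the fraction of values less than or equal to it.
df["pct_max"] = df["Svalue"].rank(pct=True, method="max")

print(df)
```

Which method is appropriate depends on the percentile definition you want; "max" corresponds to "fraction of values at or below this one".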
Solution:
# Import pandas:
import pandas as pd
# Store the length of the list: list_length => int
list_length = len(S)
# Collect the index of each element: idx => list
idx = list(range(list_length))
# Divide each index by list_length using a list
# comprehension: percentile_rank => list
percentile_rank = [el / list_length for el in idx]
# Combine the lists into a single DataFrame in the desired format: df => pd.DataFrame
df = pd.DataFrame({"Svalue": S, "Percentile": percentile_rank})
# Print the first 5 rows to the console: stdout
print(df.head())
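Because S is already sorted, the same index-based idea can also be written with NumPy, which avoids Python-level loops on a list of 500k+ elements. This is only a sketch under assumptions: the sample array with a duplicated value is hypothetical, and np.searchsorted is swapped in for the index arithmetic so that tied values receive the same percentile.

```python
import numpy as np

# Hypothetical sample; the array must already be sorted in ascending order.
S = np.array([0, 10.2, 345.9, 345.9])
n = len(S)

# Fraction of values less than or equal to each element.
pct_le = np.searchsorted(S, S, side="right") / n

# Fraction of values strictly less than each element; for distinct
# values this equals index / n.
pct_lt = np.searchsorted(S, S, side="left") / n

print(pct_le)
print(pct_lt)
```

The side argument decides whether an element counts itself and its duplicates, so pick whichever matches the percentile definition you need.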
Data:
# Ensure list is sorted: S => list
S = sorted([0, 10.2, 345.9])
# Print the result: stdout
print(S)
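To get from the computed values to the database table described in the question, one possible route is the standard-library sqlite3 module. This is a sketch under assumptions: the question does not name a database, so the in-memory SQLite database and the table name percentiles are hypothetical, and the percentile is computed as index / n as in the solution above.

```python
import sqlite3

# Sample data from the question; the list must be sorted.
S = sorted([0, 10.2, 345.9])
n = len(S)

# Pair each value with its percentile (index / n).
rows = [(value, index / n) for index, value in enumerate(S)]

# Hypothetical in-memory database and table name, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE percentiles (Svalue REAL, Percentile REAL)")

# executemany inserts all rows with a single call.
conn.executemany("INSERT INTO percentiles VALUES (?, ?)", rows)
conn.commit()

stored = conn.execute(
    "SELECT Svalue, Percentile FROM percentiles ORDER BY Svalue"
).fetchall()
print(stored)
```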