Create a score column in a pandas DataFrame based on percentile values of another column
I have the following DataFrame:
User_ID Game_ID votes
1 11 1040
1 11 nan
1 22 1101
1 11 540
1 33 nan
2 33 nan
2 33 290
2 33 nan
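I need to create a new column based on the percentiles of the votes column, according to the following rules:
If the votes value is >= the 75th percentile, the score is 2.
If it is >= the 25th percentile (and below the 75th), the score is 1.
If it is < the 25th percentile, the score is 0.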
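For reference, the sample data and the two thresholds the rules refer to can be reproduced as follows. This is a minimal sketch: the DataFrame is rebuilt by hand from the table above, and the variable names p25 and p75 are chosen here for illustration only.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the table in the question
df = pd.DataFrame({
    "User_ID": [1, 1, 1, 1, 1, 2, 2, 2],
    "Game_ID": [11, 11, 22, 11, 33, 33, 33, 33],
    "votes": [1040, np.nan, 1101, 540, np.nan, np.nan, 290, np.nan],
})

# Series.quantile ignores NaN, so only the four non-missing votes count
quantiles = df["votes"].quantile([0.25, 0.75])
p25 = quantiles.loc[0.25]
p75 = quantiles.loc[0.75]
print(p25, p75)  # 477.5 1055.25
```

With the default linear interpolation, the thresholds fall between actual vote counts, so no row sits exactly on a boundary in this sample.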
Use pd.qcut:
df['score'] = pd.qcut(df['votes'].astype(float), [0, 0.25, 0.75, 1.0]).cat.codes
print(df['score'])
Output (NaN corresponds to -1):
0 1
1 -1
2 2
3 1
4 -1
5 -1
6 0
7 -1
dtype: int8
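To see which cut points pd.qcut actually uses, it can be asked to return the bin edges as well. The sketch below assumes the same votes values as in the question; labels=False and retbins=True are standard qcut parameters, while the names codes and edges are just illustrative.

```python
import numpy as np
import pandas as pd

votes = pd.Series([1040, np.nan, 1101, 540, np.nan, np.nan, 290, np.nan])

# labels=False returns integer bin indices (NaN stays NaN, so the dtype is float);
# retbins=True additionally returns the computed bin edges
codes, edges = pd.qcut(votes, [0, 0.25, 0.75, 1.0], labels=False, retbins=True)

print(edges)
print(codes)
```

One caveat: the bins produced by qcut are closed on the right, so a value exactly equal to the 75th percentile lands in the middle bin (score 1) rather than the top one. If the ">=" rule must hold exactly at the boundary, explicit comparisons against the thresholds may be the safer choice.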
You can get the percentiles by calling describe and then use a list comprehension:
percentiles = df.votes.describe()
df['scores'] = [2 if x >= percentiles['75%'] else (0 if x < percentiles['25%'] else 1) for x in df.votes]
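Note that with the list comprehension above, a NaN vote fails both comparisons and therefore ends up with a score of 1. If missing votes should stay missing, one possible vectorized variant uses np.select with a NaN default. This is a sketch rather than part of the original answer, and keeping NaN for missing votes is an assumption about the desired behavior.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"votes": [1040, np.nan, 1101, 540, np.nan, np.nan, 290, np.nan]})

p25 = df["votes"].quantile(0.25)
p75 = df["votes"].quantile(0.75)

# Conditions are checked in order; comparisons with NaN are all False,
# so rows with missing votes fall through to the default
conditions = [df["votes"] >= p75, df["votes"] >= p25, df["votes"] < p25]
df["scores"] = np.select(conditions, [2.0, 1.0, 0.0], default=np.nan)
print(df)
```

The resulting column is float because it has to hold NaN; fill or convert it afterwards if an integer score is needed.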