I have the following dataframe:
User_ID Game_ID votes
1 11 1040
1 11 nan
1 22 1101
1 11 540
1 33 nan
2 33 nan
2 33 290
2 33 nan
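For anyone who wants to reproduce this, the frame above could be built roughly as follows. This is only a sketch: it assumes the nan entries are real missing values (np.nan) rather than the string "nan", so that votes ends up as a float column.

```python
import numpy as np
import pandas as pd

# Sample data copied from the table above; nan is assumed to mean a missing value
df = pd.DataFrame({
    "User_ID": [1, 1, 1, 1, 1, 2, 2, 2],
    "Game_ID": [11, 11, 22, 11, 33, 33, 33, 33],
    "votes": [1040, np.nan, 1101, 540, np.nan, np.nan, 290, np.nan],
})
print(df)
```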
Based on the percentile of the values in the column votes, a new column needs to be created, per the following rules:
If the "votes" value is >= 75th percentile, assign a score of 2.
If it is >= 25th percentile (but below the 75th), assign a score of 1.
If it is < 25th percentile, assign a score of 0.
Use pd.qcut:
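# Bin votes at the 25th and 75th percentiles; .cat.codes gives 0/1/2, and -1 for NaN
scores = pd.qcut(df['votes'].astype(float), [0, 0.25, 0.75, 1.0]).cat.codes
df['score'] = scores
print(scores)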
Output (nan corresponds to -1):
0 1
1 -1
2 2
3 1
4 -1
5 -1
6 0
7 -1
dtype: int8
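To see where the cut points actually fall, pd.qcut can also return the bin edges via retbins=True. The snippet below is a small sketch on the sample votes only (the variable names are mine). One thing worth knowing is that the qcut intervals are closed on the right, so a value exactly equal to the 75th percentile would land in the middle bin (code 1) rather than getting a 2 as the ">=" rule in the question asks. With this data no value sits exactly on an edge, so the result is the same either way.

```python
import numpy as np
import pandas as pd

# Sample votes from the question (nan assumed to be a missing value)
votes = pd.Series([1040, np.nan, 1101, 540, np.nan, np.nan, 290, np.nan])

# retbins=True additionally returns the computed bin edges
binned, edges = pd.qcut(votes, [0, 0.25, 0.75, 1.0], retbins=True)
codes = binned.cat.codes

# Edges are the 0th, 25th, 75th and 100th percentiles of the non-missing votes;
# here the 25th is 477.5 and the 75th is 1055.25
print(edges)
print(codes.tolist())
```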
Alternatively, you can get the percentiles by calling describe() and use a list comprehension:
percentiles = df.votes.describe()
df['scores'] = [2 if x >= percentiles['75%'] else (0 if x < percentiles['25%'] else 1) for x in df.votes]
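One caveat with the list comprehension above: a nan vote fails both comparisons, so it falls through to the else branch and gets a score of 1. If missing votes should stay missing, something like the following could work. It is only a sketch using np.select on the sample votes, with variable names of my own, and it keeps the ">=" / "<" rules from the question as written.

```python
import numpy as np
import pandas as pd

# Sample votes from the question (nan assumed to be a missing value)
votes = pd.Series([1040, np.nan, 1101, 540, np.nan, np.nan, 290, np.nan])

# quantile() skips NaN, so these match the 25% / 75% rows of describe()
p25, p75 = votes.quantile([0.25, 0.75])

# np.select takes the first condition that matches; the default is the middle band
score = pd.Series(
    np.select([votes.isna(), votes >= p75, votes < p25], [np.nan, 2, 0], default=1),
    index=votes.index,
)
print(score.tolist())
```

The result is a float column because of the NaN entries; whether that is acceptable depends on how the score is used afterwards.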