How to find top 3 values in amount, based on another column by using pandas
First use DataFrame.explode to turn the lists into scalars, then remove duplicates with DataFrame.drop_duplicates, and finally get the top 3 with Series.value_counts and Series.head, because value_counts sorts in descending order by default:
top3 = df.explode('Games').drop_duplicates(['Games','Rooms'])['Games'].value_counts().head(3)
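As a rough sketch, the one-liner can be broken into its three steps. The sample frame here is an assumption for illustration (it mirrors the setup data shown further down and uses the column name Rooms), not the asker's real data:

```python
import pandas as pd

# Sample data (an assumption for illustration)
df = pd.DataFrame({
    'Games': [['A', 'B', 'C'], ['B', 'D'], ['B', 'E'], ['A', 'C'], ['D']],
    'Rooms': ['West', 'East', 'East', 'North', 'South'],
})

# 1. One row per (game, room) pair instead of one list per row
exploded = df.explode('Games')

# 2. Count each game at most once per room
unique_pairs = exploded.drop_duplicates(['Games', 'Rooms'])

# 3. Number of distinct rooms per game; value_counts sorts descending
counts = unique_pairs['Games'].value_counts()

top3 = counts.head(3)
print(top3)
```

Keeping the intermediate frames around makes it easy to check how many rows each step adds or removes.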
Explode your Games column (if Games contains real Python lists), then drop duplicates (according to your side notes) and use value_counts with different parameters depending on what you want:
>>> df.explode('Games') \
...   .drop_duplicates(['Games', 'Rooms']) \
...   .value_counts('Games').head(3)
Games
A    2
B    2
C    2
dtype: int64
>>> df.explode('Games') \
...   .drop_duplicates(['Games', 'Rooms']) \
...   .value_counts(['Games', 'Rooms']).head(3)
Games  Rooms
A      North    1
       West     1
B      East     1
dtype: int64
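The caveat about real Python lists matters because explode leaves plain strings untouched. If the column was loaded from a CSV file, for instance, it may hold text such as "['A', 'B']" rather than lists. A minimal sketch of one way to convert it first, assuming every string is a valid Python literal (the frame raw is hypothetical):

```python
import ast
import pandas as pd

# Hypothetical frame where the lists arrived as strings
raw = pd.DataFrame({
    'Games': ["['A', 'B', 'C']", "['B', 'D']", "['B', 'E']"],
    'Rooms': ['West', 'East', 'East'],
})

# Parse each string into a real list; literal_eval only accepts literals
raw['Games'] = raw['Games'].apply(ast.literal_eval)

top = (raw.explode('Games')
          .drop_duplicates(['Games', 'Rooms'])['Games']
          .value_counts()
          .head(3))
print(top)
```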
Setup:
import pandas as pd

data = {'Games': [['A', 'B', 'C'], ['B', 'D'], ['B', 'E'], ['A', 'C'], ['D']],
        'Rooms': ['West', 'East', 'East', 'North', 'South']}
df = pd.DataFrame(data)
print(df)
# Output:
       Games  Rooms
0  [A, B, C]   West
1     [B, D]   East
2     [B, E]   East
3     [A, C]  North
4        [D]  South
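In this sample data D also appears in two distinct rooms (East and South), so head(3) cuts a four-way tie and which games survive depends on the sort order. If tied games should all be kept, one option (not part of the answers above) is Series.nlargest with keep='all'. A sketch using the setup data:

```python
import pandas as pd

data = {'Games': [['A', 'B', 'C'], ['B', 'D'], ['B', 'E'], ['A', 'C'], ['D']],
        'Rooms': ['West', 'East', 'East', 'North', 'South']}
df = pd.DataFrame(data)

counts = (df.explode('Games')
            .drop_duplicates(['Games', 'Rooms'])['Games']
            .value_counts())

# keep='all' returns every entry tied with the 3rd largest value,
# so the result may contain more than 3 rows
top_with_ties = counts.nlargest(3, keep='all')
print(top_with_ties)
```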