Conditional Counting on Pandas Groupby
I have an IPL dataset that looks like this:
df.head(10):
   toss_winner                  winner
0  Royal Challengers Bangalore  Sunrisers Hyderabad
1  Rising Pune Supergiant       Rising Pune Supergiant
2  Kolkata Knight Riders        Kolkata Knight Riders
3  Kings XI Punjab              Kings XI Punjab
4  Royal Challengers Bangalore  Royal Challengers Bangalore
5  Sunrisers Hyderabad          Sunrisers Hyderabad
6  Mumbai Indians               Mumbai Indians
7  Royal Challengers Bangalore  Kings XI Punjab
8  Rising Pune Supergiant       Delhi Daredevils
9  Mumbai Indians               Mumbai Indians
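I want to group the data by team and get, for each team, the number of times it won the toss and the number of times it went on to win the match after winning the toss.
For example, the desired output is: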
team                         total_toss_win  win_on_toss_win
Royal Challengers Bangalore  3               1
Rising Pune Supergiant       2               1
Kolkata Knight Riders        1               1
Kings XI Punjab              1               1    (2 wins in total, but it lost the toss on the second win)
and so on.
I tried several variations of groupby and aggregation, but nothing seemed to work.
Try melt, then groupby with value_counts, followed by unstack:
# stack both columns into one, then count each team's appearances per column
s = (pd.melt(df).groupby('value')['variable'].value_counts()
       .unstack('variable').fillna(0))
print(s)
variable                     toss_winner  winner
value
Delhi Daredevils             0.0          1.0
Kings XI Punjab              1.0          2.0
Kolkata Knight Riders        1.0          1.0
Mumbai Indians               2.0          2.0
Rising Pune Supergiant       2.0          1.0
Royal Challengers Bangalore  3.0          1.0
Sunrisers Hyderabad          1.0          2.0
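Note that the winner column in this result counts every match win, not only the wins that followed a toss win (Kings XI Punjab shows 2.0, while the question asks for 1). As a minimal sketch of one way to get the conditional count in a single groupby, assuming pandas 0.25 or later for named aggregation: the frame below is rebuilt from the ten sample rows in the question, and `won_after_toss` is a helper column name introduced here purely for illustration.

```python
import pandas as pd

# Sample rows copied from df.head(10) in the question
df = pd.DataFrame({
    "toss_winner": [
        "Royal Challengers Bangalore", "Rising Pune Supergiant",
        "Kolkata Knight Riders", "Kings XI Punjab",
        "Royal Challengers Bangalore", "Sunrisers Hyderabad",
        "Mumbai Indians", "Royal Challengers Bangalore",
        "Rising Pune Supergiant", "Mumbai Indians",
    ],
    "winner": [
        "Sunrisers Hyderabad", "Rising Pune Supergiant",
        "Kolkata Knight Riders", "Kings XI Punjab",
        "Royal Challengers Bangalore", "Sunrisers Hyderabad",
        "Mumbai Indians", "Kings XI Punjab",
        "Delhi Daredevils", "Mumbai Indians",
    ],
})

summary = (
    # won_after_toss (hypothetical helper column): True where the toss winner also won the match
    df.assign(won_after_toss=df["toss_winner"].eq(df["winner"]))
      .groupby("toss_winner")
      .agg(
          total_toss_win=("won_after_toss", "size"),   # rows per team = tosses won
          win_on_toss_win=("won_after_toss", "sum"),   # True counts as 1
      )
      .rename_axis("team")
      .reset_index()
)
print(summary)
```

Because the flag is summed rather than filtered, a team that wins a toss but never wins the following match would still appear, with a count of 0.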
Here is a simple way that makes each step easy to follow:
# number of times each team won the toss
a = df.groupby("toss_winner").size()
# number of times each team won the match after winning the toss
b = df.query("toss_winner == winner").groupby(["toss_winner"]).size()
# combine both counts side by side and name the columns
f = pd.concat([a, b], axis=1).reset_index().rename(columns={0: 'total_toss_win', 1: 'win_on_toss_win'})
print(f)
   toss_winner                  total_toss_win  win_on_toss_win
0  Kings XI Punjab              1               1
1  Kolkata Knight Riders        1               1
2  Mumbai Indians               2               2
3  Rising Pune Supergiant       2               1
4  Royal Challengers Bangalore  3               1
5  Sunrisers Hyderabad          1               1
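One edge case is not visible in this sample: a team that wins a toss but never wins the match afterwards is missing from b, so concat leaves NaN in its win_on_toss_win cell and the column becomes float. A possible way to handle that, sketched on a hypothetical three-row frame (the teams "A" and "B" are made up for illustration), is to name the two Series up front and fill the gap with 0:

```python
import pandas as pd

# Hypothetical data: "B" wins one toss but loses that match
df = pd.DataFrame({
    "toss_winner": ["A", "B", "A"],
    "winner":      ["A", "A", "B"],
})

# tosses won per team
a = df.groupby("toss_winner").size()
# matches won after winning the toss ("B" does not appear here)
b = df.query("toss_winner == winner").groupby("toss_winner").size()

f = (
    pd.concat([a.rename("total_toss_win"), b.rename("win_on_toss_win")], axis=1)
      .fillna(0)       # teams absent from b get 0 instead of NaN
      .astype(int)     # restore integer counts after the NaN fill
      .rename_axis("team")
      .reset_index()
)
print(f)
```

Naming the Series before concatenating also avoids relying on the default 0 and 1 column labels in the rename step.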