
Conditional Counting on Pandas Groupby

I have an IPL dataset that looks like this:

df.head(10):        

                  toss_winner                       winner
0    Royal Challengers Bangalore          Sunrisers Hyderabad
1         Rising Pune Supergiant       Rising Pune Supergiant
2          Kolkata Knight Riders        Kolkata Knight Riders
3                Kings XI Punjab              Kings XI Punjab
4    Royal Challengers Bangalore  Royal Challengers Bangalore
5          Sunrisers Hyderabad          Sunrisers Hyderabad
6               Mumbai Indians               Mumbai Indians
7  Royal Challengers Bangalore              Kings XI Punjab
8       Rising Pune Supergiant             Delhi Daredevils
9               Mumbai Indians               Mumbai Indians

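I want to group my data by team and get, for each team, the number of times it won the toss and the number of times it won the match after winning the toss.

For example, the desired output is: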

team                    total_toss_win                     win_on_toss_win
Royal Challengers Bangalore   3                                     1
Rising Pune Supergiant        2                                     1 
Kolkata Knight Riders         1                                     1
Kings XI Punjab               1                                     1  (although 2 wins, but lost the toss on second win)
and so on....

I tried variations of groupby and aggregation, but nothing seemed to work.

Try melt, then groupby with value_counts, then unstack:

s = pd.melt(df).groupby('value')['variable'].value_counts().unstack('variable')\
                .fillna(0)

print(s)

variable                     toss_winner  winner
value                                           
Delhi Daredevils                     0.0     1.0
Kings XI Punjab                      1.0     2.0
Kolkata Knight Riders                1.0     1.0
Mumbai Indians                       2.0     2.0
Rising Pune Supergiant               2.0     1.0
Royal Challengers Bangalore          3.0     1.0
Sunrisers Hyderabad                  1.0     2.0
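
Note that the table above counts every toss win and every match win per team; the winner column is not restricted to matches where the team also won the toss. A minimal sketch of one way to get the conditional count instead, assuming the ten rows from the question are rebuilt by hand (the variable names total, both and counts are illustrative, not from the original post):

```python
import pandas as pd

# Sample rows copied from df.head(10) in the question.
df = pd.DataFrame({
    "toss_winner": [
        "Royal Challengers Bangalore", "Rising Pune Supergiant",
        "Kolkata Knight Riders", "Kings XI Punjab",
        "Royal Challengers Bangalore", "Sunrisers Hyderabad",
        "Mumbai Indians", "Royal Challengers Bangalore",
        "Rising Pune Supergiant", "Mumbai Indians",
    ],
    "winner": [
        "Sunrisers Hyderabad", "Rising Pune Supergiant",
        "Kolkata Knight Riders", "Kings XI Punjab",
        "Royal Challengers Bangalore", "Sunrisers Hyderabad",
        "Mumbai Indians", "Kings XI Punjab",
        "Delhi Daredevils", "Mumbai Indians",
    ],
})

# Total toss wins per team.
total = df["toss_winner"].value_counts()

# Keep only matches where the toss winner also won the match, then count.
both = df.loc[df["toss_winner"] == df["winner"], "toss_winner"].value_counts()

# Align on the teams in `total`; a team with no such match gets 0.
counts = pd.DataFrame({
    "total_toss_win": total,
    "win_on_toss_win": both.reindex(total.index, fill_value=0),
})
print(counts)
```

The boolean mask does the conditioning; reindex with fill_value=0 keeps teams that won a toss but never the following match.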

Here is a simple approach in which each step is easy to follow:

# number of times each team won the toss
a = df.groupby("toss_winner").size()

# number of times each team won the match after winning the toss
b = df.query("toss_winner == winner").groupby(["toss_winner"]).size()

# combine both counts side by side and name the columns
f = pd.concat([a, b], axis=1).reset_index().rename(columns={0: 'total_toss_win', 1: 'win_on_toss_win'})

print(f)

                   toss_winner  total_toss_win  win_on_toss_win
0              Kings XI Punjab               1                1
1        Kolkata Knight Riders               1                1
2               Mumbai Indians               2                2
3       Rising Pune Supergiant               2                1
4  Royal Challengers Bangalore               3                1
5          Sunrisers Hyderabad               1                1
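
The same table can probably also be built in a single groupby pass. The following is a sketch rather than a definitive solution: it assumes the ten rows from the question rebuilt by hand, and the helper column won_after_toss and the output column team are names introduced here for illustration.

```python
import pandas as pd

# Sample rows copied from df.head(10) in the question.
df = pd.DataFrame({
    "toss_winner": [
        "Royal Challengers Bangalore", "Rising Pune Supergiant",
        "Kolkata Knight Riders", "Kings XI Punjab",
        "Royal Challengers Bangalore", "Sunrisers Hyderabad",
        "Mumbai Indians", "Royal Challengers Bangalore",
        "Rising Pune Supergiant", "Mumbai Indians",
    ],
    "winner": [
        "Sunrisers Hyderabad", "Rising Pune Supergiant",
        "Kolkata Knight Riders", "Kings XI Punjab",
        "Royal Challengers Bangalore", "Sunrisers Hyderabad",
        "Mumbai Indians", "Kings XI Punjab",
        "Delhi Daredevils", "Mumbai Indians",
    ],
})

result = (
    # True where the team that won the toss also won the match.
    df.assign(won_after_toss=df["toss_winner"].eq(df["winner"]))
      .groupby("toss_winner")
      .agg(
          total_toss_win=("won_after_toss", "size"),   # rows per team
          win_on_toss_win=("won_after_toss", "sum"),   # count of True values
      )
      .rename_axis("team")
      .reset_index()
)
print(result)
```

Summing a boolean column counts its True values, so a team that won tosses but none of the following matches would get 0 here. With the concat approach, such a team would be missing from b and end up as NaN in win_on_toss_win.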
