How can I create a pandas pivot table showing the percentage of one of three options?

I have sports data with the following structure:

season, country, league, hometeam, awayteam, status

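The status can be H, D, or A.

I want to create a pivot table for each season, country, and league that shows H as a percentage of the total number of matches.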

import pandas as pd

# df_data is the match DataFrame described above.
grouped_success_rate = df_data.groupby(["Season", "Country", "League"])

# Number of home wins per (Country, League), one column per Season.
homewins_per_league = grouped_success_rate.apply(lambda x: (x["STATUS"] == "H").sum()).unstack("Season")
homewins_per_league.fillna(0, inplace=True)
homewins_per_league["TOTAL"] = homewins_per_league.sum(axis=1)

# Number of matches per (Country, League), one column per Season.
total_matches_per_league = grouped_success_rate.apply(lambda x: len(x["STATUS"])).unstack("Season")
total_matches_per_league.fillna(0, inplace=True)
total_matches_per_league["TOTAL"] = total_matches_per_league.sum(axis=1)

# Home-win rate = home wins / matches, rounded to three decimals.
homewins_rate_per_league = (homewins_per_league / total_matches_per_league).round(3)

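As you can see, I group the dataframe, but then I have to create one separate dataframe for the specific option and another for the totals. Is there a way to do this without creating two dataframes?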

I think the simplest approach is to first add a column to the DataFrame:

df_data['is_H'] = df_data['Status'] == 'H'
df_data.groupby(["Season", "Country", "League"])['is_H'].mean()
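
This works because the mean of a boolean column is the fraction of True values, so the home-win count and the match count never have to be built separately. As a rough sketch of how that could be turned into the season-by-column layout from the question, the example below uses a small made-up dataset (the rows, and the choice of "Status" as the column name, are assumptions for illustration) and adds a TOTAL column computed over all seasons:

```python
import pandas as pd

# Hypothetical sample data; the real df_data would come from the match records.
# hometeam/awayteam are omitted because they are not needed for the rate.
df_data = pd.DataFrame({
    "Season":  ["2018", "2018", "2018", "2018", "2019", "2019", "2018", "2019"],
    "Country": ["England"] * 6 + ["Spain"] * 2,
    "League":  ["Premier League"] * 6 + ["La Liga"] * 2,
    "Status":  ["H", "D", "H", "A", "H", "H", "H", "D"],
})

# True where the home team won; the mean of this column is the home-win rate.
df = df_data.assign(is_H=df_data["Status"].eq("H"))

# One row per (Country, League), one column per Season.
rate = df.groupby(["Country", "League", "Season"])["is_H"].mean().unstack("Season")

# Overall rate across all seasons (total home wins / total matches).
rate["TOTAL"] = df.groupby(["Country", "League"])["is_H"].mean()

rate = rate.round(3)
print(rate)
```

With this sample data, the Premier League row comes out as 0.5 for 2018, 1.0 for 2019, and 0.667 in TOTAL. Multiply by 100 if you want percentages rather than fractions.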
