Using Dplyr group_by and summarize with pandas
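I am trying to use the pandas .groupby function in Python to create a separate pandas DataFrame. I am working with basketball data and want to create columns that show whether the home team and the away team are on the tail end of a back-to-back.
A 0 in the yesterday_home_team and yesterday_away_team columns means the corresponding team did not play the previous night.
Since there are multiple games each night, the .groupby function should be used.
Input data: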
date home_team away_team
9/22/22 LAL DET
9/23/22 LAC LAL
Desired output:
date home_team away_team yesterday_home_team yesterday_away_team
9/21/22 LAL MIN 0 MIN
9/22/22 LAL DET DET 0
9/23/22 LAC LAL LAL LAC
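Thanks for your help.
Your example output does not make sense to me. Do you need team names in 'yesterday_home_team' and 'yesterday_away_team'? Wouldn't a 1 be enough when the home team is on a back-to-back and a 0 when it is not (with the same logic for the away team)? It is also hard to help when you don't provide a good sample dataset.
Anyway, here is my solution, which only indicates 1 or 0 depending on whether the given team is on the back end of a back-to-back: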
import pandas as pd
import numpy as np

months = ['October', 'November', 'December', 'January', 'February', 'March', 'April', 'May', 'June']

# Download one schedule table per month and stack them into a single DataFrame.
dfs = []
for month in months:
    month = month.lower()
    url = f'https://www.basketball-reference.com/leagues/NBA_2022_games-{month}.html'
    df = pd.read_html(url)[0]
    df['Date'] = pd.to_datetime(df['Date'])
    dfs.append(df)

df = pd.concat(dfs)
df = df.rename(columns={'Visitor/Neutral': 'away_team', 'Home/Neutral': 'home_team'})

# Reshape to long form: one row per (date, team), whether home or away.
df_melt = pd.melt(df, id_vars=['Date'],
                  value_vars=['away_team', 'home_team'],
                  var_name='Home_Away',
                  value_name='Team')

# Per team, count the days since its previous game; exactly 1 day means a back-to-back.
df_melt = df_melt.sort_values('Date').reset_index(drop=True)
df_melt['days_between'] = df_melt.groupby('Team')['Date'].diff().dt.days
df_melt['yesterday'] = np.where(df_melt['days_between'] == 1, 1, 0)
df_melt = df_melt.drop(['days_between', 'Home_Away'], axis=1)

# Attach the flag to each game twice: once for the home team, once for the away team.
df = df.merge(df_melt.rename(columns={'Team': 'home_team', 'yesterday': 'yesterday_home_team'}),
              how='left', on=['Date', 'home_team'])
df = df.merge(df_melt.rename(columns={'Team': 'away_team', 'yesterday': 'yesterday_away_team'}),
              how='left', on=['Date', 'away_team'])

df = df[['Date', 'home_team', 'away_team', 'yesterday_home_team', 'yesterday_away_team']]
Output:
print(df.head(30).to_string())
Date home_team away_team yesterday_home_team yesterday_away_team
0 2021-10-19 Milwaukee Bucks Brooklyn Nets 0 0
1 2021-10-19 Los Angeles Lakers Golden State Warriors 0 0
2 2021-10-20 Charlotte Hornets Indiana Pacers 0 0
3 2021-10-20 Detroit Pistons Chicago Bulls 0 0
4 2021-10-20 New York Knicks Boston Celtics 0 0
5 2021-10-20 Toronto Raptors Washington Wizards 0 0
6 2021-10-20 Memphis Grizzlies Cleveland Cavaliers 0 0
7 2021-10-20 Minnesota Timberwolves Houston Rockets 0 0
8 2021-10-20 New Orleans Pelicans Philadelphia 76ers 0 0
9 2021-10-20 San Antonio Spurs Orlando Magic 0 0
10 2021-10-20 Utah Jazz Oklahoma City Thunder 0 0
11 2021-10-20 Portland Trail Blazers Sacramento Kings 0 0
12 2021-10-20 Phoenix Suns Denver Nuggets 0 0
13 2021-10-21 Atlanta Hawks Dallas Mavericks 0 0
14 2021-10-21 Miami Heat Milwaukee Bucks 0 0
15 2021-10-21 Golden State Warriors Los Angeles Clippers 0 0
16 2021-10-22 Orlando Magic New York Knicks 0 0
17 2021-10-22 Washington Wizards Indiana Pacers 0 0
18 2021-10-22 Cleveland Cavaliers Charlotte Hornets 0 0
19 2021-10-22 Boston Celtics Toronto Raptors 0 0
20 2021-10-22 Philadelphia 76ers Brooklyn Nets 0 0
21 2021-10-22 Houston Rockets Oklahoma City Thunder 0 0
22 2021-10-22 Chicago Bulls New Orleans Pelicans 0 0
23 2021-10-22 Denver Nuggets San Antonio Spurs 0 0
24 2021-10-22 Los Angeles Lakers Phoenix Suns 0 0
25 2021-10-22 Sacramento Kings Utah Jazz 0 0
26 2021-10-23 Cleveland Cavaliers Atlanta Hawks 1 0
27 2021-10-23 Indiana Pacers Miami Heat 1 0
28 2021-10-23 Toronto Raptors Dallas Mavericks 1 0
29 2021-10-23 Chicago Bulls Detroit Pistons 1 0
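For readers who want to try the idea without downloading anything, here is a minimal offline sketch of the same melt-then-groupby approach, applied to the three games listed in the question's desired-output table. The names games, long_df, side, and played_yesterday are my own and not part of the original answer, and pivot is used here in place of the two merges purely as an illustration.

```python
import pandas as pd

# Sample games copied from the question's desired-output table.
games = pd.DataFrame({
    "date": ["9/21/22", "9/22/22", "9/23/22"],
    "home_team": ["LAL", "LAL", "LAC"],
    "away_team": ["MIN", "DET", "LAL"],
})
games["date"] = pd.to_datetime(games["date"], format="%m/%d/%y")

# Long form: one row per (game, team), keeping the original game index.
long_df = games.reset_index().melt(
    id_vars=["index", "date"],
    value_vars=["home_team", "away_team"],
    var_name="side",          # hypothetical column name: "home_team" or "away_team"
    value_name="team",
).sort_values("date")

# Days since each team's previous game; NaN for a team's first game.
days = long_df.groupby("team")["date"].diff().dt.days
long_df["played_yesterday"] = (days == 1).astype(int)

# Back to one row per game, with one flag column per side.
flags = long_df.pivot(index="index", columns="side", values="played_yesterday")
games["yesterday_home_team"] = flags["home_team"]
games["yesterday_away_team"] = flags["away_team"]

print(games)
```

In this sample LAL plays on three consecutive days, so it is flagged as the home team on 9/22 and as the away team on 9/23, while every other entry is 0.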