I am trying to create a separate pandas DataFrame in python using pandas'.groupby function. I am working with basketball data and want to create a column that displays if the home and away teams are on the tail end of a back-to-back.
The 0 in the yesterday_home_team
and yesterday_away_team
columns indicates that the away team did not play the previous night.
Given that there are multiple games each night, the.groupby function should be used.
Input Data:
date home_team away_team
9/22/22 LAL DET
9/23/22 LAC LAL
Desired output:
date home_team away_team yesterday_home_team yesterday_away_team
9/21/22 LAL MIN 0 MIN
9/22/22 LAL DET DET 0
9/23/22 LAC LAL LAL LAC
Appreciate your assistance.
Your output example doesn't make sense to me. Do you need the team names in the 'yesterday_home_team'
and 'yesterday_away_team'
? Is it sufficient to simply just have a 1
if the home team is on the back to back, and 0
if the home team is not (and then also same logic for away team)? It's also tough when you don't provide a good sample dataset.
Anyways, here's my solution that just indicates a 1 or 0 if the given team is on the back end of the back to back:
import pandas as pd
import numpy as np
months = ['October', 'November', 'December', 'January', 'February', 'March', 'April', 'May', 'June']
dfs = []
for month in months:
month = month.lower()
url = f'https://www.basketball-reference.com/leagues/NBA_2022_games-{month}.html'
df = pd.read_html(url)[0]
df['Date'] = pd.to_datetime(df['Date'])
dfs.append(df)
df = pd.concat(dfs)
df = df.rename(columns={'Visitor/Neutral':'away_team', 'Home/Neutral':'home_team'})
df_melt = pd.melt(df, id_vars=['Date'],
value_vars=['away_team', 'home_team'],
var_name = 'Home_Away',
value_name = 'Team')
df_melt = df_melt.sort_values('Date').reset_index(drop=True)
df_melt['days_between'] = df_melt.groupby('Team')['Date'].diff().dt.days
df_melt['yesterday'] = np.where(df_melt['days_between'] == 1, 1, 0)
df_melt = df_melt.drop(['days_between', 'Home_Away'], axis=1)
df = df.merge(df_melt.rename(columns={'Team':'home_team', 'yesterday':'yesterday_home_team'}), how='left', left_on=['Date', 'home_team'], right_on=['Date', 'home_team'])
df = df.merge(df_melt.rename(columns={'Team':'away_team', 'yesterday':'yesterday_away_team'}), how='left', left_on=['Date', 'away_team'], right_on=['Date', 'away_team'])
df = df[['Date', 'home_team', 'away_team', 'yesterday_home_team', 'yesterday_away_team']]
Output:
print(df.head(30).to_string())
Date home_team away_team yesterday_home_team yesterday_away_team
0 2021-10-19 Milwaukee Bucks Brooklyn Nets 0 0
1 2021-10-19 Los Angeles Lakers Golden State Warriors 0 0
2 2021-10-20 Charlotte Hornets Indiana Pacers 0 0
3 2021-10-20 Detroit Pistons Chicago Bulls 0 0
4 2021-10-20 New York Knicks Boston Celtics 0 0
5 2021-10-20 Toronto Raptors Washington Wizards 0 0
6 2021-10-20 Memphis Grizzlies Cleveland Cavaliers 0 0
7 2021-10-20 Minnesota Timberwolves Houston Rockets 0 0
8 2021-10-20 New Orleans Pelicans Philadelphia 76ers 0 0
9 2021-10-20 San Antonio Spurs Orlando Magic 0 0
10 2021-10-20 Utah Jazz Oklahoma City Thunder 0 0
11 2021-10-20 Portland Trail Blazers Sacramento Kings 0 0
12 2021-10-20 Phoenix Suns Denver Nuggets 0 0
13 2021-10-21 Atlanta Hawks Dallas Mavericks 0 0
14 2021-10-21 Miami Heat Milwaukee Bucks 0 0
15 2021-10-21 Golden State Warriors Los Angeles Clippers 0 0
16 2021-10-22 Orlando Magic New York Knicks 0 0
17 2021-10-22 Washington Wizards Indiana Pacers 0 0
18 2021-10-22 Cleveland Cavaliers Charlotte Hornets 0 0
19 2021-10-22 Boston Celtics Toronto Raptors 0 0
20 2021-10-22 Philadelphia 76ers Brooklyn Nets 0 0
21 2021-10-22 Houston Rockets Oklahoma City Thunder 0 0
22 2021-10-22 Chicago Bulls New Orleans Pelicans 0 0
23 2021-10-22 Denver Nuggets San Antonio Spurs 0 0
24 2021-10-22 Los Angeles Lakers Phoenix Suns 0 0
25 2021-10-22 Sacramento Kings Utah Jazz 0 0
26 2021-10-23 Cleveland Cavaliers Atlanta Hawks 1 0
27 2021-10-23 Indiana Pacers Miami Heat 1 0
28 2021-10-23 Toronto Raptors Dallas Mavericks 1 0
29 2021-10-23 Chicago Bulls Detroit Pistons 1 0
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.