
Using Dplyr group_by and summarize with pandas

I am trying to create a separate pandas DataFrame in Python using pandas' .groupby function. I am working with basketball data and want to create columns that show whether the home and away teams are on the tail end of a back-to-back.

A 0 in the yesterday_home_team or yesterday_away_team column indicates that the corresponding team did not play the previous night.

Given that there are multiple games each night, the .groupby function should be used.

Input Data:

date     home_team    away_team    
9/22/22  LAL          DET          
9/23/22  LAC          LAL         

Desired output:

date     home_team    away_team    yesterday_home_team    yesterday_away_team
9/21/22  LAL          MIN          0                      MIN 
9/22/22  LAL          DET          DET                    0
9/23/22  LAC          LAL          LAL                    LAC

Appreciate your assistance.

Your output example doesn't make sense to me. Do you need the team names in the 'yesterday_home_team' and 'yesterday_away_team' columns? Would it be sufficient to have a 1 if the home team is on a back-to-back and a 0 if it is not (with the same logic for the away team)? It's also tough when you don't provide a good sample dataset.

Anyways, here's my solution, which just indicates with a 1 or 0 whether the given team is on the back end of a back-to-back:

import pandas as pd
import numpy as np
# Monthly schedule pages for the 2021-22 season on basketball-reference
months = ['October', 'November', 'December', 'January', 'February', 'March', 'April', 'May', 'June']

# Read the schedule table from each monthly page, then stack them into one DataFrame
dfs = []
for month in months:
    month = month.lower()
    url = f'https://www.basketball-reference.com/leagues/NBA_2022_games-{month}.html'
    df = pd.read_html(url)[0]
    df['Date'] = pd.to_datetime(df['Date'])
    dfs.append(df)

df = pd.concat(dfs)
df = df.rename(columns={'Visitor/Neutral':'away_team', 'Home/Neutral':'home_team'})

# Reshape to long format: one row per team per game
df_melt = pd.melt(df, id_vars=['Date'],
        value_vars=['away_team', 'home_team'],
        var_name='Home_Away',
        value_name='Team')


# Days since each team's previous game; exactly 1 means the team played yesterday
df_melt = df_melt.sort_values('Date').reset_index(drop=True)
df_melt['days_between'] = df_melt.groupby('Team')['Date'].diff().dt.days
df_melt['yesterday'] = np.where(df_melt['days_between'] == 1, 1, 0)
df_melt = df_melt.drop(['days_between', 'Home_Away'], axis=1)


# Attach the flag to the original games twice: keyed on the home team, then on the away team
df = df.merge(df_melt.rename(columns={'Team':'home_team', 'yesterday':'yesterday_home_team'}), how='left', on=['Date', 'home_team'])
df = df.merge(df_melt.rename(columns={'Team':'away_team', 'yesterday':'yesterday_away_team'}), how='left', on=['Date', 'away_team'])


# Keep only the columns of interest
df = df[['Date', 'home_team', 'away_team', 'yesterday_home_team', 'yesterday_away_team']]
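
For anyone who wants to try the same melt / groupby / diff / merge idea without hitting basketball-reference, here is a minimal offline sketch. The four-game schedule and the helper name add_back_to_back_flags are made up for illustration and are not part of the scraped data; the sketch also assumes each team plays at most once per date.

```python
import numpy as np
import pandas as pd

# Hypothetical toy schedule (invented for illustration, not scraped data)
games = pd.DataFrame({
    "Date": pd.to_datetime(["2022-09-21", "2022-09-22", "2022-09-22", "2022-09-23"]),
    "home_team": ["LAL", "LAL", "BOS", "LAC"],
    "away_team": ["MIN", "DET", "MIN", "LAL"],
})


def add_back_to_back_flags(games):
    """Return a copy of games with 1/0 flags for teams that also played the day before."""
    # Long format: one row per team per game
    long = games.melt(id_vars="Date", value_vars=["home_team", "away_team"],
                      value_name="Team")
    long = long.sort_values("Date").reset_index(drop=True)

    # Days since each team's previous game (NaN for a team's first game)
    days = long.groupby("Team")["Date"].diff().dt.days
    long["yesterday"] = np.where(days == 1, 1, 0)
    flags = long[["Date", "Team", "yesterday"]]

    # Merge the flag back once per side of the matchup
    out = games.copy()
    for side in ("home_team", "away_team"):
        renamed = flags.rename(columns={"Team": side, "yesterday": f"yesterday_{side}"})
        out = out.merge(renamed, how="left", on=["Date", side])
    return out


result = add_back_to_back_flags(games)
print(result.to_string())
```

In this toy schedule LAL plays on three consecutive days and MIN on two, so those are the only teams that get flagged.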

Output:

print(df.head(30).to_string())
         Date               home_team              away_team  yesterday_home_team  yesterday_away_team
0  2021-10-19         Milwaukee Bucks          Brooklyn Nets                    0                    0
1  2021-10-19      Los Angeles Lakers  Golden State Warriors                    0                    0
2  2021-10-20       Charlotte Hornets         Indiana Pacers                    0                    0
3  2021-10-20         Detroit Pistons          Chicago Bulls                    0                    0
4  2021-10-20         New York Knicks         Boston Celtics                    0                    0
5  2021-10-20         Toronto Raptors     Washington Wizards                    0                    0
6  2021-10-20       Memphis Grizzlies    Cleveland Cavaliers                    0                    0
7  2021-10-20  Minnesota Timberwolves        Houston Rockets                    0                    0
8  2021-10-20    New Orleans Pelicans     Philadelphia 76ers                    0                    0
9  2021-10-20       San Antonio Spurs          Orlando Magic                    0                    0
10 2021-10-20               Utah Jazz  Oklahoma City Thunder                    0                    0
11 2021-10-20  Portland Trail Blazers       Sacramento Kings                    0                    0
12 2021-10-20            Phoenix Suns         Denver Nuggets                    0                    0
13 2021-10-21           Atlanta Hawks       Dallas Mavericks                    0                    0
14 2021-10-21              Miami Heat        Milwaukee Bucks                    0                    0
15 2021-10-21   Golden State Warriors   Los Angeles Clippers                    0                    0
16 2021-10-22           Orlando Magic        New York Knicks                    0                    0
17 2021-10-22      Washington Wizards         Indiana Pacers                    0                    0
18 2021-10-22     Cleveland Cavaliers      Charlotte Hornets                    0                    0
19 2021-10-22          Boston Celtics        Toronto Raptors                    0                    0
20 2021-10-22      Philadelphia 76ers          Brooklyn Nets                    0                    0
21 2021-10-22         Houston Rockets  Oklahoma City Thunder                    0                    0
22 2021-10-22           Chicago Bulls   New Orleans Pelicans                    0                    0
23 2021-10-22          Denver Nuggets      San Antonio Spurs                    0                    0
24 2021-10-22      Los Angeles Lakers           Phoenix Suns                    0                    0
25 2021-10-22        Sacramento Kings              Utah Jazz                    0                    0
26 2021-10-23     Cleveland Cavaliers          Atlanta Hawks                    1                    0
27 2021-10-23          Indiana Pacers             Miami Heat                    1                    0
28 2021-10-23         Toronto Raptors       Dallas Mavericks                    1                    0
29 2021-10-23           Chicago Bulls        Detroit Pistons                    1                    0
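
One possible way to sanity-check flags like these is to skip the diff entirely and ask directly whether the same team appears in any game dated one day earlier. The sketch below does that with a plain set lookup; the toy schedule and the function name played_previous_day are hypothetical, and this is a cross-check under the assumption of at most one game per team per date rather than a replacement for the groupby approach.

```python
import pandas as pd

# Hypothetical toy schedule (invented for illustration, not scraped data)
games = pd.DataFrame({
    "Date": pd.to_datetime(["2022-09-21", "2022-09-22", "2022-09-22", "2022-09-23"]),
    "home_team": ["LAL", "LAL", "BOS", "LAC"],
    "away_team": ["MIN", "DET", "MIN", "LAL"],
})


def played_previous_day(games, side):
    """Return a list of 1/0 flags: did the team in column `side` play one day earlier?"""
    # Every (date, team) pair that appears anywhere in the schedule
    played = set(zip(games["Date"], games["home_team"])) | \
             set(zip(games["Date"], games["away_team"]))
    one_day = pd.Timedelta(days=1)
    return [int((date - one_day, team) in played)
            for date, team in zip(games["Date"], games[side])]


check = games.copy()
check["yesterday_home_team"] = played_previous_day(games, "home_team")
check["yesterday_away_team"] = played_previous_day(games, "away_team")
print(check.to_string())
```

If both approaches are run on the same schedule, the two pairs of columns should agree whenever a team has no more than one game on a given date.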
