I'm trying to calculate the win-streak or losing-streak going into a game. My goal is to generate a betting decision based on these streak factors or a recent record. I am new to Python and Pandas (and programming in general), so any detailed explanation of what code does would be welcome.
Here's my data:
|   | Season | Game Date | Game Index | Away Team | Away Score | Home Team | Home Score | Winner | Loser |
| 0 | 2014 Regular Season | Saturday, March 22, 2014 | 2014032201 | Los Angeles Dodgers | 3 | Arizona D'Backs | 1 | Los Angeles Dodgers | Arizona D'Backs |
| 1 | 2014 Regular Season | Sunday, March 23, 2014 | 2014032301 | Los Angeles Dodgers | 7 | Arizona D'Backs | 5 | Los Angeles Dodgers | Arizona D'Backs |
| 2 | 2014 Regular Season | Sunday, March 30, 2014 | 2014033001 | Los Angeles Dodgers | 1 | San Diego Padres | 3 | San Diego Padres | Los Angeles Dodgers |
| 3 | 2014 Regular Season | Monday, March 31, 2014 | 2014033101 | Seattle Mariners | 10 | Los Angeles Angels | 3 | Seattle Mariners | Los Angeles Angels |
| 4 | 2014 Regular Season | Monday, March 31, 2014 | 2014033102 | San Francisco Giants | 9 | Arizona D'Backs | 8 | San Francisco Giants | Arizona D'Backs |
| 5 | 2014 Regular Season | Monday, March 31, 2014 | 2014033103 | Boston Red Sox | 1 | Baltimore Orioles | 2 | Baltimore Orioles | Boston Red Sox |
| 6 | 2014 Regular Season | Monday, March 31, 2014 | 2014033104 | Minnesota Twins | 3 | Chicago White Sox | 5 | Chicago White Sox | Minnesota Twins |
| 7 | 2014 Regular Season | Monday, March 31, 2014 | 2014033105 | St. Louis Cardinals | 1 | Cincinnati Reds | 0 | St. Louis Cardinals | Cincinnati Reds |
| 8 | 2014 Regular Season | Monday, March 31, 2014 | 2014033106 | Kansas City Royals | 3 | Detroit Tigers | 4 | Detroit Tigers | Kansas City Royals |
| 9 | 2014 Regular Season | Monday, March 31, 2014 | 2014033107 | Colorado Rockies | 1 | Miami Marlins | 10 | Miami Marlins | Colorado Rockies |
Dictionary below:
{'Away Score': {0: 3, 1: 7, 2: 1, 3: 10, 4: 9},
'Away Team': {0: 'Los Angeles Dodgers',
1: 'Los Angeles Dodgers',
2: 'Los Angeles Dodgers',
3: 'Seattle Mariners',
4: 'San Francisco Giants'},
'Game Date': {0: 'Saturday, March 22, 2014',
1: 'Sunday, March 23, 2014',
2: 'Sunday, March 30, 2014',
3: 'Monday, March 31, 2014',
4: 'Monday, March 31, 2014'},
'Game Index': {0: 2014032201,
1: 2014032301,
2: 2014033001,
3: 2014033101,
4: 2014033102},
'Home Score': {0: 1, 1: 5, 2: 3, 3: 3, 4: 8},
'Home Team': {0: "Arizona D'Backs",
1: "Arizona D'Backs",
2: 'San Diego Padres',
3: 'Los Angeles Angels',
4: "Arizona D'Backs"},
'Loser': {0: "Arizona D'Backs",
1: "Arizona D'Backs",
2: 'Los Angeles Dodgers',
3: 'Los Angeles Angels',
4: "Arizona D'Backs"},
'Season': {0: '2014 Regular Season',
1: '2014 Regular Season',
2: '2014 Regular Season',
3: '2014 Regular Season',
4: '2014 Regular Season'},
'Winner': {0: 'Los Angeles Dodgers',
1: 'Los Angeles Dodgers',
2: 'San Diego Padres',
3: 'Seattle Mariners',
4: 'San Francisco Giants'}}
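To reproduce the frame from a dictionary like this, something along these lines should work. This is only a minimal sketch: `data` holds just the first three games and a few of the columns, and `Parsed Date` is a column name made up for illustration.

```python
import pandas as pd

# First three games from the dictionary above, trimmed to a few columns.
data = {
    'Game Date': {0: 'Saturday, March 22, 2014',
                  1: 'Sunday, March 23, 2014',
                  2: 'Sunday, March 30, 2014'},
    'Game Index': {0: 2014032201, 1: 2014032301, 2: 2014033001},
    'Away Team': {0: 'Los Angeles Dodgers',
                  1: 'Los Angeles Dodgers',
                  2: 'Los Angeles Dodgers'},
    'Home Team': {0: "Arizona D'Backs",
                  1: "Arizona D'Backs",
                  2: 'San Diego Padres'},
    'Winner': {0: 'Los Angeles Dodgers',
               1: 'Los Angeles Dodgers',
               2: 'San Diego Padres'},
}

# A dict of {column: {row_label: value}} maps directly onto a DataFrame.
sample = pd.DataFrame(data)

# 'Game Date' is a plain string; parse it so games can be ordered by date.
# 'Parsed Date' is a hypothetical column name, not part of the original data.
sample['Parsed Date'] = pd.to_datetime(sample['Game Date'], format='%A, %B %d, %Y')
```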
I've tried looping through each season and each team, and then creating a streak count, based on this GitHub project: https://github.com/nhcamp/EPL-Betting/blob/master/EPL%20Match%20Results%20DF.ipynb
I run into key errors early in building my loops, and I have trouble identifying data.
game_table = pd.read_csv('MLB_Scores_2014_2018.csv')
# Get Team List
team_list = game_table['Away Team'].unique()
# Get Season List
season_list = game_table['Season'].unique()
# Defining "chunks" to append gamedata to the total dataframe
chunks = []
for season in season_list:
    # Looping through seasons. Streaks reset for each season
    season_games = game_table[game_table['Season'] == season]
    for team in team_list:
        # Looping through teams
        season_team_games = season_games[(season_games['Away Team'] == team | season_games['Home Team'] == team)]
        # Setting streak list and streak counter values
        streak_list = []
        streak = 0
        # Looping through each game
        for game in season_team_games.iterrow():
            # Check if team is a winner, and up the streak
            if game_table['Winner'] == team:
                streak_list.append(streak)
                streak += 1
            # If not the winner, append streak and set to zero
            elif game_table['Winner'] != team:
                streak_list.append(streak)
                streak = 0
            # Just in case something weird happens with the scores
            else:
                streak_list.append(streak)
        game_table['Streak'] = streak_list
        chunk_list.append(game_table)
And that's kind of where I lose it. How do I append separately if each team is the home team or the away team? Is there a better way to display this data?
As a general matter, I want to add a win-streak and/or losing-streak for each team in each game. Headers would look like this:
| Season | Game Date | Game Index | Away Team | Away Score | Home Team | Home Score | Winner | Loser | Away Win Streak | Away Lose Streak | Home Win Streak | Home Lose Streak |
Edit: this error message has been resolved
I also get an error when creating the dataframe 'season_team_games':
TypeError: cannot compare a dtyped [object] array with a scalar of type [bool]
The error you are seeing comes from this statement:
season_team_games = season_games[(season_games['Away Team'] == team | season_games['Home Team'] == team)]
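When you combine two boolean conditions, you need to wrap each one in parentheses. This is because the | operator takes precedence over the == operator, so without the parentheses Python evaluates team | season_games['Home Team'] first, which is what raises the TypeError. So this should become: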
season_team_games = season_games[(season_games['Away Team'] == team) | (season_games['Home Team'] == team)]
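To see the fix in isolation, here is a small sketch on a made-up three-row frame (the rows are borrowed from the question; `is_away` and `is_home` are names I introduced). Naming each mask first makes the precedence issue impossible to hit:

```python
import pandas as pd

# Tiny stand-in for one season of games.
season_games = pd.DataFrame({
    'Away Team': ['Los Angeles Dodgers', 'Seattle Mariners', 'San Francisco Giants'],
    'Home Team': ["Arizona D'Backs", 'Los Angeles Angels', "Arizona D'Backs"],
})
team = "Arizona D'Backs"

# Build each boolean mask separately, then combine them with |.
is_away = season_games['Away Team'] == team
is_home = season_games['Home Team'] == team

# Keeps every game in which the team played, home or away.
season_team_games = season_games[is_away | is_home]
```

Writing `season_games[(season_games['Away Team'] == team) | (season_games['Home Team'] == team)]` is equivalent; the named masks are just easier to read and reuse.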
I know there is more to the question than this error, but as mentioned in the comment, once you provide some text-based data, it might be easier to help.
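In the meantime, here is a rough sketch of how the streak columns could be built without nested loops, assuming the rows shown in the question are representative. It rests on two assumptions: `Game Index` sorts games chronologically, and streaks reset each season (as the comments in the question's loop suggest). The helper names `streak_before` and `add_streaks` are my own. The idea is to reshape the table to one row per team per game, count runs of wins or losses within each season and team, shift by one game so the value reflects the streak going into the game, and map the result back onto the home and away sides. Note that the loop in the question also calls `iterrow()` (the method is `iterrows()`) and compares `game_table['Winner']` rather than the current row, so it would need more than the parentheses fix.

```python
import pandas as pd

def streak_before(flag, keys):
    """Length of the run of True values in `flag` going into each row, per group."""
    # Every False ends the current run, so a running count of Falses labels each run.
    run_id = (~flag).astype(int).groupby(keys).cumsum()
    # Count the Trues seen so far inside the current run.
    running = flag.astype(int).groupby(keys + [run_id]).cumsum()
    # Shift by one game within each group: the streak *before* this game is played.
    return running.groupby(keys).shift(1, fill_value=0)

def add_streaks(games):
    # Assumption: 'Game Index' (YYYYMMDDnn) orders games chronologically.
    games = games.sort_values('Game Index').reset_index(drop=True)

    # Reshape to one row per team per game.
    sides = []
    for side in ('Away', 'Home'):
        sides.append(pd.DataFrame({
            'Season': games['Season'],
            'Game Index': games['Game Index'],
            'Side': side,
            'Team': games[f'{side} Team'],
            'Won': games['Winner'] == games[f'{side} Team'],
        }))
    long = pd.concat(sides, ignore_index=True)
    long = long.sort_values('Game Index').reset_index(drop=True)

    # Streaks are tracked per team and reset every season.
    keys = [long['Season'], long['Team']]
    long['Win Streak'] = streak_before(long['Won'], keys)
    long['Lose Streak'] = streak_before(~long['Won'], keys)

    # Map the per-team values back onto the original one-row-per-game layout.
    for side in ('Away', 'Home'):
        part = long[long['Side'] == side].set_index('Game Index')
        games[f'{side} Win Streak'] = games['Game Index'].map(part['Win Streak'])
        games[f'{side} Lose Streak'] = games['Game Index'].map(part['Lose Streak'])
    return games

# The five games from the question's dictionary.
sample = pd.DataFrame({
    'Season': ['2014 Regular Season'] * 5,
    'Game Index': [2014032201, 2014032301, 2014033001, 2014033101, 2014033102],
    'Away Team': ['Los Angeles Dodgers', 'Los Angeles Dodgers', 'Los Angeles Dodgers',
                  'Seattle Mariners', 'San Francisco Giants'],
    'Home Team': ["Arizona D'Backs", "Arizona D'Backs", 'San Diego Padres',
                  'Los Angeles Angels', "Arizona D'Backs"],
    'Winner': ['Los Angeles Dodgers', 'Los Angeles Dodgers', 'San Diego Padres',
               'Seattle Mariners', 'San Francisco Giants'],
})
result = add_streaks(sample)
```

On these five games the Dodgers go into their third game on a two-game win streak, and Arizona goes into its third game on a two-game losing streak. A "recent record" feature could be derived from the same long-format table, for example with a rolling window over `Won`, but that is beyond this sketch.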