I'm trying to use the scikit-learn library to encode team names (strings) so that I can run a regression analysis and predict the winner. The problem is the Toss_Winner encoding: in the output, the toss winner is coded as 12 while the two competing teams are coded as 6 and 11.
I'm using a public IPL dataset and I'm new to data science, so I'd appreciate simple explanations. :)
Code used:
from sklearn import preprocessing
encoder= preprocessing.LabelEncoder()
matchdf["Team1"]=encoder.fit_transform(matchdf["Team1"])
matchdf["Team2"]=encoder.fit_transform(matchdf["Team2"])
matchdf["match_winner"]=encoder.fit_transform(matchdf["match_winner"])
matchdf["Toss_Winner"]=encoder.fit_transform(matchdf["Toss_Winner"])
The intent is then to relate the other columns to Team1 and Team2, as in the code below, and then to build, train, and test the model:
matchdf.loc[matchdf["match_winner"]==matchdf["Team1"],"Team1_winning"]=1
matchdf.loc[matchdf["match_winner"]!=matchdf["Team1"],"Team1_winning"]=0
# Team1_toss_winning is 1 if Team1 won the toss, otherwise 0
matchdf.loc[matchdf["Toss_Winner"]==matchdf["Team1"],"Team1_toss_winning"]=1
matchdf.loc[matchdf["Toss_Winner"]!=matchdf["Team1"],"Team1_toss_winning"]=0
I don't quite follow your use of the fit_transform method of LabelEncoder: as far as I know, each fit discards the previously learned labels, so every fit_transform call builds a fresh mapping from only the values in that one column. The same team can therefore get a different code in Team1, Team2, match_winner and Toss_Winner. I can't say whether anything else is wrong. Maybe your input already has the problem you show, that is, the winner is not one of the two participants? Maybe it is an ill-formatted string (with trailing spaces or something similar)?
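To make the refitting point concrete, here is a minimal sketch on made-up data (the team names "A", "B", "C" and the tiny DataFrame are placeholders, not the real IPL file). It only shows that calling fit_transform on each column separately can give the same team two different codes:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data; the names are placeholders for real team names.
df = pd.DataFrame({
    "Team1": ["A", "B", "C"],
    "Toss_Winner": ["B", "B", "C"],
})

encoder = LabelEncoder()

# First fit: the encoder learns the classes A, B, C.
team1_codes = encoder.fit_transform(df["Team1"])
team1_classes = list(encoder.classes_)

# Second fit: the previous classes are forgotten, only B and C are learned.
toss_codes = encoder.fit_transform(df["Toss_Winner"])
toss_classes = list(encoder.classes_)

print(team1_classes, list(team1_codes))
print(toss_classes, list(toss_codes))
```

In this toy case "B" is encoded as 1 in Team1 but as 0 in Toss_Winner, so comparing the two encoded columns no longer tells you whether Team1 won the toss.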
So I propose instead to first fit the LabelEncoder with all possible labels, then transform the columns:
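import pandas as pd
from sklearn import preprocessing
encoder = preprocessing.LabelEncoder()
# Collect every team name that appears in either team column
team_values = matchdf[["Team1", "Team2"]].values.ravel()
unique_team_values = pd.unique(team_values)
# Fit once so that all columns share the same name-to-code mapping
encoder.fit(unique_team_values)
# transform raises ValueError if a column holds a label not seen during fit
matchdf["Team1"] = encoder.transform(matchdf["Team1"].values)
matchdf["Team2"] = encoder.transform(matchdf["Team2"].values)
matchdf["match_winner"] = encoder.transform(matchdf["match_winner"].values)
matchdf["Toss_Winner"] = encoder.transform(matchdf["Toss_Winner"].values)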
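As a rough end-to-end check, the sketch below applies one shared encoder to a small made-up matchdf and then builds the two indicator columns from the question. The data is invented for illustration, and fitting on all four columns (rather than only Team1 and Team2) is my own variation, so that a label appearing only in match_winner or Toss_Winner would not fail at transform time:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data with the same column names as in the question.
matchdf = pd.DataFrame({
    "Team1": ["A", "B", "C"],
    "Team2": ["B", "C", "A"],
    "Toss_Winner": ["B", "B", "A"],
    "match_winner": ["A", "C", "A"],
})

team_cols = ["Team1", "Team2", "Toss_Winner", "match_winner"]

# Fit once on every label seen in any team column.
encoder = LabelEncoder()
encoder.fit(pd.unique(matchdf[team_cols].values.ravel()))

# Reuse the same fitted mapping for all columns.
for col in team_cols:
    matchdf[col] = encoder.transform(matchdf[col])

# With a shared mapping, equal codes mean the same team.
matchdf["Team1_winning"] = (matchdf["match_winner"] == matchdf["Team1"]).astype(int)
matchdf["Team1_toss_winning"] = (matchdf["Toss_Winner"] == matchdf["Team1"]).astype(int)

print(matchdf)
```

The boolean comparison followed by astype(int) does the same job as the two .loc assignments per column in the question. The comparison would also work on the raw strings, before any encoding.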