I am trying to iterate over a multi-index in the following DataFrame.
Essentially, I want to reduce the DataFrame to the top QB, top 2 RBs, top 3 WRs, and top TE for each NFL team, ranked by the values in the "FantasyPoints" column. I have spent hours on this without finding a solution. I tried groupby with no luck, and figured I might have to iterate over the multi-index, but I haven't worked that out either. Thanks in advance to anyone who can help. Below is the code that generates the DataFrame in its current state. The CSV file it reads is here: https://drive.google.com/file/d/1hX1Jmjk4RBxsH8tt8g1tqwKqrjZkFZp_/view?usp=sharing
import pandas as pd
#import our CSV file
df = pd.read_csv('2019.csv')
#drop unnecessary columns
df.drop(['Rk', '2PM', '2PP', 'FantPt', 'DKPt', 'FDPt',
'VBD', 'PosRank', 'OvRank', 'PPR', 'Fmb',
'GS', 'Age', 'Tgt', 'Y/A', 'Att', 'Att.1', 'Cmp', 'Y/R'], axis=1, inplace=True)
#fix name formatting
df['Player'] = df['Player'].apply(lambda x: x.split('*')[0]).apply(lambda x: x.split('\\')[0])
#rename columns
df.rename({
'TD': 'PassingTD',
'TD.1': 'RushingTD',
'TD.2': 'ReceivingTD',
'TD.3': 'TotalTD',
'Yds': 'PassingYDs',
'Yds.1': 'RushingYDs',
'Yds.2': 'ReceivingYDs',
}, axis=1, inplace=True)
df['FantasyPoints'] = (df['PassingYDs']*0.04 + df['PassingTD']*4 - df['Int']*2 + df['RushingYDs']*.1
+ df['RushingTD']*6 + df['Rec']*1 + df['ReceivingYDs']*.1 + df['ReceivingTD']*6 - df['FL']*2)
df = df[['Tm', 'FantPos', 'FantasyPoints']]
df = df[df['Tm'] != '2TM']
df = df[df['Tm'] != '3TM']
df.set_index(['Tm', 'FantPos'], inplace=True)
df = df.sort_index()
df.head(30)
Why use a multi-index at all? You can set up a dictionary that maps each position to the number of players to keep, then iterate through it and grab the top n rows for each position:
import pandas as pd
#import our CSV file
df = pd.read_csv('2019.csv')
#drop unnecessary columns
df.drop(['Rk', '2PM', '2PP', 'FantPt', 'DKPt', 'FDPt',
'VBD', 'PosRank', 'OvRank', 'PPR', 'Fmb',
'GS', 'Age', 'Tgt', 'Y/A', 'Att', 'Att.1', 'Cmp', 'Y/R'], axis=1, inplace=True)
#fix name formatting
df['Player'] = df['Player'].apply(lambda x: x.split('*')[0]).apply(lambda x: x.split('\\')[0])
#rename columns
df.rename({
'TD': 'PassingTD',
'TD.1': 'RushingTD',
'TD.2': 'ReceivingTD',
'TD.3': 'TotalTD',
'Yds': 'PassingYDs',
'Yds.1': 'RushingYDs',
'Yds.2': 'ReceivingYDs',
}, axis=1, inplace=True)
df['FantasyPoints'] = (df['PassingYDs']*0.04 + df['PassingTD']*4 - df['Int']*2 + df['RushingYDs']*.1
+ df['RushingTD']*6 + df['Rec']*1 + df['ReceivingYDs']*.1 + df['ReceivingTD']*6 - df['FL']*2)
df = df[['Tm', 'FantPos', 'FantasyPoints']]
df = df[df['Tm'] != '2TM']
df = df[df['Tm'] != '3TM']
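#number of players to keep at each position
dictionary = {'QB':1,'RB':2,'WR':3,'TE':1}
frames = []
for pos, n in dictionary.items():
    #DataFrame.append was removed in pandas 2.0, so collect the pieces and concatenate once
    frames.append(df[df['FantPos'] == pos].nlargest(n, columns='FantasyPoints'))
#sort the columns alphabetically, as append(..., sort=True) did
results_df = pd.concat(frames).sort_index(axis=1).reset_index(drop=True)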
Output:
print(results_df)
  FantPos  FantasyPoints   Tm
0      QB         415.68  BAL
1      RB         469.20  CAR
2      RB         314.80  GNB
3      WR         374.60  NOR
4      WR         274.10  TAM
5      WR         274.10  ATL
6      TE         254.30  KAN
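Note that the loop above ranks players across the whole league, which is why each row in the output comes from a different team. If the goal is the top players at each position for each team, one possible approach is to rank within each (Tm, FantPos) group and keep the rows whose rank is within the limit for that position. This is only a minimal sketch: it assumes df has the three columns Tm, FantPos and FantasyPoints as built above, the sample rows are made up for illustration (they are not taken from 2019.csv), and the names top_n and per_team are my own.

```python
import pandas as pd

# Made-up sample rows, only to show the shape of the data (not from 2019.csv)
df = pd.DataFrame({
    'Tm': ['BAL', 'BAL', 'BAL', 'BAL', 'CAR', 'CAR', 'CAR'],
    'FantPos': ['QB', 'QB', 'RB', 'RB', 'RB', 'RB', 'RB'],
    'FantasyPoints': [400.0, 20.0, 200.0, 150.0, 450.0, 60.0, 30.0],
})

# number of players to keep at each position, per team
top_n = {'QB': 1, 'RB': 2, 'WR': 3, 'TE': 1}

# rank players within each (team, position) group; 1 = most points,
# ties are broken by row order because of method='first'
rank = df.groupby(['Tm', 'FantPos'])['FantasyPoints'].rank(method='first', ascending=False)

# keep a row when its rank is within the limit for its position;
# positions missing from top_n map to NaN and are dropped
keep = rank <= df['FantPos'].map(top_n)

per_team = (df[keep]
            .sort_values(['Tm', 'FantPos', 'FantasyPoints'], ascending=[True, True, False])
            .reset_index(drop=True))
print(per_team)
```

With the real data you would skip the sample DataFrame and run the last three steps on df directly, before calling set_index. No multi-index or explicit iteration is needed for this.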