How to iterate over a pandas DataFrame multi-index and filter based on another column's value

Question

I am trying to iterate over a multi-index on the following DataFrame.

Essentially, I want to reduce the DataFrame to the top QB, top 2 RBs, top 3 WRs, and top TE for each NFL team, ranked by their values in the "FantasyPoints" column. I have been trying for hours to figure out how to do this but can't come up with a solution. I tried using groupby with no luck, and figured I might have to iterate over the multi-index, but I haven't figured that out either. Thanks in advance to anyone who can help.

Below is the code used to generate the DataFrame in its existing state. The CSV file it reads is here: https://drive.google.com/file/d/1hX1Jmjk4RBxsH8tt8g1tqwKqrjZkFZp_/view?usp=sharing

import pandas as pd

#import our CSV file
df = pd.read_csv('2019.csv')

#drop unnecessary columns
df.drop(['Rk', '2PM', '2PP', 'FantPt', 'DKPt', 'FDPt', 
         'VBD', 'PosRank', 'OvRank', 'PPR', 'Fmb', 
         'GS', 'Age', 'Tgt', 'Y/A', 'Att', 'Att.1', 'Cmp', 'Y/R'], axis=1, inplace=True)

#fix name formatting
df['Player'] = df['Player'].apply(lambda x: x.split('*')[0]).apply(lambda x: x.split('\\')[0])

#rename columns
df.rename({
    'TD': 'PassingTD',
    'TD.1': 'RushingTD',
    'TD.2': 'ReceivingTD',
    'TD.3': 'TotalTD',
    'Yds': 'PassingYDs',
    'Yds.1': 'RushingYDs',
    'Yds.2': 'ReceivingYDs',
}, axis=1, inplace=True)

df['FantasyPoints'] = (df['PassingYDs']*0.04 + df['PassingTD']*4 - df['Int']*2 + df['RushingYDs']*.1 
                       + df['RushingTD']*6 + df['Rec']*1 + df['ReceivingYDs']*.1 + df['ReceivingTD']*6 - df['FL']*2)

df = df[['Tm', 'FantPos', 'FantasyPoints']]

df = df[df['Tm'] != '2TM']
df = df[df['Tm'] != '3TM']

df.set_index(['Tm', 'FantPos'], inplace=True)
df = df.sort_index()
df.head(30)

Answer 1

Why use a multi-index at all? You can set up a dictionary that maps each position to the number of rows to keep, then iterate through it and grab the top n rows for each position:

import pandas as pd

#import our CSV file
df = pd.read_csv('2019.csv')

#drop unnecessary columns
df.drop(['Rk', '2PM', '2PP', 'FantPt', 'DKPt', 'FDPt', 
         'VBD', 'PosRank', 'OvRank', 'PPR', 'Fmb', 
         'GS', 'Age', 'Tgt', 'Y/A', 'Att', 'Att.1', 'Cmp', 'Y/R'], axis=1, inplace=True)

#fix name formatting
df['Player'] = df['Player'].apply(lambda x: x.split('*')[0]).apply(lambda x: x.split('\\')[0])

#rename columns
df.rename({
    'TD': 'PassingTD',
    'TD.1': 'RushingTD',
    'TD.2': 'ReceivingTD',
    'TD.3': 'TotalTD',
    'Yds': 'PassingYDs',
    'Yds.1': 'RushingYDs',
    'Yds.2': 'ReceivingYDs',
}, axis=1, inplace=True)

df['FantasyPoints'] = (df['PassingYDs']*0.04 + df['PassingTD']*4 - df['Int']*2 + df['RushingYDs']*.1 
                       + df['RushingTD']*6 + df['Rec']*1 + df['ReceivingYDs']*.1 + df['ReceivingTD']*6 - df['FL']*2)

df = df[['Tm', 'FantPos', 'FantasyPoints']]

df = df[df['Tm'] != '2TM']
df = df[df['Tm'] != '3TM']


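#number of players to keep at each position
dictionary = {'QB':1,'RB':2,'WR':3,'TE':1}

#DataFrame.append was removed in pandas 2.0, so collect the pieces and concatenate once
pieces = [df[df['FantPos'] == pos].nlargest(n, columns='FantasyPoints') for pos, n in dictionary.items()]
#sort_index(axis=1) orders the columns alphabetically, as append(..., sort=True) did
results_df = pd.concat(pieces, ignore_index=True).sort_index(axis=1)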

Output:

print (results_df)

  FantPos  FantasyPoints   Tm
0      QB         415.68  BAL
1      RB         469.20  CAR
2      RB         314.80  GNB
3      WR         374.60  NOR
4      WR         274.10  TAM
5      WR         274.10  ATL
6      TE         254.30  KAN
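
Note that the output above is the top n at each position across the whole league, while the question asks for the top n within each team. One way to get the per-team version is to rank players inside each (Tm, FantPos) group and keep the rows whose rank is below the limit for their position. The following is only a minimal sketch: the small DataFrame is a hypothetical stand-in for the cleaned data (the values are invented), and the names limits, ranked, rank and top_per_team are mine, not from the original code.

```python
import pandas as pd

# Hypothetical stand-in for the cleaned DataFrame (columns Tm, FantPos, FantasyPoints).
# The numbers are invented for illustration and are not taken from 2019.csv.
df = pd.DataFrame({
    'Tm': ['BAL', 'BAL', 'BAL', 'BAL', 'BAL', 'CAR', 'CAR', 'CAR', 'CAR'],
    'FantPos': ['QB', 'QB', 'RB', 'RB', 'RB', 'QB', 'RB', 'WR', 'WR'],
    'FantasyPoints': [400.0, 50.0, 200.0, 150.0, 90.0, 220.0, 460.0, 180.0, 120.0],
})

# Number of players to keep at each position, per team
limits = {'QB': 1, 'RB': 2, 'WR': 3, 'TE': 1}

# Sort best-first, then number the rows 0, 1, 2, ... inside each (team, position) group
ranked = df.sort_values('FantasyPoints', ascending=False)
rank = ranked.groupby(['Tm', 'FantPos']).cumcount()

# Keep a row only if its within-group rank is below the limit for its position.
# Positions missing from limits map to NaN, so the comparison is False and they are dropped.
keep = rank < ranked['FantPos'].map(limits)
top_per_team = (ranked[keep]
                .sort_values(['Tm', 'FantPos', 'FantasyPoints'],
                             ascending=[True, True, False])
                .reset_index(drop=True))

print(top_per_team)
```

If the DataFrame already carries the ('Tm', 'FantPos') multi-index from the question, the same grouping can be written as groupby(level=['Tm', 'FantPos']), so no explicit iteration over the index should be needed.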
