Matching names in pandas dataframe and updating the target value

I have a dataframe that looks like this

               Player   Pos Team    Opp Def v Pos  Salary  StdDev Own  My Proj  Target_Score
0          Jared Goff  CPTN  LAR  vs NE       7th   16500     9.0  0%    27.41           NaN
1          Cam Newton  CPTN   NE  @ LAR       2nd   15900     8.9  0%    26.75           NaN
2        Robert Woods  CPTN  LAR  vs NE       8th   13800     8.6  0%    22.68           NaN
3         Cooper Kupp  CPTN  LAR  vs NE       8th   14400     9.2  0%    21.84           NaN
4       Jakobi Meyers  CPTN   NE  @ LAR       2nd   10200     7.0  0%    21.59           NaN
5          Jared Goff    QB  LAR  vs NE       7th   11000     9.0  0%    18.27          33.2
6          Cam Newton    QB   NE  @ LAR       2nd   10600     8.9  0%    17.83          33.2
7           Cam Akers  CPTN  LAR  vs NE      22nd   12000     4.8  0%    17.82           NaN
8       Damien Harris  CPTN   NE  @ LAR       4th   13200     5.2  0%    16.16           NaN
9        Robert Woods    WR  LAR  vs NE       8th    9200     8.6  0%    15.12          30.7
10        Cooper Kupp    WR  LAR  vs NE       8th    9600     9.2  0%    14.56          30.7
11      Jakobi Meyers    WR   NE  @ LAR       2nd    6800     7.0  0%    14.39          28.1
12       Damiere Byrd  CPTN   NE  @ LAR       2nd    1200     6.6  0%    13.59           NaN
13           Matt Gay  CPTN  LAR  vs NE      25th    5700     3.4  0%    13.35           NaN
14        James White  CPTN   NE  @ LAR       4th    9600     7.5  0%    12.75           NaN
15     Gerald Everett  CPTN  LAR  vs NE       9th    5100     5.2  0%    12.32           NaN
16          Cam Akers    RB  LAR  vs NE      22nd    8000     4.8  0%    11.88          30.0
17       Tyler Higbee  CPTN  LAR  vs NE       9th    7500     6.7  0%    11.60           NaN
18          Nick Folk  CPTN   NE  @ LAR       5th    6000     3.7  0%    11.46           NaN
19               Rams  CPTN  LAR  vs NE      27th    6600     6.8  0%    11.24           NaN
20      Josh Reynolds  CPTN  LAR  vs NE       8th    8400     5.7  0%    11.21           NaN
21       N Keal Harry  CPTN   NE  @ LAR       2nd    2700     4.3  0%    10.85           NaN
22      Damien Harris    RB   NE  @ LAR       4th    8800     5.2  0%    10.77          30.0
23       Damiere Byrd    WR   NE  @ LAR       2nd     800     6.6  0%     9.06          10.0
24           Matt Gay     K  LAR  vs NE      25th    3800     3.4  0%     8.90           NaN
25        James White    RB   NE  @ LAR       4th    6400     7.5  0%     8.50          29.0
26     Gerald Everett    TE  LAR  vs NE       9th    3400     5.2  0%     8.21          24.0
27  Darrell Henderson  CPTN  LAR  vs NE      22nd   11100     7.1  0%     7.92           NaN
28       Tyler Higbee    TE  LAR  vs NE       9th    5000     6.7  0%     7.73          26.3
29          Nick Folk     K   NE  @ LAR       5th    4000     3.7  0%     7.64           NaN
30           Patriots  CPTN   NE  @ LAR      10th    6300     6.8  0%     7.59           NaN
31               Rams   DST  LAR  vs NE      27th    4400     6.8  0%     7.49          15.3
32      Josh Reynolds    WR  LAR  vs NE       8th    5600     5.7  0%     7.47          28.1

I need to iterate over every player, match rows by player name, and then update the Target_Score column.

Example:

Match Jared Goff CPTN and Jared Goff QB

for index, row in player_df.iterrows():
    # if player name with position CPTN == player name with position QB:
    #     CPTN row's Target_Score = QB row's Target_Score * 1.5
    pass  # placeholder

Basically, I need to match every player's CPTN row with the row for their regular position and set the CPTN Target_Score to the regular position's Target_Score * 1.5.

The code below is documented inline:

import pandas as pd

# Some test data
player_df = pd.DataFrame({'Player': ['Jared Goff', 'Cam Newton'] * 2,
                          'Pos': ['CPTN', 'CPTN', 'QB', 'WR'],
                          'Target_Score': [0, 1, 2, 3]})

new_player_df = player_df.copy()
print(new_player_df)

# Cast to float so the scaled scores (e.g. 4.5) fit the column dtype
new_player_df['Target_Score'] = new_player_df['Target_Score'].astype(float)

print("*" * 10)
# Iterate over the players with position CPTN
for index, row in player_df[player_df['Pos'] == 'CPTN'].iterrows():
    # Get the same player's non-CPTN row(s)
    match = player_df[(player_df['Player'] == row['Player']) &
                      (player_df['Pos'] != 'CPTN')]
    # If there is a match
    if len(match) > 0:
        # Replace the score with 1.5x the first match's score
        new_player_df.loc[index, 'Target_Score'] = match.iloc[0]['Target_Score'] * 1.5

print(new_player_df)

Output:

       Player   Pos  Target_Score
0  Jared Goff  CPTN             0
1  Cam Newton  CPTN             1
2  Jared Goff    QB             2
3  Cam Newton    WR             3
**********
       Player   Pos  Target_Score
0  Jared Goff  CPTN           3.0
1  Cam Newton  CPTN           4.5
2  Jared Goff    QB           2.0
3  Cam Newton    WR           3.0
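
The loop above uses `match.iloc[0]`, so if a player had several non-CPTN rows it would silently take the first one. A minimal sketch of a sanity check for that, run on the same test data (the names `non_cptn` and `has_duplicates` are introduced here for illustration only):

```python
import pandas as pd

player_df = pd.DataFrame({'Player': ['Jared Goff', 'Cam Newton'] * 2,
                          'Pos': ['CPTN', 'CPTN', 'QB', 'WR'],
                          'Target_Score': [0, 1, 2, 3]})

# Rows that are not captain rows
non_cptn = player_df[player_df['Pos'] != 'CPTN']

# True if any player appears more than once outside the CPTN position
has_duplicates = non_cptn['Player'].duplicated().any()
print(has_duplicates)
```

If this prints True for your real data, you would need to decide which of the duplicate rows should supply the score before scaling it.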

Select the scores that are your ground truth (the non-CPTN rows), multiply them by 1.5, then map them back onto the CPTN rows by player name. This assumes each player occurs at most once among the CPTN rows and at most once among the non-CPTN rows.

# df is the dataframe from the question (called player_df there)
cptn_selector = df["Pos"] == "CPTN"
other_selector = ~cptn_selector
# .copy() so that adding a column does not write into a slice of df
players_and_scores = df.loc[other_selector, ["Player", "Target_Score"]].copy()
players_and_scores["Target_Score_CPTN"] = players_and_scores["Target_Score"] * 1.5
# Map each player name to the scaled (1.5x) score
name_to_score_mapping = pd.Series(
    players_and_scores["Target_Score_CPTN"].values,
    index=players_and_scores["Player"]
).to_dict()
df.loc[cptn_selector, "Target_Score"] = df.loc[cptn_selector, "Player"].map(name_to_score_mapping)
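
To see the mapping approach end to end, here is a minimal self-contained sketch. The small `df` is a made-up four-row frame (two players, with scores borrowed from the question's table), not the full data, and `name_to_cptn_score` is a name chosen for this example. It passes a Series indexed by player name to `map` instead of a dict, which behaves the same way as long as the names are unique.

```python
import pandas as pd

# Made-up sample: CPTN rows start out as NaN, regular rows carry the scores
df = pd.DataFrame({"Player": ["Jared Goff", "Cam Newton"] * 2,
                   "Pos": ["CPTN", "CPTN", "QB", "WR"],
                   "Target_Score": [float("nan"), float("nan"), 33.2, 30.7]})

cptn_selector = df["Pos"] == "CPTN"

# Series indexed by player name, holding 1.5x the non-CPTN score
name_to_cptn_score = (
    df.loc[~cptn_selector]
      .set_index("Player")["Target_Score"]
      .mul(1.5)
)

# Look up each CPTN player's scaled score by name and write it back
df.loc[cptn_selector, "Target_Score"] = df.loc[cptn_selector, "Player"].map(name_to_cptn_score)
print(df)
```

CPTN players with no non-CPTN row (or whose regular score is NaN, like the kickers in the question) simply stay NaN.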
