
How to create a new dataframe that contains the value changes from multiple columns between two existing dataframes

I am looking at football player development over a five-year period.

I have two dataframes (DFs): one contains all 20-year-old strikers from FIFA 17 and the other contains all 25-year-old strikers from FIFA 22. I want to create a third DF that contains the attribute changes for each player. There are about 30 columns, one per attribute, e.g. tackling, shooting, passing, etc. So I want the new DF to contain +3 for tackling, +2 for shooting, +6 for passing, and so on.

The best way of solving this that I can think of is by merging the two DFs and then applying a function to every column that gives the difference between the x and y values, which represent the FIFA 17 and FIFA 22 data respectively.

Any tips much appreciated. Thank you.

You can subtract one pandas.DataFrame from another directly. Consider the following simple example:

import pandas as pd
df1 = pd.DataFrame({'X':[1,2],'Y':[3,4]})
df2 = pd.DataFrame({'X':[10,20],'Y':[30,40]})
dfdiff = df2 - df1
print(dfdiff)

gives output

    X   Y
0   9  27
1  18  36
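The example above works because both frames share the same index. As a minimal sketch of what happens when they do not, the toy frames below use a hypothetical player id as the index (the ids and values are made up for illustration). Subtraction aligns rows by index label, so ids present in only one frame come out as NaN.

```python
import pandas as pd

# Hypothetical toy data: the index stands in for a player id.
old = pd.DataFrame({'X': [1, 2], 'Y': [3, 4]}, index=[101, 102])
new = pd.DataFrame({'X': [10, 30], 'Y': [30, 50]}, index=[101, 103])

# Subtraction aligns on index labels; only id 101 exists in both frames.
diff = new - old

# Rows for ids 102 and 103 are all NaN, so dropna leaves just id 101.
common = diff.dropna()
print(common)
```

So if the result looks full of NaN, it is worth checking that both frames are indexed by the same key before subtracting.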

I have found a solution but it is very tedious as it requires a line of code for each and every attribute.

I'm simply assigning a new column for each attribute change. So for Passing, for instance, the code is:

# _x is the FIFA 17 value and _y the FIFA 22 value, so the change is y - x
mergedDF = mergedDF.assign(PassingChange = mergedDF.Passing_y - mergedDF.Passing_x)
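If you want to stay with the merged frame, the per-attribute lines can probably be collapsed into one step. The sketch below assumes the merge produced `_x` (FIFA 17) and `_y` (FIFA 22) suffixes as described in the question. The small `mergedDF` here is hypothetical data for illustration. It computes `_y - _x`, so an improvement shows up as a positive number.

```python
import pandas as pd

# Hypothetical merged frame: _x holds FIFA 17 values, _y holds FIFA 22 values.
mergedDF = pd.DataFrame({
    'short_name': ['A', 'B'],
    'Passing_x': [60, 70], 'Passing_y': [66, 69],
    'Shooting_x': [55, 64], 'Shooting_y': [57, 70],
})

# Recover the attribute names from the columns that end in _x.
attrs = [c[:-2] for c in mergedDF.columns if c.endswith('_x')]

# Build every <attr>Change column at once and add them in a single assign call.
changes = {
    f'{a}Change': mergedDF[f'{a}_y'] - mergedDF[f'{a}_x'] for a in attrs
}
mergedDF = mergedDF.assign(**changes)
print(mergedDF[['short_name', 'PassingChange', 'ShootingChange']])
```

This keeps the naming scheme from the single-column version (`PassingChange`) while avoiding one line of code per attribute.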

As stated, use the difference of the dataframes. I suspect the result is not ALL NaN values: you only get NaN for rows where the same player isn't in both FIFA 17 and FIFA 22.

When I do it, only 533 players appear in both FIFA 17 and FIFA 22 (20 years old in FIFA 17 and 25 in FIFA 22).

Here's an example:

import pandas as pd

fifa17 = pd.read_csv('D:/test/fifa/players_17.csv')
fifa17 = fifa17[fifa17['age'] == 20]
fifa17 = fifa17.set_index('sofifa_id')

fifa22 = pd.read_csv('D:/test/fifa/players_22.csv')
fifa22 = fifa22[fifa22['age'] == 25]
fifa22 = fifa22.set_index('sofifa_id')

compareCols = ['pace', 'shooting', 'passing', 'dribbling', 'defending', 
               'physic', 'attacking_crossing', 'attacking_finishing', 
               'attacking_heading_accuracy', 'attacking_short_passing', 
               'attacking_volleys', 'skill_dribbling', 'skill_curve', 
               'skill_fk_accuracy', 'skill_long_passing', 
               'skill_ball_control', 'movement_acceleration', 
               'movement_sprint_speed', 'movement_agility', 
               'movement_reactions', 'movement_balance', 'power_shot_power', 
               'power_jumping', 'power_stamina', 'power_strength', 
               'power_long_shots', 'mentality_aggression', 
               'mentality_interceptions', 'mentality_positioning', 
               'mentality_vision', 'mentality_penalties', 
               'mentality_composure', 'defending_marking_awareness', 
               'defending_standing_tackle', 'defending_sliding_tackle']

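# Subtraction aligns rows on the sofifa_id index
df = fifa22[compareCols] - fifa17[compareCols]

# Drop rows with NaN, which removes players missing from either file,
# then attach each player's name
df = df.dropna(axis=0)
df = pd.merge(df, fifa22[['short_name']], how='left', left_index=True, right_index=True)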

Output:

print(df)
           pace  shooting  ...  defending_sliding_tackle     short_name
sofifa_id                  ...                                         
205291     -1.0       0.0  ...                       3.0     H. Stengel
205988     -7.0       3.0  ...                      -1.0        L. Shaw
206086      0.0       8.0  ...                       5.0     H. Toffolo
206113     -2.0      21.0  ...                      -2.0      S. Gnabry
206463     -3.0       8.0  ...                       3.0     J. Dudziak
        ...       ...  ...                       ...            ...
236311     -2.0      -1.0  ...                      18.0         M. Rog
236393      2.0       5.0  ...                       0.0   Marc Cardona
236415      3.0       1.0  ...                       9.0      R. Alfani
236441     10.0      31.0  ...                      18.0      F. Bustos
236458      1.0       0.0  ...                       5.0  A. Poungouras

[533 rows x 36 columns]
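The differences above print as floats (e.g. `-1.0`), most likely because the subtraction introduces NaN for unmatched players and NaN forces a float dtype. One possible variation, sketched here on hypothetical miniature frames rather than the real CSV files, is to restrict both frames to the shared `sofifa_id` values first. That also gives the count of common players directly and, when the inputs are integers with no missing values, keeps the result as integers.

```python
import pandas as pd

# Hypothetical miniature stand-ins for the two filtered frames, indexed by sofifa_id.
fifa17 = pd.DataFrame({'pace': [70, 80, 65]},
                      index=pd.Index([1, 2, 3], name='sofifa_id'))
fifa22 = pd.DataFrame({'pace': [72, 79, 90]},
                      index=pd.Index([2, 3, 4], name='sofifa_id'))

# Ids that appear in both editions.
common_ids = fifa17.index.intersection(fifa22.index)
print(len(common_ids))

# Subtract only the shared rows, so no NaN is introduced by alignment.
diff = fifa22.loc[common_ids] - fifa17.loc[common_ids]
print(diff)
```

With the real data, attributes that are already missing in the source files would still be NaN, so a `dropna` may still be needed afterwards.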
