
Pandas - Applying Function to every other row

I have a data frame, and what I am trying to do is put the scores of both teams in a game on the same row. I have tried using a lambda function, but have had no success with it. The data frame I currently have is the first one below, and I would like to create a dataset in the form of the second one. Thanks.


GameId      Team        Home    Score
1           Spirit      1       81
1           Rockers     0       66
2           Lightning   1       73
2           Flames      0       82


Game ID Home Team   Away Team   Home Score  Away Score
1       Spirit      Rockers     81          66
2       Lightning   Flames      73          82

Try this:

Input:

import pandas as pd

raw_df = pd.DataFrame({"GameId": [1, 1, 2, 2],
                       "Team": ["Spirit", "Rockets", "Lighting", "Flames"],
                       "Home": [1, 0, 1, 0],
                       "Score": [81, 66, 73, 82]})
print(raw_df)

Output:

   GameId      Team  Home  Score
0       1    Spirit     1     81
1       1   Rockets     0     66
2       2  Lighting     1     73
3       2    Flames     0     82

Input:

# Relabel the 1/0 flag so it can become part of the final column names
raw_df["Home"] = raw_df["Home"].map({
        1: "Home",
        0: "Away"
    })

# Each (GameId, Home) cell holds exactly one row, so "first" just picks that value
result = raw_df.pivot_table(index=["GameId"],
                            columns=["Home"],
                            values=["Team", "Score"],
                            aggfunc={"Team": "first",
                                     "Score": "first"})

# Put Team before Score and Home before Away, then flatten the column tuples
result = result.sort_index(axis="columns", level=[0, "Home"], ascending=False)
result.columns = [' '.join(reversed(col)) for col in result.columns]
print(result)

Output:

       Home Team Away Team  Home Score  Away Score
GameId                                            
1         Spirit   Rockets          81          66
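2       Lighting    Flames          73          82

import pandas as pd

df = pd.DataFrame({'GameId': [1, 1, 2, 2],
                   'Team': ['Spirit', 'Rockers', 'Lightning', 'Flames'],
                   'Home': [1, 0, 1, 0],
                   'Score': [81, 66, 73, 82]})
# Self-join on GameId; the overlapping columns get the _x / _y suffixes
merge = pd.merge(df, df, on='GameId')
# Keep only the rows that pair a home side with an away side
merge = merge[merge['Home_x'] != merge['Home_y']]
# Keep the first pairing per game (the home row comes first, so _x is the home team)
merge = merge.drop_duplicates(subset=['GameId'])
merge = merge[['GameId', 'Team_x', 'Team_y', 'Score_x', 'Score_y']]
merge.columns = ['GameId', 'Home Team', 'Away Team', 'Home Score', 'Away Score']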

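Explanation: pd.merge() performs a self-join on GameId. After this, I remove the rows where both sides carry the same Home flag (a team paired with itself), drop duplicates on GameId so that only the first pairing per game remains, and then select the required columns and rename them. This relies on each game's home row being listed before its away row, so that the _x columns hold the home team.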
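
The same join can be written so that it does not depend on row order, by splitting the home and away rows before merging. This is only a sketch of that variation, not the answer's own code. It assumes Home is always 1 or 0 and that every game has exactly one row of each; the names home, away and games are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "GameId": [1, 1, 2, 2],
    "Team": ["Spirit", "Rockers", "Lightning", "Flames"],
    "Home": [1, 0, 1, 0],
    "Score": [81, 66, 73, 82],
})

# Split the rows by side; the flag column is no longer needed afterwards
home = df[df["Home"] == 1].drop(columns="Home")
away = df[df["Home"] == 0].drop(columns="Home")

# Join the halves on GameId; the suffixes label the overlapping columns.
# validate raises if a game has more than one home or away row (assumption check).
games = home.merge(away, on="GameId", suffixes=(" Home", " Away"),
                   validate="one_to_one")

# Reword "Team Home" as "Home Team" and so on, then fix the column order
games = games.rename(columns={
    "Team Home": "Home Team",
    "Team Away": "Away Team",
    "Score Home": "Home Score",
    "Score Away": "Away Score",
})
games = games[["GameId", "Home Team", "Away Team", "Home Score", "Away Score"]]
print(games)
```

Because the side is decided by the Home flag instead of by position, shuffling the input rows should not change which team ends up in the Home columns.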

First use .pivot, then use a list comprehension to rename the columns from tuples to the desired names (the columns are tuples because Home was set as a column level when pivoting). [::-1] reverses each tuple, e.g. from Team Home to Home Team, before the tuple is joined in the list comprehension.

df = pd.pivot(df, columns='Home', values=['Team', 'Score'], index='GameId').reset_index()
# Strip after joining: GameId's second level is empty and would otherwise leave a leading space
df.columns = [' '.join(str(s).replace('1', 'Home').replace('0', 'Away') for s in col[::-1]).strip() for col in df.columns]

Output:

    GameId  Away Team   Home Team   Away Score  Home Score
0   1       Rockers     Spirit      66          81
1   2       Flames      Lightning   82          73
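
One caveat with the chained .replace('1', ...) and .replace('0', ...) calls is that they would also rewrite any 1 or 0 that happens to appear inside a column name. A possible variation, sketched here under the assumption that the input has the same four columns as in the question, is to unpack each tuple and look the flag up in an explicit mapping. The side dict and the wide variable are hypothetical names, not part of the answer.

```python
import pandas as pd

df = pd.DataFrame({
    "GameId": [1, 1, 2, 2],
    "Team": ["Spirit", "Rockers", "Lightning", "Flames"],
    "Home": [1, 0, 1, 0],
    "Score": [81, 66, 73, 82],
})

# Hypothetical helper mapping from the Home flag to a label
side = {1: "Home", 0: "Away"}

wide = df.pivot(index="GameId", columns="Home", values=["Team", "Score"])

# Each column is a (value name, flag) tuple such as ("Team", 1)
wide.columns = [f"{side[flag]} {name}" for name, flag in wide.columns]

# Rename before reset_index so GameId is never part of the tuple handling
wide = wide.reset_index()
print(wide)
```

Flattening the columns before calling reset_index means the GameId column never passes through the renaming step, so no stripping is needed.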
