简体   繁体   中英

Map Counter Object to DataFrame to create new column

Problem


I have a data set with played football games and I'm trying to look more closely into the games played by each team at home. Therefore, I am trying to create a column for every team with a counter that tracks the consecutive wins. The following logic should apply:

  • If win: increase counter += 1
  • If tie: keep counter
  • If loss: decrease counter -= 1
df[['Date', 'Home', 'Home result']].head()

|      | Date     | Home           | Home result |
| ---- | -------- | -------------- | ----------- |
| 0    | 1/1/2018 | Stoke City     | Defeat      |
| 1    | 1/1/2018 | Brighton       | Tie         |
| 2    | 1/1/2018 | Burnley        | Defeat      |
| 3    | 1/1/2018 | Leicester City | Victory     |

I created a function that outputs a dictionary that can be mapped to the DataFrame (sorted by Date) to create the new counter column.


    def result_counter(df):
        d = {}
        for index, row in df.iterrows():
            counter = 0
            for second_index, second_row in df.iloc[:index+1].iterrows(): 
                if second_row['Home result'] == 'Victory':
                    counter += 1
                elif second_row['Home result'] == 'Tie':
                    counter += 0
                else:
                    counter -= 1
            if second_row['Home'] not in d:
                d[second_row['Home']] = [counter]
            else:
                d[second_row['Home']].append(counter)
                
        return d

Expected Result


A new column should be created that indicates the consecutive score of the home team. The expected result for a subset should look like this:

|      | Date       | Home      | Home result | Counter |
| ---- | ---------- | --------- | ----------- | ------- |
| 0    | 2017-08-19 | Liverpool | Victory     | 1       |
| 1    | 2017-08-27 | Liverpool | Victory     | 2       |
| 2    | 2017-09-16 | Liverpool | Tie         | 2       |
| 3    | 2017-10-14 | Liverpool | Defeat      | 1       |

What I tried


I tried to map the dict to the new DataFrame. When doing this I either get a map object displayed or the whole list that corresponds to the key in the dictionary.

# with subset and map 
sorted_liverpool['Counter'] = map(sorted_liverpool, d)
sorted_liverpool[['Date', 'Home', 'Home result', 'Counter']].head()

|      | Date       | Home      | Home result | Counter                        |
| ---- | ---------- | --------- | ----------- | ------------------------------ |
| 0    | 2017-08-19 | Liverpool | Victory     | <map object at 0x7fc0650fe550> |
| 1    | 2017-08-27 | Liverpool | Victory     | <map object at 0x7fc0650fe550> |
| 2    | 2017-09-16 | Liverpool | Tie         | <map object at 0x7fc0650fe550> |
| 3    | 2017-10-14 | Liverpool | Defeat      | <map object at 0x7fc0650fe550> |
# with subset and map directly to "home" column
sorted_liverpool['Counter'] = sorted_liverpool['Home'].map(d)
sorted_liverpool[['Date', 'Home', 'Home result', 'Counter']].head()

|      | Date       | Home      | Home result | Counter                                           |
| ---- | ---------- | --------- | ----------- | ------------------------------------------------- |
| 0    | 2017-08-19 | Liverpool | Victory     | [1, 2, 2, 2, 3, 4, 4, 4, 4, 5, 6, 7, 7, 8, 9, ... |
| 1    | 2017-08-27 | Liverpool | Victory     | [1, 2, 2, 2, 3, 4, 4, 4, 4, 5, 6, 7, 7, 8, 9, ... |
| 2    | 2017-09-16 | Liverpool | Tie         | [1, 2, 2, 2, 3, 4, 4, 4, 4, 5, 6, 7, 7, 8, 9, ... |
| 3    | 2017-10-14 | Liverpool | Defeat      | [1, 2, 2, 2, 3, 4, 4, 4, 4, 5, 6, 7, 7, 8, 9, ... |

How can I map the created dictionary to the DataFrame so it displays only the corresponding value in the list and not the whole list?

Avoid loop, especially iterrows when you can.

You need to map the Home result to respective value, then cumsum:

df['Counter'] = df['Home result'].map({
   'Victory': 1,
   'Tie': 0,
   'Defeat': -1
}).cumsum()

Also, you might need to groupby on Home :

df['Counter'] = df['Home result'].map({
       'Victory': 1,
       'Tie': 0,
       'Defeat': -1
    }).groupby(df['Home']).cumsum()

Output:

         Date       Home Home result  Counter
0  2017-08-19  Liverpool     Victory        1
1  2017-08-27  Liverpool     Victory        2
2  2017-09-16  Liverpool         Tie        2
3  2017-10-14  Liverpool      Defeat        1
print(df)

        Date        Home Home result
0 2017-08-19   Liverpool     Victory
1 2017-08-27   Liverpool     Victory
2 2017-09-16   Liverpool         Tie
3 2017-10-14   Liverpool      Defeat
4 2017-08-20  Stoke City      Defeat
5 2017-08-28  Stoke City         Tie
6 2017-09-17  Stoke City      Defeat
7 2017-10-15  Stoke City     Victory

    df['new column'] = df['Home result'].map({'Victory': 1, 'Tie': 0, 'Defeat': -1})
    df['new column'] = df.groupby('Home')['new column'].transform(pd.Series.cumsum)
    print(df)

        Date        Home Home result  new column
0 2017-08-19   Liverpool     Victory           1
1 2017-08-27   Liverpool     Victory           2
2 2017-09-16   Liverpool         Tie           2
3 2017-10-14   Liverpool      Defeat           1
4 2017-08-20  Stoke City      Defeat          -1
5 2017-08-28  Stoke City         Tie          -1
6 2017-09-17  Stoke City      Defeat          -2
7 2017-10-15  Stoke City     Victory          -1

try group by home and accumulative sum cumsum the results of an apply logic

txt="""Date,Home,Home Result
1/1/2018,Stoke City,Defeat
1/1/2018,Brighton,Tie
1/1/2018,Burnley,Defeat
1/1/2018,Leicester City,Victory
2017-08-19,Liverpool,Victory
2017-08-27,Liverpool,Victory
2017-09-16,Liverpool,Tie
2017-10-14,Liverpool,Defeat"""

df = pd.read_table(StringIO(txt), sep=',')    
df['Date']=pd.to_datetime(df['Date'])
df['Outcome']=df['Home Result'].apply(lambda x: -1 if x=='Defeat' else 0 if x=='Tie' else 1 if x=='Victory' else 0)
df=df.sort_values(by=['Home','Date'],ascending=True)

df['Date_Total'] = df.groupby(['Home'])['Outcome'].cumsum()
print(df)

output

 Date            Home Home Result  Outcome  Date_Total
1 2018-01-01        Brighton         Tie        0           0
2 2018-01-01         Burnley      Defeat       -1          -1
3 2018-01-01  Leicester City     Victory        1           1
4 2017-08-19       Liverpool     Victory        1           1
5 2017-08-27       Liverpool     Victory        1           2
6 2017-09-16       Liverpool         Tie        0           2
7 2017-10-14       Liverpool      Defeat       -1           1
0 2018-01-01      Stoke City      Defeat       -1          -1

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM