Add new column to Dataframe by looking up values in Dictionary

I have a Pandas dataframe with sport results per tournament as follows (simplified):

Tournament  WinnerName  LoserName
t1          A           X
t1          B           Y
t1          C           Y
t2          A           X
t2          B           Y
t2          C           Y

In a dictionary I have information about the players' ranks per tournament:

Tournament  Player  Rank
t1          A       1
t1          B       7
t1          C       70
t2          A       11
t2          B       1
t2          C       100

Now I want to know how often per tournament the winner of a match is ranked in one of these categories: a) between 1 and 10, b) between 11 and 49, c) greater than 49.

So the result could either look like this:

Tournament  WinnerName  LoserName   Group
t1          A           X           a
t1          B           Y           a
t1          C           Y           c
t2          A           X           b
t2          B           Y           a
t2          C           Y           c

or like this:

Tournament  WinnerName  LoserName   GroupA  GroupB  GroupC
t1          A           X           1       0       0
t1          B           Y           1       0       0
t1          C           Y           0       0       1
t2          A           X           0       1       0
t2          B           Y           1       0       0
t2          C           Y           0       0       1

After that I can easily count the occurrences per column. But I am currently stuck on producing either of the two results. I suspect it can be done with apply or transform, but I don't know exactly how. Or is there an even better way to achieve this?

Thank you.

Starting from the Rank column (the Series r below), you can bin the values with pd.cut and then one-hot encode the bins with pd.get_dummies:

In [11]: r
Out[11]:
0      1
1      7
2     70
3     11
4      1
5    100
Name: Rank, dtype: int64

In [12]: pd.cut(r, [0, 10, 49, 100], include_lowest=True)
Out[12]:
0      [0, 10]
1      [0, 10]
2    (49, 100]
3     (10, 49]
4      [0, 10]
5    (49, 100]
Name: Rank, dtype: category
Categories (3, object): [[0, 10] < (10, 49] < (49, 100]]

In [13]: pd.get_dummies(pd.cut(r, [0, 10, 49, 100], include_lowest=True))
Out[13]:
   [0, 10]  (10, 49]  (49, 100]
0        1         0          0
1        1         0          0
2        0         0          1
3        0         1          0
4        1         0          0
5        0         0          1

Now you can join these columns back onto your original DataFrame, for example with join or merge.
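To make the whole path from lookup to counts concrete, here is a minimal sketch. It assumes the rank dictionary is keyed by (tournament, player) tuples, which the question does not specify; the names matches, ranks, rank_df, merged and counts are made up for illustration. It also uses an open-ended last bin and the labels a/b/c rather than the [0, 10, 49, 100] bins shown above, so that ranks above 100 still fall into group c.

```python
import pandas as pd

# Match results, as in the question.
matches = pd.DataFrame({
    "Tournament": ["t1", "t1", "t1", "t2", "t2", "t2"],
    "WinnerName": ["A", "B", "C", "A", "B", "C"],
    "LoserName": ["X", "Y", "Y", "X", "Y", "Y"],
})

# Assumption: the ranks live in a dict keyed by (tournament, player).
ranks = {
    ("t1", "A"): 1, ("t1", "B"): 7, ("t1", "C"): 70,
    ("t2", "A"): 11, ("t2", "B"): 1, ("t2", "C"): 100,
}

# Turn the dict into a DataFrame so the lookup becomes an ordinary merge.
rank_df = pd.DataFrame(
    [(t, p, r) for (t, p), r in ranks.items()],
    columns=["Tournament", "WinnerName", "Rank"],
)
merged = matches.merge(rank_df, on=["Tournament", "WinnerName"], how="left")

# Bin the winner's rank: (0, 10] -> a, (10, 49] -> b, (49, inf] -> c.
merged["Group"] = pd.cut(
    merged["Rank"], bins=[0, 10, 49, float("inf")], labels=["a", "b", "c"]
)

# One indicator column per group, then count per tournament.
dummies = pd.get_dummies(merged["Group"], prefix="Group", dtype=int)
result = merged.drop(columns="Rank").join(dummies)
counts = result.groupby("Tournament")[["Group_a", "Group_b", "Group_c"]].sum()

print(result)
print(counts)
```

The merge replaces a row-wise apply, so the lookup stays vectorized. If only the counts are needed, the Group column alone is enough and the indicator columns can be skipped.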

