Compare elements of two pandas data frame columns and create a new column based on a third column

Question

I have two dataframes:

df1:

| ID | PersonID | Sex |
|:--:|:--------:|:---:|
|  1 |    123   |  M  |
|  2 |    124   |  F  |
|  3 |    125   |  F  |
|  4 |    126   |  F  |
|  5 |    127   |  M  |
|  6 |    128   |  M  |
|  7 |    129   |  F  |

df2:

| ID | PersonID | Infected |
|:--:|:--------:|:--------:|
|  1 |    125   |   True   |
|  2 |    124   |   False  |
|  3 |    126   |   False  |
|  4 |    128   |   True   |

I'd like to match the PersonID values across the two dataframes and add the corresponding Infected value to df1, using False where a PersonID has no match in df2. Ideally the output would look like this:

df1:

| ID | PersonID | Sex | Infected |
|:--:|:--------:|:---:|:--------:|
|  1 |    123   |  M  |   False  |
|  2 |    124   |  F  |   False  |
|  3 |    125   |  F  |   True   |
|  4 |    126   |  F  |   False  |
|  5 |    127   |  M  |   False  |
|  6 |    128   |  M  |   True   |
|  7 |    129   |  F  |   False  |

I have coded a for loop, but it takes too long and is not very readable. Is there a more efficient way to do this? Thanks!

Answer 1

One approach is to pass df1['PersonID'].map() a Series whose index is PersonID and whose values are Infected:

df1['Infected'] = df1['PersonID'].map(df2.set_index('PersonID')['Infected']).fillna(False)
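As a minimal, self-contained sketch of the map approach, the snippet below rebuilds the two frames from the question and applies the lookup. The name `lookup` and the trailing `.astype(bool)` are additions for illustration, not part of the original answer. It also assumes each PersonID appears at most once in df2, since a lookup Series with duplicate index values is ambiguous.

```python
import pandas as pd

# Frames rebuilt from the tables in the question.
df1 = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6, 7],
    "PersonID": [123, 124, 125, 126, 127, 128, 129],
    "Sex": ["M", "F", "F", "F", "M", "M", "F"],
})
df2 = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "PersonID": [125, 124, 126, 128],
    "Infected": [True, False, False, True],
})

# Lookup Series: index is PersonID, values are Infected.
# Assumption: PersonID is unique in df2.
lookup = df2.set_index("PersonID")["Infected"]

# map() yields NaN for PersonIDs missing from the lookup; fill those with
# False, then cast so the column ends up with a real boolean dtype
# (the cast is an addition to the answer's one-liner).
df1["Infected"] = df1["PersonID"].map(lookup).fillna(False).astype(bool)

print(df1)
```

Unlike a merge, this leaves df1's other columns and its index untouched, because only the new column is assigned.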

Answer 2

Another approach is to use pd.merge with a left join, which keeps every row of df1:

df1 = pd.merge(df1, df2[['PersonID', 'Infected']], on=['PersonID'], how='left').fillna(False)

Or, equivalently, as a DataFrame method:

df1 = df1.merge(df2[['PersonID', 'Infected']], on=['PersonID'], how='left').fillna(False)
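One thing to watch with the merge one-liners: calling .fillna(False) on the whole result fills missing values in every column, not just Infected. The sketch below is a slightly more cautious variant under that concern. The name `merged`, the column-only fill, and the `validate="many_to_one"` check are additions for illustration rather than part of the original answer.

```python
import pandas as pd

# Frames rebuilt from the tables in the question.
df1 = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6, 7],
    "PersonID": [123, 124, 125, 126, 127, 128, 129],
    "Sex": ["M", "F", "F", "F", "M", "M", "F"],
})
df2 = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "PersonID": [125, 124, 126, 128],
    "Infected": [True, False, False, True],
})

# Left join keeps every row of df1. Selecting only PersonID and Infected
# from df2 avoids a clash between the two ID columns.
# validate="many_to_one" raises if PersonID is duplicated in df2, which
# would otherwise silently multiply rows of df1.
merged = df1.merge(
    df2[["PersonID", "Infected"]],
    on="PersonID",
    how="left",
    validate="many_to_one",
)

# Fill only the Infected column so missing values elsewhere are left alone,
# then cast to a real boolean dtype.
merged["Infected"] = merged["Infected"].fillna(False).astype(bool)

print(merged)
```

If df1 is known to contain no other missing values, the shorter whole-frame .fillna(False) from the answer gives the same result here.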

2 answers

solution1
1 ACCEPTED 2019-03-14 03:01:50

solution2
0 2019-03-14 06:36:35
