简体   繁体   中英

Compare elements of two pandas data frame columns and create a new column based on a third column

I have two dataframes:

df1:

| ID | PersonID | Sex |
|:--:|:--------:|:---:|
|  1 |    123   |  M  |
|  2 |    124   |  F  |
|  3 |    125   |  F  |
|  4 |    126   |  F  |
|  5 |    127   |  M  |
|  6 |    128   |  M  |
|  7 |    129   |  F  |

df2:
| ID | PersonID | Infected |
|:--:|:--------:|:--------:|
|  1 |    125   |   True   |
|  2 |    124   |   False  |
|  3 |    126   |   False  |
|  4 |    128   |   True   |

I'd like to compare the person IDs in both these dataframes and insert the corresponding Infected value into df1 and False if the personID is not matched. The output would ideally look like this:

df1:
| ID | PersonID | Sex | Infected |
|:--:|:--------:|:---:|:--------:|
|  1 |    123   |  M  |   False  |
|  2 |    124   |  F  |   False  |
|  3 |    125   |  F  |   True   |
|  4 |    126   |  F  |   False  |
|  5 |    127   |  M  |   False  |
|  6 |    128   |  M  |   True   |
|  7 |    129   |  F  |   False  |

I have a for loop coded and it takes too long and is not very readable. Is there an efficient way to do this? Thanks!

一种方法是为df1['PersonID'].map()提供一个Series,该Series的索引为PersonID并且值被Infected

df1['Infected'] = df1['PersonID'].map(df2.set_index('PersonID')['Infected']).fillna(False)

Another approach is to use pd.merge

df1 = pd.merge(df1, df2[['PersonID', 'Infected']], on=['PersonID'], how='left').fillna(False)

Or

df1 = df1.merge(df2[['PersonID', 'Infected']], on=['PersonID'], how='left').fillna(False)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM