简体   繁体   中英

Fill column of a dataframe from another dataframe

I'm trying to fill a column of a dataframe from another dataframe based on conditions. Let's say my first dataframe is df1 and the second is named df2.

# df1 is described as bellow :
+------+------+
| Col1 | Col2 |
+------+------+
|   A  |  1   |
|   B  |  2   |
|   C  |  3   |
|   A  |  1   |
+------+------+

And

# df2 is described as bellow :
+------+------+
| Col1 | Col2 |
+------+------+
|   A  |  NaN |
|   B  |  NaN |
|   D  |  NaN |
+------+------+

Each distinct value of Col1 has her an id number (In Col2), so what I want is to fill the NaN values in df2.Col2 where df2.Col1==df1.Col1 . So that my second dataframe will look like :

# df2 :
+------+------+
| Col1 | Col2 |
+------+------+
|   A  |  1   |
|   B  |  2   |
|   D  |  NaN |
+------+------+

I'm using Python 2.7

Use drop_duplicates with set_index and combine_first :

df = df2.set_index('Col1').combine_first(df1.drop_duplicates().set_index('Col1')).reset_index()

If need check dupes only in id column:

df = df2.set_index('Col1').combine_first(df1.drop_duplicates().set_index('Col1')).reset_index()

Here is a solution with the filter df1.Col1 == df2.Col1

df2['Col2'] = df1[df1.Col1 == df2.Col1]['Col2']

It is even better to use loc (but less clear from my point of view)

df2['Col2'] = df1.loc[df1.Col1 == df2.Col2, 'Col2']

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM