I have two data frames, df1 and df2. df1 has columns A and B (named b in the example below), and B contains missing values. df2 has columns A and B, and it only contains records for rows whose B is missing in df1. I want to replace the missing values of B in df1 with the corresponding B values from df2, matched on A.
EDIT:
import numpy as np
import pandas as pd
df1 = pd.DataFrame({'A': [1,2,3,4,5,6,7,8], 'b': [101,123,np.nan,678,np.nan,672,np.nan,786], 'C': ['ABC', 'DER', 'ERC','DFE','HJI','JKL','SDH',np.nan]})
df2 = pd.DataFrame({'A': [3,7], 'B': [563,785]})
Desired output:
op = pd.DataFrame({'A': [1,2,3,4,5,6,7,8], 'b': [101,123,563,678,np.nan,672,785,786], 'C': ['ABC', 'DER', 'ERC','DFE','HJI','JKL','SDH',np.nan]})
Use pd.merge to left-merge the dataframes df1 and df2 on column A, then use Series.fillna to fill the missing values in column b from the values in column B:
df = pd.merge(df1, df2, on='A', how='left')
df['b'] = df['b'].fillna(df.pop('B'))
Result:
# print(df)
   A      b    C
0  1  101.0  ABC
1  2  123.0  DER
2  3  563.0  ERC
3  4  678.0  DFE
4  5    NaN  HJI
5  6  672.0  JKL
6  7  785.0  SDH
7  8  786.0  NaN
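As an alternative that avoids the temporary merged column, the same fill can be sketched with Series.map. This is only a sketch: it assumes each value of A appears at most once in df2 (map against a Series with duplicate index values raises an error), and the names lookup and filled are illustrative, not part of the question.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    'A': [1, 2, 3, 4, 5, 6, 7, 8],
    'b': [101, 123, np.nan, 678, np.nan, 672, np.nan, 786],
    'C': ['ABC', 'DER', 'ERC', 'DFE', 'HJI', 'JKL', 'SDH', np.nan],
})
df2 = pd.DataFrame({'A': [3, 7], 'B': [563, 785]})

# Lookup table (illustrative name): index is the key A, values are B.
# Assumption: A is unique in df2.
lookup = df2.set_index('A')['B']

# Map each key in df1 to its replacement (NaN where df2 has no entry),
# then use that value only where b is currently missing.
filled = df1.copy()
filled['b'] = filled['b'].fillna(filled['A'].map(lookup))

print(filled)
```

Rows with A equal to 3 and 7 take their values from df2, while the row with A equal to 5 stays NaN because df2 has no entry for it. Unlike the merge, this keeps the index and row order of df1 untouched.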