
How can I replace missing values in a dataframe with values from another dataframe?

I have two dataframes of different shapes. I want to fill in missing data in my df1 from data that exists in df2.

How do I join these two datasets while keeping the original shape and columns of df1?

I have tried using pd.merge, but I don't think I am getting the syntax right. I have created new columns in the dataframe, but I'm not able to only add data to the NaN values.

I have also tried using combine_first, but I don't think I'm doing that right either.

import pandas as pd

df1 = pd.DataFrame({'a': ["dogs","cats","birds","turtles"], 'b': [1,5,"NA",10]})
print(df1)

df2 = pd.DataFrame({'a': ["birds"],'b': [6]})
print(df2)

df_Final = pd.DataFrame({'a': ["dogs","cats","birds","turtles"], 'b': [1,5,6,10]})
print(df_Final)

I expect the output to be the df_Final dataframe shown here, where the "birds" value is populated from df2.

fuelbaby

How about this?

df1['b'] = df1['b'].where(df1['b'] != 'NA', df1['a'].map(df2.set_index('a')['b']))

Out[166]: 
         a   b
0     dogs   1
1     cats   5
2    birds   6
3  turtles  10
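
If it helps, here is a slightly more general sketch of the same idea. It assumes the "NA" entries are placeholder strings, as in the question, and that each key in df2['a'] appears only once (otherwise the lookup through set_index is ambiguous). The names b, lookup, result and alt are only illustrative. It first converts the placeholder to a real NaN, then fills just those gaps from df2. The second part shows one way combine_first could be used, which the question mentions trying.

```python
import pandas as pd

df1 = pd.DataFrame({'a': ["dogs", "cats", "birds", "turtles"], 'b': [1, 5, "NA", 10]})
df2 = pd.DataFrame({'a': ["birds"], 'b': [6]})

# Turn the "NA" placeholder string into a real missing value (NaN).
b = pd.to_numeric(df1['b'], errors='coerce')

# Look up each key of df1 in df2; keys that are absent from df2 give NaN.
lookup = df1['a'].map(df2.set_index('a')['b'])

# Fill only the missing entries. astype(int) assumes every gap was filled;
# it raises if any NaN is still left.
result = df1.assign(b=b.fillna(lookup).astype(int))
print(result)

# Alternative with combine_first: align both frames on the key column,
# let df2 fill the gaps, then restore df1's row order and shape.
alt = (
    df1.assign(b=b)
       .set_index('a')
       .combine_first(df2.set_index('a'))
       .reindex(df1['a'])   # keeps only df1's keys, in df1's order
       .reset_index()
)
print(alt)
```

Both versions leave df1 itself unchanged and keep its rows and columns. The combine_first variant returns column b as floats, because the column held NaN before it was filled.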
