
Replace values in python pandas column based on second df

I have gone through all the similar questions on Stack Overflow, but the solutions still don't work for me.

I have two dfs:

df1:
User_ID |    Code_1
123           htrh
345           NaN
567           cewr
...

df2:
User_ID |    Code_2
123           ert
345           nad

I want to replace df1.Code_1 with df2.Code_2 based on User_ID. Please note that df2 is a subset of df1's user_ids.

I tried this

df1['Code_1'] = df1['User_ID'].replace(df2.set_index('User_ID')['Code_2'])

and I tried this

df1.loc[df1.User_ID.isin(df2.User_ID), ['Code_1']] = df2[['Code_2']]

Neither worked; nothing changed.

Expected Output:

df1:
    User_ID |    Code_1
    123           ert
    345           nad
    567           cewr
    ...

Thank You

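Use DataFrame.update. It aligns rows on the index and columns on their names, so set User_ID as the index on both dataframes and give the code columns (Code_1, Code_2) the same name before calling it.

df1 = df1.set_index('User_ID')
df1.update(df2.set_index('User_ID').rename(columns={'Code_2': 'Code_1'}))
df1 = df1.reset_index()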

That should be enough for your case. For other uses, consult the documentation.
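A self-contained sketch of the update approach follows. The row order here is deliberately shuffled (an assumption for illustration, not the asker's data layout) to show that update matches rows by index label, which is why both frames are indexed by User_ID first.

```python
import numpy as np
import pandas as pd

# Sample data from the question, with rows shuffled on purpose.
df1 = pd.DataFrame({"User_ID": [567, 123, 345],
                    "Code_1": ["cewr", "htrh", np.nan]})
df2 = pd.DataFrame({"User_ID": [345, 123],
                    "Code_2": ["nad", "ert"]})

# Index both frames by User_ID and give the code columns one shared name,
# so update can align rows by User_ID and columns by name.
indexed = df1.set_index("User_ID")
replacement = df2.set_index("User_ID").rename(columns={"Code_2": "Code_1"})

# update modifies `indexed` in place and returns None.
indexed.update(replacement)

result = indexed.reset_index()
print(result)
```

Because update only overwrites with non-NA values, users missing from df2 (567 here) keep their original Code_1.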

You can use combine_first, which returns a Series indexed by User_ID:

df2.set_index('User_ID').Code_2.combine_first(df1.set_index('User_ID').Code_1)


User_ID
123     ert
345     nad
567    cewr
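Since combine_first gives back a Series rather than the updated df1, one possible way to get a dataframe shaped like the expected output is sketched below. The names `combined` and `result` are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question.
df1 = pd.DataFrame({"User_ID": [123, 345, 567],
                    "Code_1": ["htrh", np.nan, "cewr"]})
df2 = pd.DataFrame({"User_ID": [123, 345],
                    "Code_2": ["ert", "nad"]})

# Values from df2 take priority; gaps are filled from df1.
combined = (df2.set_index("User_ID")["Code_2"]
               .combine_first(df1.set_index("User_ID")["Code_1"]))

# Rename the Series to Code_1 and turn the User_ID index back into a column.
result = combined.rename("Code_1").reset_index()
print(result)
```

Note that the result follows the combined index, so its row order may not match df1's original order.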

You can use pd.Series.map + pd.Series.fillna.

df1['Code_1'] = df1['User_ID'].map(df2.set_index('User_ID')['Code_2'])\
                              .fillna(df1['Code_1'])

print(df1)

#    User_ID Code_1
# 0      123    ert
# 1      345    nad
# 2      567   cewr

The idea is to look up each User_ID in df2 when you perform the mapping, then fill with the original values, aligned on df1's index, wherever no mapping exists in df2.
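To make the two steps visible, here is the same approach split up with the intermediate result kept. The variable names `lookup` and `mapped` are illustrative, and the sketch assumes User_ID is unique in df2, since mapping through a Series with duplicate index labels is expected to fail.

```python
import numpy as np
import pandas as pd

# Sample data from the question.
df1 = pd.DataFrame({"User_ID": [123, 345, 567],
                    "Code_1": ["htrh", np.nan, "cewr"]})
df2 = pd.DataFrame({"User_ID": [123, 345],
                    "Code_2": ["ert", "nad"]})

# Step 1: build a User_ID -> Code_2 lookup and map it onto df1's IDs.
# IDs absent from df2 (567 here) come back as NaN.
lookup = df2.set_index("User_ID")["Code_2"]
mapped = df1["User_ID"].map(lookup)

# Step 2: fill those gaps with the original codes. Both Series share
# df1's index, so the fill lines up row by row.
df1["Code_1"] = mapped.fillna(df1["Code_1"])
print(df1)
```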


 