
Using loc and map to change values in multiple columns

I have two DataFrames with the same column names, and each row has a unique Import_ID. I want to copy the missing values in two columns from df1 into df2, matching rows on Import_ID.

I already did this for one column and it works fine, but I want to do it for two columns at the same time. For one column I wrote:

df2.loc[(numpy.isnan(df2['DeliveryNoteNo']))& (~numpy.isnan(df1['DeliveryNoteNo'])), 'DeliveryNoteNo'] = df2['Import_ID'].map(df1.set_index('Import_ID')['DeliveryNoteNo'])

This works fine, so I wanted to do the same for two columns, so that every time I update df2 it also records the date of the update.

I tried this, but it only raises an error: "TypeError: 'DataFrame' object is not callable"

df2.loc[(numpy.isnan(df2.InvoiceNo))& (~numpy.isnan(df1['InvoiceNo'])), ['InvoiceNo','Modified_Date']] = df2['Import_ID'].map(df1.set_index('Import_ID')[['InvoiceNo', 'Modified_Date']])

For example, df1:

     InvoiceNo     OrderNo  DeliveryNoteNo     Modified_Date   Import_ID
0   950094591.0  7027514279    1.000000e+00  23-08-2019 14:30  7027514279_100
1   950094591.0  7027514279    2.000000e+00  23-08-2019 14:30  7027514279_100.1
2                7027514279                  23-08-2019 14:30  7027514279_100.2

df2:

     InvoiceNo     OrderNo  DeliveryNoteNo     Modified_Date   Import_ID
0                7027514279    1.000000e+00  21-08-2019 14:30  7027514279_100
1   950094591.0  7027514279                  21-08-2019 14:30  7027514279_100.1
2                7027514279                  21-08-2019 14:30  7027514279_100.2

df2 should later look like this:

     InvoiceNo     OrderNo  DeliveryNoteNo     Modified_Date   Import_ID
0   950094591.0  7027514279    1.000000e+00  23-08-2019 14:30  7027514279_100
1   950094591.0  7027514279    2.000000e+00  23-08-2019 14:30  7027514279_100.1
2                7027514279                  21-08-2019 14:30  7027514279_100.2

Try this

df2.set_index('Import_ID').combine_first(df1.set_index('Import_ID')).reset_index(drop=True)

Output:

        InvoiceNo     OrderNo  DeliveryNoteNo     Modified_Date
0     950094591.0  7027514279             1.0  21-08-2019 14:30
1     950094591.0  7027514279             2.0  21-08-2019 14:30
2             NaN  7027514279             NaN  21-08-2019 14:30
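Note that combine_first leaves Modified_Date at df2's original value, as the output above shows, because that column is never missing in df2. The TypeError in the question arises because Series.map accepts a function, a dict, or a Series, so a two-column DataFrame is treated as a callable. Below is a minimal sketch of one way to update the value and the date together, assuming Import_ID is unique in df1 and that the date should change only on rows where a value was actually filled in. The sample frames mirror the question; the names src and mask are illustrative.

```python
import numpy as np
import pandas as pd

ids = ["7027514279_100", "7027514279_100.1", "7027514279_100.2"]
df1 = pd.DataFrame({
    "InvoiceNo": [950094591.0, 950094591.0, np.nan],
    "OrderNo": [7027514279] * 3,
    "DeliveryNoteNo": [1.0, 2.0, np.nan],
    "Modified_Date": ["23-08-2019 14:30"] * 3,
    "Import_ID": ids,
})
df2 = pd.DataFrame({
    "InvoiceNo": [np.nan, 950094591.0, np.nan],
    "OrderNo": [7027514279] * 3,
    "DeliveryNoteNo": [1.0, np.nan, np.nan],
    "Modified_Date": ["21-08-2019 14:30"] * 3,
    "Import_ID": ids,
})

# Look up df1's rows in df2's order (assumes Import_ID is unique in df1),
# then reuse df2's index so boolean masks line up row by row.
src = df1.set_index("Import_ID").reindex(df2["Import_ID"])
src.index = df2.index

for col in ["InvoiceNo", "DeliveryNoteNo"]:
    # Rows where df2 is missing the value and df1 can supply it
    mask = df2[col].isna() & src[col].notna()
    df2.loc[mask, col] = src.loc[mask, col]
    # Carry over df1's date only for the rows that were just filled
    df2.loc[mask, "Modified_Date"] = src.loc[mask, "Modified_Date"]

print(df2)
```

On the sample data this fills InvoiceNo in row 0 and DeliveryNoteNo in row 1, moving both dates to 23-08-2019, while row 2 keeps its original date because df1 has nothing to supply.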

Have you tried a non-exhaustive mapping, combining map with fillna?

Basically, you first create two dictionaries from the columns of df1, keyed by Import_ID and holding the values you want to place in the two columns of df2:

dictionary_1 = dict(zip(df1['Import_ID'], df1['DeliveryNoteNo']))
dictionary_2 = dict(zip(df1['Import_ID'], df1['InvoiceNo']))

Then use these dictionaries to update df2, passing the original df2 column to fillna so that the original value is kept wherever map finds no match and returns NaN:

df2['DeliveryNoteNo'] = df2['Import_ID'].map(dictionary_1).fillna(df2['DeliveryNoteNo'])

Do the same for the second column to update:

df2['InvoiceNo'] = df2['Import_ID'].map(dictionary_2).fillna(df2['InvoiceNo'])

The fillna call stops the column from receiving NaN where map finds no match (map returns NaN, not False, for a missing key): any row whose Import_ID is not a key in the dictionary keeps its existing value.
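One caveat with the order used above: the mapped df1 value takes priority, so an existing df2 value is overwritten whenever df1 has a value for the same Import_ID. The following is a minimal sketch of a variant that swaps the order, assuming the goal is to fill only the gaps in df2. The sample IDs and numbers are made up for illustration, and lookup and mapped are hypothetical names.

```python
import numpy as np
import pandas as pd

# Made-up sample data: "c" exists only in df1, "d" only in df2
df1 = pd.DataFrame({
    "Import_ID": ["a", "b", "c"],
    "InvoiceNo": [10.0, 20.0, np.nan],
    "DeliveryNoteNo": [1.0, 2.0, np.nan],
})
df2 = pd.DataFrame({
    "Import_ID": ["a", "b", "d"],
    "InvoiceNo": [np.nan, 99.0, np.nan],
    "DeliveryNoteNo": [1.0, np.nan, np.nan],
})

for col in ["InvoiceNo", "DeliveryNoteNo"]:
    # Import_ID -> value lookup built from df1
    lookup = dict(zip(df1["Import_ID"], df1[col]))
    # IDs missing from the lookup map to NaN
    mapped = df2["Import_ID"].map(lookup)
    # Keep df2's own value where present; use df1's only to fill gaps
    df2[col] = df2[col].fillna(mapped)

print(df2)
```

Here row "b" keeps its own InvoiceNo of 99.0 rather than taking 20.0 from df1, and row "d" stays NaN because df1 has no such Import_ID.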

Hope this helps :)).


 