I have two DataFrames with the same column names, and each row has a unique Import_ID. I want to copy the missing values in two columns from df1 into df2, matched on Import_ID.
I already did this for one column and it works fine, but I want to do it for two columns at the same time. For one column I wrote:
df2.loc[(numpy.isnan(df2['DeliveryNoteNo']))& (~numpy.isnan(df1['DeliveryNoteNo'])), 'DeliveryNoteNo'] = df2['Import_ID'].map(df1.set_index('Import_ID')['DeliveryNoteNo'])
This works fine, so I wanted to do the same for two columns, so that every time I update df2 it also records the date of the update.
I tried this, but it only raises an error: "TypeError: 'DataFrame' object is not callable"
df2.loc[(numpy.isnan(df2.InvoiceNo))& (~numpy.isnan(df1['InvoiceNo'])), ['InvoiceNo','Modified_Date']] = df2['Import_ID'].map(df1.set_index('Import_ID')[['InvoiceNo', 'Modified_Date']])
For example, df1:
     InvoiceNo     OrderNo  DeliveryNoteNo     Modified_Date         Import_ID
0  950094591.0  7027514279    1.000000e+00  23-08-2019 14:30    7027514279_100
1  950094591.0  7027514279    2.000000e+00  23-08-2019 14:30  7027514279_100.1
2          NaN  7027514279             NaN  23-08-2019 14:30  7027514279_100.2
df2:
     InvoiceNo     OrderNo  DeliveryNoteNo     Modified_Date         Import_ID
0          NaN  7027514279    1.000000e+00  21-08-2019 14:30    7027514279_100
1  950094591.0  7027514279             NaN  21-08-2019 14:30  7027514279_100.1
2          NaN  7027514279             NaN  21-08-2019 14:30  7027514279_100.2
df2 should later look like this:
     InvoiceNo     OrderNo  DeliveryNoteNo     Modified_Date         Import_ID
0  950094591.0  7027514279    1.000000e+00  23-08-2019 14:30    7027514279_100
1  950094591.0  7027514279    2.000000e+00  23-08-2019 14:30  7027514279_100.1
2          NaN  7027514279             NaN  21-08-2019 14:30  7027514279_100.2
Try combine_first, which fills the missing values in df2 with the values df1 has for the same Import_ID:
df2.set_index('Import_ID').combine_first(df1.set_index('Import_ID')).reset_index(drop=True)
Output:
InvoiceNo OrderNo DeliveryNoteNo Modified_Date
0 950094591.0 7027514279 1.0 21-08-2019 14:30
1 950094591.0 7027514279 2.0 21-08-2019 14:30
2 NaN 7027514279 NaN 21-08-2019 14:30
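Note that in the output above Modified_Date keeps the df2 value, because combine_first only fills cells that are missing. If the date should change only on rows that actually received a value, one possible sketch is below. The helper name fill_with_date and its parameters are hypothetical, not part of pandas, and the sample frames are rebuilt from the question under the assumption that the blank cells are NaN.

```python
import numpy as np
import pandas as pd

ids = ["7027514279_100", "7027514279_100.1", "7027514279_100.2"]

# Sample data rebuilt from the question (blank cells assumed to be NaN).
df1 = pd.DataFrame({
    "InvoiceNo": [950094591.0, 950094591.0, np.nan],
    "OrderNo": [7027514279] * 3,
    "DeliveryNoteNo": [1.0, 2.0, np.nan],
    "Modified_Date": ["23-08-2019 14:30"] * 3,
    "Import_ID": ids,
})
df2 = pd.DataFrame({
    "InvoiceNo": [np.nan, 950094591.0, np.nan],
    "OrderNo": [7027514279] * 3,
    "DeliveryNoteNo": [1.0, np.nan, np.nan],
    "Modified_Date": ["21-08-2019 14:30"] * 3,
    "Import_ID": ids,
})


def fill_with_date(target, source, column,
                   key="Import_ID", date_col="Modified_Date"):
    """Hypothetical helper: fill missing `column` values in `target` from
    `source` (matched on `key`) and copy `date_col` for the filled rows."""
    result = target.copy()
    lookup = source.set_index(key)  # requires unique keys in `source`
    new_values = result[key].map(lookup[column])
    new_dates = result[key].map(lookup[date_col])
    # Only rows where target is empty and source has something to offer.
    mask = result[column].isna() & new_values.notna()
    result.loc[mask, column] = new_values[mask]
    result.loc[mask, date_col] = new_dates[mask]
    return result


updated = fill_with_date(df2, df1, "InvoiceNo")
updated = fill_with_date(updated, df1, "DeliveryNoteNo")
print(updated)
```

Mapping each column as a separate Series also avoids the TypeError from the question: Series.map accepts a dict, a Series or a function, so a two-column DataFrame passed to it is treated as a function and called.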
Have you tried combining a non-exhaustive mapping via map with fillna?
Basically, you first need to create two dictionaries from the column values in df1, holding the values you want to place in the two columns of df2:
dictionary_1 = dict(zip(df1['Import_ID'], df1['DeliveryNoteNo']))
dictionary_2 = dict(zip(df1['Import_ID'], df1['InvoiceNo']))
Then you use these dictionaries to update df2, with fillna falling back to the original value in the df2 column wherever map finds no match (map returns NaN there):
df2['DeliveryNoteNo'] = df2['Import_ID'].map(dictionary_1).fillna(df2['DeliveryNoteNo'])
Do the same for the second column, this time with the second dictionary:
df2['InvoiceNo'] = df2['Import_ID'].map(dictionary_2).fillna(df2['InvoiceNo'])
The fillna call makes sure your column does not receive a NaN value when map finds no match. In other words, an existing value in df2 is left untouched if its ID is not among the keys of the dictionary.
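A small runnable sketch of the map plus fillna pattern follows. The toy frames and the names invoice_by_id, prefer_df1 and prefer_df2 are made up for illustration. One thing it shows is that the order matters: mapping first and then calling fillna lets df1 overwrite values that df2 already has, while calling fillna on the df2 column only fills the gaps, which seems closer to what the question asks for.

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical): "d" exists only in df2, "c" has no value in df1.
df1 = pd.DataFrame({
    "Import_ID": ["a", "b", "c"],
    "InvoiceNo": [10.0, 20.0, np.nan],
})
df2 = pd.DataFrame({
    "Import_ID": ["a", "b", "c", "d"],
    "InvoiceNo": [np.nan, 99.0, np.nan, 5.0],
})

# Dictionary lookup from ID to value, as in the answer.
invoice_by_id = dict(zip(df1["Import_ID"], df1["InvoiceNo"]))

# map returns NaN for IDs missing from the dictionary ("d")
# and for IDs whose dictionary value is NaN ("c").
mapped = df2["Import_ID"].map(invoice_by_id)

# Order used in the answer: df1 wins wherever it has a value.
prefer_df1 = mapped.fillna(df2["InvoiceNo"])

# Reversed order: df2 keeps its existing values, only gaps are filled.
prefer_df2 = df2["InvoiceNo"].fillna(mapped)

print(pd.DataFrame({"prefer_df1": prefer_df1, "prefer_df2": prefer_df2}))
```

In this toy data the two variants differ only on row "b", where df2 already holds 99.0 and df1 holds 20.0.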
Hope this helps :)).