I want to add a new column dfout['EXCHANGE_RATIO'] whose values are taken from another dataframe (dfc['EXCHANGE_RATIO']), but only for rows where dfout['CURRENCY'] != 'EUR'. For those rows, I look up the currency in dfc['CURRENCY_SOURCE'] and take the value of dfc['EXCHANGE_RATIO'] from that same row.
dfout looks like this:
DATE_PROCESS BOOKING_ID DEP_AIRPORT ARR_AIRPORT DEPARTURE_DATE ARRIVAL_DATE PRICE CURRENCY
0 2013-04-19 16:04:13 UTC 76969972 AEL DEL 2013-04-18 00:00:00 NaN 409.04 EUR
1 2014-04-17 02:26:46 UTC 76888867 ARP ZAL 2014-04-19 00:00:00 NaN 280.70 EUR
dfc looks like this:
CURRENCY_SOURCE CURRENCY_TARGET EXCHANGE_RATIO
0 TRL EUR 9.900000e-08
1 VES EUR 3.220000e-07
I've tried these two ways, and both raise SyntaxError: invalid syntax. Why?
dfout['EXCHANGE_RATIO'] = dfout['CURRENCY'].apply(lambda x: dfc.query('CURRENCY_SOURCE'==x)['EXCHANGE_RATIO'] if x != 'EUR')
dfout['EXCHANGE_RATIO'] = dfout['CURRENCY'].apply(lambda x: dfc.loc[dfc['CURRENCY_SOURCE'] == x, 'EXCHANGE_RATIO'].iloc[-1] if x != 'EUR')
You can use the map method:
dfout['EXCHANGE_RATIO'] = dfout['CURRENCY'] \
.map(dict(zip(dfc['CURRENCY_SOURCE'], dfc['EXCHANGE_RATIO'])))
For example, with a dfout like this:
DATE_PROCESS BOOKING_ID DEP_AIRPORT ARR_AIRPORT DEPARTURE_DATE ARRIVAL_DATE PRICE CURRENCY
0 2013-04-19 16:04:13 UTC 76969972 AEL DEL 2013-04-18 00:00:00 NaN 409.04 EUR
1 2014-04-17 02:26:46 UTC 76888867 ARP ZAL 2014-04-19 00:00:00 NaN 280.70 EUR
2 2014-04-17 02:26:46 UTC 76888867 ARP ZAL 2014-04-19 00:00:00 NaN 280.70 TRL
3 2014-04-17 02:26:46 UTC 76888867 ARP ZAL 2014-04-19 00:00:00 NaN 280.70 VES
you would get the following output:
DATE_PROCESS BOOKING_ID DEP_AIRPORT ARR_AIRPORT DEPARTURE_DATE ARRIVAL_DATE PRICE CURRENCY EXCHANGE_RATIO
0 2013-04-19 16:04:13 UTC 76969972 AEL DEL 2013-04-18 00:00:00 NaN 409.04 EUR NaN
1 2014-04-17 02:26:46 UTC 76888867 ARP ZAL 2014-04-19 00:00:00 NaN 280.70 EUR NaN
2 2014-04-17 02:26:46 UTC 76888867 ARP ZAL 2014-04-19 00:00:00 NaN 280.70 TRL 9.900000e-08
3 2014-04-17 02:26:46 UTC 76888867 ARP ZAL 2014-04-19 00:00:00 NaN 280.70 VES 3.220000e-07
If you want to replace those NaN values, you can use fillna():
dfout['EXCHANGE_RATIO'] = dfout['CURRENCY'] \
.map(dict(zip(dfc['CURRENCY_SOURCE'], dfc['EXCHANGE_RATIO']))) \
.fillna(1) # or whatever you want there
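As a self-contained illustration of the map approach above, here is a minimal sketch. The two frames are toy reconstructions containing only the columns that matter here, and the fill value of 1.0 for EUR is an assumption (it treats EUR as already being in the target currency); pick whatever default suits your data.

```python
import pandas as pd

# Toy frames mirroring the example above (only the relevant columns).
dfout = pd.DataFrame({
    "PRICE": [409.04, 280.70, 280.70, 280.70],
    "CURRENCY": ["EUR", "EUR", "TRL", "VES"],
})
dfc = pd.DataFrame({
    "CURRENCY_SOURCE": ["TRL", "VES"],
    "CURRENCY_TARGET": ["EUR", "EUR"],
    "EXCHANGE_RATIO": [9.9e-08, 3.22e-07],
})

# Build a plain currency -> ratio lookup table from dfc.
rates = dict(zip(dfc["CURRENCY_SOURCE"], dfc["EXCHANGE_RATIO"]))

# map() returns NaN for currencies missing from the lookup (here: EUR);
# fillna(1.0) then replaces those NaN values with the assumed default.
dfout["EXCHANGE_RATIO"] = dfout["CURRENCY"].map(rates).fillna(1.0)

print(dfout)
```

Because map is vectorized over the whole column, this avoids running a query per row, which tends to matter once dfout gets large.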
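I corrected the query syntax and added an else branch to make your code work. A conditional expression (a if cond else b) always needs the else part, which is why both of your attempts raise a SyntaxError. Inside the query string, you also have to reference the local variable x with the @ prefix.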
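import numpy as np
# .iloc[0] takes the first match by position; a plain [0] looks up index label 0 and fails for the VES row (label 1)
dfout['EXCHANGE_RATIO'] = dfout['CURRENCY'].apply(lambda x: dfc.query('CURRENCY_SOURCE==@x')['EXCHANGE_RATIO'].iloc[0] if x != 'EUR' else np.nan)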
dfout - input
DATE_PROCESS BOOKING_ID DEP_AIRPORT ARR_AIRPORT DEPARTURE_DATE ARRIVAL_DATE PRICE CURRENCY
0 2013-04-19 16:04:13 UTC 76969972 AEL DEL 2013-04-18 00:00:00 NaN 409.04 EUR
1 2014-04-17 02:26:46 UTC 76888867 ARP ZAL 2014-04-19 00:00:00 NaN 280.70 EUR
2 2014-04-17 02:26:46 UTC 76888867 ARP ZAL 2014-04-19 00:00:00 NaN 280.70 TRL
dfc
CURRENCY_SOURCE CURRENCY_TARGET EXCHANGE_RATIO
0 TRL EUR 9.900000e-08
1 VES EUR 3.220000e-07
dfout - Output
DATE_PROCESS BOOKING_ID DEP_AIRPORT ARR_AIRPORT DEPARTURE_DATE ARRIVAL_DATE PRICE CURRENCY EXCHANGE_RATIO
0 2013-04-19 16:04:13 UTC 76969972 AEL DEL 2013-04-18 00:00:00 NaN 409.04 EUR NaN
1 2014-04-17 02:26:46 UTC 76888867 ARP ZAL 2014-04-19 00:00:00 NaN 280.70 EUR NaN
2 2014-04-17 02:26:46 UTC 76888867 ARP ZAL 2014-04-19 00:00:00 NaN 280.70 TRL 9.900000e-08
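To make the two points of this answer concrete, here is a rough sketch: it first checks that a conditional expression without else does not compile, then runs the apply/query lookup on toy frames. The helper name lookup_ratio is hypothetical, and returning NaN for currencies that are absent from dfc is an assumption that goes slightly beyond the one-liner above.

```python
import numpy as np
import pandas as pd

# Toy frames with only the columns needed for the lookup.
dfout = pd.DataFrame({"CURRENCY": ["EUR", "EUR", "TRL", "VES"]})
dfc = pd.DataFrame({
    "CURRENCY_SOURCE": ["TRL", "VES"],
    "CURRENCY_TARGET": ["EUR", "EUR"],
    "EXCHANGE_RATIO": [9.9e-08, 3.22e-07],
})

# 1) "a if cond" without "else b" is not valid Python, so it fails to compile.
try:
    compile("lambda x: 1 if x != 'EUR'", "<sketch>", "eval")
    missing_else_is_error = False
except SyntaxError:
    missing_else_is_error = True

# 2) Hypothetical helper doing the same lookup as the lambda, spelled out.
def lookup_ratio(code):
    """Return the ratio for `code`, or NaN for EUR and unknown currencies."""
    if code == "EUR":
        return np.nan
    # @code refers to the local Python variable from inside the query string.
    matches = dfc.query("CURRENCY_SOURCE == @code")["EXCHANGE_RATIO"]
    # .iloc[0] is positional, so it works whatever the row's index label is.
    return matches.iloc[0] if not matches.empty else np.nan

dfout["EXCHANGE_RATIO"] = dfout["CURRENCY"].apply(lookup_ratio)
print(missing_else_is_error)
print(dfout)
```

This runs one query per row, so it is mainly useful for understanding the fix; for larger frames a single vectorized lookup is usually the better choice.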