
Python: In Pandas extract data from several columns in a dataframe based on a condition and add to different dataframe matching on a column

I have a large data set in the following format:

import pandas as pd
df1 = pd.DataFrame({'Date':['01/02/2020' , '01/03/2020', '01/04/2020','01/02/2020' , '01/03/2020', '01/04/2020','01/02/2020' , '01/03/2020', '01/04/2020'],
           'From': ['RU', 'RU', 'RU','USA', 'USA', 'USA','ME', 'ME', 'ME'],
           'To': ['JK', 'JK', 'JK','JK', 'JK', 'JK','JK', 'JK', 'JK'],
           'Distance':[ 40000, 40000, 40000,30000, 30000, 30000,20000, 20000, 20000],
           'Days': [8,8,8,6,6,6,4,4,4]})

df2 = pd.DataFrame({'Date':['01/02/2020' , '01/03/2020', '01/04/2020','01/02/2020' , '01/03/2020', '01/04/2020','01/02/2020' , '01/03/2020', '01/04/2020'],
           'Contract': ['OrangeTier', 'OrangeTier', 'OrangeTier','AppleTier', 'AppleTier', 'AppleTier','GrapeTier', 'GrapeTier', 'GrapeTier'],
           'Price':[ 10000, 15000, 20000,30000, 35000, 1000,45000, 20000, 21000]})

I would like to add a column to df1 that looks up the Contract 'OrangeTier' in df2, matches the dates in df1 with those in df2, and returns the price. The resulting dataframe should look something like this:

df1 = pd.DataFrame({'Date':['01/02/2020' , '01/03/2020', '01/04/2020','01/02/2020' , '01/03/2020', '01/04/2020','01/02/2020' , '01/03/2020', '01/04/2020'],
           'From': ['RU', 'RU', 'RU','USA', 'USA', 'USA','ME', 'ME', 'ME'],
           'To': ['JK', 'JK', 'JK','JK', 'JK', 'JK','JK', 'JK', 'JK'],
           'Distance':[ 40000, 40000, 40000,30000, 30000, 30000,20000, 20000, 20000],
           'OrangeTier':[10000, 15000, 20000,10000, 15000, 20000,10000, 15000, 20000],
           'Days': [8,8,8,6,6,6,4,4,4]})

I then want to multiply OrangeTier by Days and overwrite the OrangeTier column with the result.

Let's try:

mapper = df2.query('Contract == "OrangeTier"').set_index(['Date'])['Price']

df1['OrangeTier'] = df1['Date'].map(mapper)

# assign returns a new DataFrame, so rebind df1 to actually overwrite the column
df1 = df1.assign(OrangeTier=df1['OrangeTier'] * df1['Days'])
print(df1)

Output:

         Date From  To  Distance  Days  OrangeTier
0  01/02/2020   RU  JK     40000     8       80000
1  01/03/2020   RU  JK     40000     8      120000
2  01/04/2020   RU  JK     40000     8      160000
3  01/02/2020  USA  JK     30000     6       60000
4  01/03/2020  USA  JK     30000     6       90000
5  01/04/2020  USA  JK     30000     6      120000
6  01/02/2020   ME  JK     20000     4       40000
7  01/03/2020   ME  JK     20000     4       60000
8  01/04/2020   ME  JK     20000     4       80000
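
For comparison, here is a minimal sketch of the same lookup done with a left merge instead of `map`. The helper name `add_contract_cost` and its parameters are hypothetical, made up for illustration. It assumes each date appears at most once per contract in df2; `validate='many_to_one'` makes pandas raise an error if that assumption does not hold, where `map` on a non-unique index would also fail.

```python
import pandas as pd

# Same sample data as in the question, written more compactly.
df1 = pd.DataFrame({
    'Date': ['01/02/2020', '01/03/2020', '01/04/2020'] * 3,
    'From': ['RU'] * 3 + ['USA'] * 3 + ['ME'] * 3,
    'To': ['JK'] * 9,
    'Distance': [40000] * 3 + [30000] * 3 + [20000] * 3,
    'Days': [8] * 3 + [6] * 3 + [4] * 3,
})
df2 = pd.DataFrame({
    'Date': ['01/02/2020', '01/03/2020', '01/04/2020'] * 3,
    'Contract': ['OrangeTier'] * 3 + ['AppleTier'] * 3 + ['GrapeTier'] * 3,
    'Price': [10000, 15000, 20000, 30000, 35000, 1000, 45000, 20000, 21000],
})


def add_contract_cost(trips, prices, contract):
    """Hypothetical helper: add a column named after `contract` holding Price * Days."""
    # Keep only the Date/Price rows for the requested contract.
    selected = prices.loc[prices['Contract'] == contract, ['Date', 'Price']]
    # A left merge keeps every row of `trips` in its original order.
    # validate='many_to_one' raises if a date repeats within `selected`.
    merged = trips.merge(selected, on='Date', how='left', validate='many_to_one')
    # Dates without a matching price become NaN, so the product is NaN there too.
    merged[contract] = merged['Price'] * merged['Days']
    # Drop the helper column so only the final cost column remains.
    return merged.drop(columns='Price')


result = add_contract_cost(df1, df2, 'OrangeTier')
print(result)
```

With the sample data this should give the same OrangeTier values as the output above. Passing 'AppleTier' or 'GrapeTier' as the last argument would add a column for that contract instead.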

