
VLOOKUP in pandas between two dataframes to create a third dataframe

I want the equivalent of Excel's VLOOKUP to build a new dataframe. I have two dataframes and want to look up each value in df1's column A against df2's columns A and B, returning the matching value from df2's column A.

The cell beside it should hold the result of the same lookup, but returning the matching value from df2's column B.

The data looks like this (in both dataframes the data is in columns A and B respectively):

Current output

    Dataframe 1          Dataframe 2

    AC1      AC2         AC10        AC20
    Bus      5           car         1
    car      3           helicopter  7
    Walking  2           running     5

Desired/expected output

    Dataframe [new]

    NaN      NaN
    car      1
    NaN      NaN

I have tried:

dfz = df1.insert(2, '2A2', df1['AC1'].map(df2.set_index('AC1')['2A2']))
print (dfz)

result = left.join(right, on=['AC2', 'AC1'], how='inner')
#left.join(right, lsuffix='_l', rsuffix='_r')

#df1.join(df1.set_index('AC2')['AC1'], on='AC2')

I have had some success with:

df8 = df1['AC3'] = df1.AC1.map(df2.AC10)
print (df8)


df8 = df1['AC4'] = df1.AC1.map(df2.AC20)
print (df8)

The output is all NaN, so it is not correct.
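A likely reason for the NaN values: when Series.map is given a Series, it looks each value up in that Series's index, not in its values. df2.AC10 has the default index 0, 1, 2, so "Bus", "car" and "Walking" never match. The sketch below uses the sample data from the question rather than the Excel files, and the names by_position, lookup and by_key are made up for illustration. It assumes each AC10 value should map to a single AC20 value.

```python
import pandas as pd

df1 = pd.DataFrame([["Bus", 5], ["car", 3], ["Walking", 2]], columns=["AC1", "AC2"])
df2 = pd.DataFrame([["car", 1], ["helicopter", 7], ["running", 5]], columns=["AC10", "AC20"])

# map() looks up "Bus", "car", "Walking" in the index of df2.AC10 (0, 1, 2),
# so nothing matches and every result is NaN.
by_position = df1["AC1"].map(df2["AC10"])

# Build a lookup Series whose index is AC10 and whose values are AC20.
# drop_duplicates keeps the first row per key so the index is unique.
lookup = df2.drop_duplicates(subset="AC10").set_index("AC10")["AC20"]
by_key = df1["AC1"].map(lookup)

print(by_position)
print(by_key)
```

With the lookup keyed on AC10, only "car" finds a match; the other rows stay NaN, which is the VLOOKUP-style behaviour asked for.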

Example:

import datetime

import numpy as np
import pandas as pd

df1 = pd.read_excel('C:/Users/Desktop/zav.xlsx')

df2 = pd.read_excel('C:/Users/Desktop/zav2.xlsx')

#df3 = pd.merge(df, df2)
df3 = df1.join(df2)
print (df3)


todays_date = datetime.datetime.now().date()
index = pd.date_range(todays_date-datetime.timedelta(10), periods=10, freq='D')

df5 = pd.DataFrame(np.random.randint(low=0, high=10, size=(5, 5)),
                 columns=['a', 'b', 'c', 'd', 'e'])
print(df5)


df8 = df1['AC3'] = df1.AC1.map(df2.AC10)
print (df8)


df8 = df1['AC3'] = df1.AC1.map(df2.AC20)
print (df8)

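The following code works, using join rather than map. df2 is first reduced to one row per AC10 value and indexed by it, so that join can look up df1's AC1 values against that index: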

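import pandas as pd

df1 = pd.DataFrame([["Bus", 5], ["car", 3], ["Walking", 2]], columns=["AC1", "AC2"])

df2 = pd.DataFrame([["car", 1], ["helicopter", 7], ["running", 5]], columns=["AC10", "AC20"])

# Keep the first row per AC10 value; AC10 becomes the index
df2 = df2.groupby("AC10").first()

# Left join on AC1 against df2's index, then drop the unneeded AC2 column
df3 = df1.join(df2, on="AC1", how="left").drop("AC2", axis=1)
print(df3)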

It will output the following:

       AC1  AC20
0      Bus   NaN
1      car   1.0
2  Walking   NaN
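If the new dataframe should contain both df2 columns, as in the desired output of the question, a left merge is one possible alternative. This is only a sketch on the sample data; the names merged and df_new are made up for illustration, and it assumes AC10 has no duplicate values (duplicates would produce extra rows).

```python
import pandas as pd

df1 = pd.DataFrame([["Bus", 5], ["car", 3], ["Walking", 2]], columns=["AC1", "AC2"])
df2 = pd.DataFrame([["car", 1], ["helicopter", 7], ["running", 5]], columns=["AC10", "AC20"])

# A left merge keeps every row of df1 in its original order;
# rows with no match in df2 get NaN in the df2 columns.
merged = df1.merge(df2, left_on="AC1", right_on="AC10", how="left")

# Keep only the looked-up columns for the new dataframe.
df_new = merged[["AC10", "AC20"]]
print(df_new)
```

Here only the "car" row is filled in; the "Bus" and "Walking" rows are NaN in both columns.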
