
vlookup in pandas between 2 dataframes to create third dataframe

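I would like to use the equivalent of Excel's VLOOKUP to build a new dataframe. I have 2 dataframes and am trying to look up each value of df1 column A in df2 columns A and B and return value A.

The cell next to it should be the same lookup of df1 column A against df2 columns A and B, returning value B.

The data looks like this:

The data is in columns A and B of dataframes 1 and 2 respectively.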

Current output

    Data frame 1            Data frame 2

    AC1        AC2          AC10          AC20
    Bus        5            car           1
    car        3            helicopter    7
    Walking    2            running       5

Desired/expected output

    Dataframe [new]

    NaN        NaN
    car        1
    NaN        NaN

I have tried:

dfz = df1.insert(2, '2A2', df1['AC1'].map(df2.set_index('AC1')['2A2']))
print (dfz)

result = left.join(right, on=['AC2', 'AC1'], how='inner')
#left.join(right, lsuffix='_l', rsuffix='_r')

#df1.join(df1.set_index('AC2')['AC1'], on='AC2')

I have had some success with the following:

df8 = df1['AC3'] = df1.AC1.map(df2.AC10)
print (df8)


df8 = df1['AC4'] = df1.AC1.map(df2.AC20)
print (df8)

The actual output is all NaN, so it is not correct.

Example:

import datetime

import numpy as np
import pandas as pd

df1 = pd.read_excel('C:/Users/Desktop/zav.xlsx')

df2 = pd.read_excel('C:/Users/Desktop/zav2.xlsx')

#df3 = pd.merge(df, df2)
df3 = df1.join(df2)
print (df3)


todays_date = datetime.datetime.now().date()
index = pd.date_range(todays_date-datetime.timedelta(10), periods=10, freq='D')

df5 = pd.DataFrame(np.random.randint(low=0, high=10, size=(5, 5)),
                 columns=['a', 'b', 'c', 'd', 'e'])
print(df5)


df8 = df1['AC3'] = df1.AC1.map(df2.AC10)
print (df8)


df8 = df1['AC3'] = df1.AC1.map(df2.AC20)
print (df8)
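
A likely reason for the all-NaN result above, sketched here on the sample data and assuming df2 still has its default integer index: when Series.map is given a Series, it looks each value up in that Series's index, not in its values. df2.AC10 is indexed 0, 1, 2, so none of the strings in df1.AC1 is found. Indexing df2 by the lookup key first gives map something to match against. The names wrong and fixed are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame([["Bus", 5], ["car", 3], ["Walking", 2]], columns=["AC1", "AC2"])
df2 = pd.DataFrame([["car", 1], ["helicopter", 7], ["running", 5]], columns=["AC10", "AC20"])

# map() looks values up in the index of the Series it is given.
# df2["AC10"] is indexed 0, 1, 2, so no string key matches.
wrong = df1["AC1"].map(df2["AC10"])

# Use AC10 as the index so the strings in AC1 can be matched against it.
fixed = df1["AC1"].map(df2.set_index("AC10")["AC20"])

print(wrong)
print(fixed)
```

With this sample data, only the "car" row finds a match; the other rows stay NaN because their keys do not exist in df2.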

You can try the following code, which uses join:

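import pandas as pd

df1 = pd.DataFrame([["Bus", 5], ["car", 3], ["Walking", 2]], columns=["AC1", "AC2"])

df2 = pd.DataFrame([["car", 1], ["helicopter", 7], ["running", 5]], columns=["AC10", "AC20"])

# Keep one row per AC10 key and make AC10 the index to join on
df2 = df2.groupby("AC10").first()

# Left join keeps every df1 row; keys missing from df2 get NaN
df3 = df1.join(df2, on="AC1", how="left").drop("AC2", axis=1)
print(df3)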

It will output the following:

       AC1  AC20
0      Bus   NaN
1      car   1.0
2  Walking   NaN
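
As an alternative sketch (not part of the answer above), a left merge can return both df2 columns at once, which is closer to the two-column output the question asks for. The name df_new is made up for illustration, and dropping duplicate AC10 keys is an assumption about how repeated keys should be handled.

```python
import pandas as pd

df1 = pd.DataFrame([["Bus", 5], ["car", 3], ["Walking", 2]], columns=["AC1", "AC2"])
df2 = pd.DataFrame([["car", 1], ["helicopter", 7], ["running", 5]], columns=["AC10", "AC20"])

# Assumption: keep only the first row per AC10 key, like VLOOKUP's first match.
lookup = df2.drop_duplicates("AC10")

# Left merge keeps every df1 row in order; unmatched keys get NaN in df2's columns.
df_new = df1[["AC1"]].merge(lookup, left_on="AC1", right_on="AC10", how="left")

# Keep only the looked-up columns for the new dataframe.
df_new = df_new[["AC10", "AC20"]]
print(df_new)
```

Rows of df1 with no match in df2 come back as NaN in both columns, and the "car" row carries over both the key and its value.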
