vlookup in Pandas using join

I have the following two DataFrames:

Example1
sku loc flag  
122  61 True 
123  61 True
113  62 True 
122  62 True 
123  62 False
122  63 False
301  63 True 

Example2 
sku dept 
113 a
122 b
123 b
301 c 

I want to perform a merge or join operation using Pandas (or whichever Python tool is best) to produce the DataFrame below.

Example3
sku loc flag   dept  
122  61 True   b
123  61 True   b
113  62 True   a
122  62 True   b
123  62 False  b
122  63 False  b
301  63 True   c

Both 
df_Example1.join(df_Example2,lsuffix='_ProdHier')
df_Example1.join(df_Example2,how='outer',lsuffix='_ProdHier')

aren't working. What am I doing wrong?

Perform a left merge; this uses the sku column as the column to join on (here df holds Example1 and df1 holds Example2):

In [26]:

df.merge(df1, on='sku', how='left')
Out[26]:
   sku  loc   flag dept
0  122   61   True    b
1  122   62   True    b
2  122   63  False    b
3  123   61   True    b
4  123   62  False    b
5  113   62   True    a
6  301   63   True    c
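
For anyone who wants to reproduce the merge above, here is a minimal self-contained sketch. The frame names df and df1 follow the answer; the construction of the two frames is my own reconstruction from the tables in the question. On recent pandas versions a left merge returns rows in the order of the left frame, so the row order may differ from the older output shown above.

```python
import pandas as pd

# Rebuild the two frames from the question (Example1 and Example2).
df = pd.DataFrame({
    "sku": [122, 123, 113, 122, 123, 122, 301],
    "loc": [61, 61, 62, 62, 62, 63, 63],
    "flag": [True, True, True, True, False, False, True],
})
df1 = pd.DataFrame({
    "sku": [113, 122, 123, 301],
    "dept": ["a", "b", "b", "c"],
})

# Left merge: keep every row of df and look up dept by sku.
result = df.merge(df1, on="sku", how="left")
print(result)
```

Any sku in df that has no match in df1 would get NaN in the dept column rather than being dropped, which is the behaviour most people expect from a vlookup.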

If sku is in fact your index, then do this:

In [28]:

df.merge(df1, left_index=True, right_index=True, how='left')
Out[28]:
     loc   flag dept
sku                 
113   62   True    a
122   61   True    b
122   62   True    b
122   63  False    b
123   61   True    b
123   62  False    b
301   63   True    c
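
As for why the calls in the question did not work: DataFrame.join aligns on the index by default, so with two default integer indexes the rows are paired by row label instead of by sku. The sketch below shows that effect and one way to make join do the lookup, assuming sku is an ordinary column in both frames (the frames are rebuilt from the question's tables).

```python
import pandas as pd

df = pd.DataFrame({
    "sku": [122, 123, 113, 122, 123, 122, 301],
    "loc": [61, 61, 62, 62, 62, 63, 63],
    "flag": [True, True, True, True, False, False, True],
})
df1 = pd.DataFrame({
    "sku": [113, 122, 123, 301],
    "dept": ["a", "b", "b", "c"],
})

# The call from the question: rows are matched on the index labels 0..6
# versus 0..3, so only the first four rows receive a (wrong) dept.
broken = df.join(df1, lsuffix="_ProdHier")

# Put sku on the index of the lookup frame and tell join which column of
# the calling frame to match against that index.
joined = df.join(df1.set_index("sku"), on="sku")
print(joined)
```

The second call keeps the row order of df and adds a single dept column, which matches Example3.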

Another method is to use map. If you set sku as the index of your second DataFrame, then df1.dept is in effect a Series keyed by sku, and the code simplifies to this:

In [19]:

df['dept']=df.sku.map(df1.dept)
df
Out[19]:
   sku  loc   flag dept
0  122   61   True    b
1  123   61   True    b
2  113   62   True    a
3  122   62   True    b
4  123   62  False    b
5  122   63  False    b
6  301   63   True    c
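
A self-contained version of the map approach might look like the sketch below. It assumes each sku appears only once in df1, and the name lookup is just an illustrative variable; the value 999 is a made-up sku used to show what happens when there is no match.

```python
import pandas as pd

df = pd.DataFrame({
    "sku": [122, 123, 113, 122, 123, 122, 301],
    "loc": [61, 61, 62, 62, 62, 63, 63],
    "flag": [True, True, True, True, False, False, True],
})
df1 = pd.DataFrame({
    "sku": [113, 122, 123, 301],
    "dept": ["a", "b", "b", "c"],
})

# Build a Series whose index is sku and whose values are dept.
lookup = df1.set_index("sku")["dept"]

# map looks up each sku in the Series index and returns the matching dept.
df["dept"] = df["sku"].map(lookup)
print(df)

# A sku that is absent from the lookup (999 is hypothetical) maps to NaN.
missing = pd.Series([122, 999]).map(lookup)
```

Unlike merge, this adds the column in place and never changes the number or order of rows in df.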

A more generic application would be to use apply and lambda as follows:

import pandas as pd

dict1 = {113:'a',
         122:'b',
         123:'b',
         301:'c'}

df = pd.DataFrame([['1', 113],
                   ['2', 113],
                   ['3', 301],
                   ['4', 122],
                   ['5', 113]], columns=['num', 'num_letter'])

Add as a new DataFrame column:

df['letter'] = df['num_letter'].apply(lambda x: dict1[x])

  num  num_letter letter
0   1         113      a
1   2         113      a
2   3         301      c
3   4         122      b
4   5         113      a

Or replace the existing ('num_letter') column:

df['num_letter'] = df['num_letter'].apply(lambda x: dict1[x])

  num num_letter
0   1          a
1   2          a
2   3          c
3   4          b
4   5          a
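
One caveat with dict1[x] inside apply is that it raises a KeyError as soon as a value is missing from the dictionary. The sketch below shows two alternatives I would consider; the key 999 and the fallback string "unknown" are made up for illustration.

```python
import pandas as pd

dict1 = {113: "a", 122: "b", 123: "b", 301: "c"}

# 999 is a hypothetical key that is not present in dict1.
df = pd.DataFrame({"num": ["1", "2", "3"], "num_letter": [113, 301, 999]})

# Series.map accepts a dict directly; keys that are not found become NaN.
mapped = df["num_letter"].map(dict1)

# dict.get inside apply lets you choose an explicit fallback value instead.
with_default = df["num_letter"].apply(lambda x: dict1.get(x, "unknown"))

print(mapped.tolist())
print(with_default.tolist())
```

Passing the dict to map is usually the shorter spelling when a plain lookup is all that is needed; apply with a lambda is more useful when the function does something beyond a lookup.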

VLookup in VBA is just like pandas.DataFrame.merge.

In the past I always had to hunt for many VBA procedures; now the pandas DataFrame saves me a ton of work, and the good thing is that I don't need to write a vlookup method.

pandas.DataFrame.merge

>>> A              >>> B
    lkey value         rkey value
0   foo  1         0   foo  5
1   bar  2         1   bar  6
2   baz  3         2   qux  7
3   foo  4         3   bar  8
>>> A.merge(B, left_on='lkey', right_on='rkey', how='outer')
   lkey  value_x  rkey  value_y
0  foo   1        foo   5
1  foo   4        foo   5
2  bar   2        bar   6
3  bar   2        bar   8
4  baz   3        NaN   NaN
5  NaN   NaN      qux   7
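
To try the outer merge above yourself, something like the following should work. The indicator=True argument is my addition and is not part of the quoted example; it adds a _merge column that says whether each row came from the left frame, the right frame, or both. The row order of an outer merge can vary between pandas versions, so I have not reproduced the printed output here.

```python
import pandas as pd

A = pd.DataFrame({"lkey": ["foo", "bar", "baz", "foo"], "value": [1, 2, 3, 4]})
B = pd.DataFrame({"rkey": ["foo", "bar", "qux", "bar"], "value": [5, 6, 7, 8]})

# Outer merge keeps unmatched keys from both sides; the shared column name
# "value" is disambiguated with the default _x and _y suffixes.
merged = A.merge(B, left_on="lkey", right_on="rkey", how="outer", indicator=True)
print(merged)

# Rows that found no partner, which is what a failed vlookup looks like.
unmatched = merged[merged["_merge"] != "both"]
```

Filtering on _merge is a quick way to list the keys that exist in only one of the two tables.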

You can also try the following to do a left merge.

import pandas as pd
# left and right are your two DataFrames, each with a column named 'key'
pd.merge(left, right, left_on='key', right_on='key', how='left')

The how values outer and left behave like the corresponding SQL joins. The pandas DataFrame class has a merge method that takes many arguments, which makes it very detailed and handy.
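
One of those arguments that seems useful for vlookup-style work is validate, which asks merge to check the relationship between the keys. The sketch below uses small made-up frames (left, right, dup_right and their columns are hypothetical) to show a many-to-one check passing and then failing when the lookup table has a duplicated key.

```python
import pandas as pd

# Hypothetical data: several rows per key on the left, one row per key on the right.
left = pd.DataFrame({"key": [1, 2, 2], "qty": [10, 20, 30]})
right = pd.DataFrame({"key": [1, 2], "name": ["x", "y"]})

# Passes: every key is unique in the right-hand lookup table.
out = pd.merge(left, right, on="key", how="left", validate="many_to_one")

# A lookup table with a duplicated key would silently add rows to the result;
# with validate set, merge raises MergeError instead.
dup_right = pd.DataFrame({"key": [1, 2, 2], "name": ["x", "y", "z"]})
try:
    pd.merge(left, dup_right, on="key", how="left", validate="many_to_one")
    duplicate_detected = False
except pd.errors.MergeError:
    duplicate_detected = True

print(out)
print(duplicate_detected)
```

This keeps the result the same length as the left frame, which is usually the implicit expectation when porting a VLOOKUP.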
