
Merge Pandas DataFrame using apply() to only merge on partial match in two columns

I need to merge two pandas DataFrames, not only on exact column values but also on approximate ones.

For example, I have these two DataFrames:

import pandas as pd
d = {'col1': ["a", "b", "c", "d"], 'col2': [3, 4, 66, 120]}
df = pd.DataFrame(data=d)

    col1    col2
0   a       3
1   b       4
2   c       66
3   d       120

d2 = {'col1a': ["aa", "bb", "cc", "dd"], 'col2b': [3, 4, 67, 100]}
df2 = pd.DataFrame(data=d2)
    col1a   col2b
0   aa      3
1   bb      4
2   cc      67
3   dd      100

Now, if I simply join them on the col2 and col2b columns, I only get the two rows where the column values are exactly the same.

pd.merge(df, df2, how='inner', left_on='col2', right_on='col2b')
    col1    col2    col1a   col2b
0   a       3       aa      3
1   b       4       bb      4

Now, to keep the example simple, say I also want to match rows where the value in the right DataFrame is +1 or -1 away from the integer value in the left DataFrame. In our example, the value 66 in the left DataFrame should then be matched to the value 67 in the right DataFrame, in addition to the rows with values 3 and 4:

        col1    col2    col1a   col2b
    0   a       3       aa      3
    1   b       4       bb      4
    2   c       66      cc      67

I'm not sure how to approach this problem. Would I somehow need to merge on the approximate column values using apply()?

Here is one way, using merge_asof (both DataFrames must be sorted by their key columns, which they already are here):

pd.merge_asof(df, df2, left_on='col2', right_on='col2b', tolerance=1, direction='nearest').dropna()
Out[7]: 
  col1  col2 col1a  col2b
0    a     3    aa    3.0
1    b     4    bb    4.0
2    c    66    cc   67.0
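For reference, here is a self-contained sketch of the same idea, with two additions that are not part of the answer above. First, it sorts both frames explicitly, since merge_asof requires sorted keys, and casts col2b back to integers after the unmatched rows are dropped. Second, it shows a cross join followed by a filter as an alternative, assuming pandas 1.2 or later for how="cross". The variable names (left, right, nearest, pairs, all_matches) are just illustrative. Note that the two approaches are not equivalent: merge_asof keeps at most one (the nearest) match per left row, while the cross join keeps every pair within the tolerance and can get expensive on large frames.

```python
import pandas as pd

df = pd.DataFrame({"col1": ["a", "b", "c", "d"], "col2": [3, 4, 66, 120]})
df2 = pd.DataFrame({"col1a": ["aa", "bb", "cc", "dd"], "col2b": [3, 4, 67, 100]})

# merge_asof requires both frames to be sorted by their key columns.
left = df.sort_values("col2")
right = df2.sort_values("col2b")

# Approach 1: nearest match within a tolerance of 1 (at most one match per left row).
nearest = pd.merge_asof(
    left,
    right,
    left_on="col2",
    right_on="col2b",
    tolerance=1,
    direction="nearest",
).dropna(subset=["col2b"])

# Unmatched rows introduced NaN, which turned col2b into floats; cast it back.
nearest = nearest.astype({"col2b": "int64"}).reset_index(drop=True)
print(nearest)

# Approach 2 (assumes pandas >= 1.2): pair every left row with every right row,
# then keep the pairs whose values differ by at most 1.
pairs = df.merge(df2, how="cross")
all_matches = pairs[(pairs["col2"] - pairs["col2b"]).abs() <= 1].reset_index(drop=True)
print(all_matches)
```

With this data, the cross join also pairs 3 with 4 and 4 with 3, because those values differ by exactly 1, so it returns five rows instead of three. Which behaviour is wanted depends on whether each left row should match only its closest neighbour or every neighbour within range.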
