
Merge on two columns with two different tolerances, pandas

My input:

import pandas as pd

df1=pd.DataFrame(
    {
        'A':['my ','fire','water','earth','monkey'],
        'B':[1,5,7,8,9],
        'C':[100,105,110,182,140]
                 })
print(df1)
        A  B    C
0     my   1  100
1    fire  5  105
2   water  7  110
3   earth  8  182
4  monkey  9  140

df2=pd.DataFrame(
    {
        'A':['drop','hold','push','pull','keep'],
        'B':[1,4,4,10,10],
        'C':[103,102,133,124,142]
                 })
print(df2)
      A   B    C
0  drop   1  103
1  hold   4  102
2  push   4  133
3  pull  10  124
4  keep  10  142

I want to merge these two DataFrames (df1 and df2) using pd.merge_asof() or any other approach.

I can merge on one column with a single tolerance: df = pd.merge_asof(df1, df2, on='B', direction='nearest', tolerance=2)

But I need two different tolerances, one for column B and one for column C: B_tol = 2, C_tol = 4

Expected output:

      A_x  B_x  C_x   A_y  B_y  C_y
0     my     1  100  drop    1  103
1    fire    5  105  hold    4  102
2  monkey    9  140  keep   10  142

The real output will have only one B column and one C column once I merge on those columns; the table above is only an example of how the matching should work.
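One possible way to apply both tolerances at once, sketched here as an assumption rather than as the accepted answer: pair every row of df1 with every row of df2 and keep only the pairs where both differences are within tolerance. This uses how="cross", which needs pandas 1.2 or newer, and it is not a "nearest" match: every pair inside both tolerances is kept.

```python
import pandas as pd

df1 = pd.DataFrame({
    'A': ['my ', 'fire', 'water', 'earth', 'monkey'],
    'B': [1, 5, 7, 8, 9],
    'C': [100, 105, 110, 182, 140],
})
df2 = pd.DataFrame({
    'A': ['drop', 'hold', 'push', 'pull', 'keep'],
    'B': [1, 4, 4, 10, 10],
    'C': [103, 102, 133, 124, 142],
})

B_tol, C_tol = 2, 4

# Pair every df1 row with every df2 row; overlapping column names
# get the _x (df1) and _y (df2) suffixes.
pairs = df1.merge(df2, how='cross', suffixes=('_x', '_y'))

# Keep only the pairs where B and C are both within their tolerance.
within = (
    ((pairs['B_x'] - pairs['B_y']).abs() <= B_tol)
    & ((pairs['C_x'] - pairs['C_y']).abs() <= C_tol)
)
result = pairs[within].reset_index(drop=True)
print(result)
```

The cross join builds len(df1) * len(df2) rows, so this is probably only reasonable for small frames. If a df1 row could have several candidates inside both tolerances, an extra step would be needed to pick one of them.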

Can we discuss this solution? I don't want to paste it into the comments.

df1=pd.DataFrame(
    {
        'A':['my ','fire','water','earth','monkey'],
        'B':[1,5,7,8,9],
        'C':[100,105,110,182,140]
                 })

df2=pd.DataFrame(
    {
        'A':['drop','hold','push','pull','keep'],
        'B':[1,4,4,10,10],
        'C':[103,102,133,124,142]
                 })

df_1 = pd.merge_asof(df1,df2,on='B',direction='nearest',tolerance=2)
df_2 = pd.merge_asof(df1.sort_values(by='C'),df2.sort_values(by='C'),on='C',direction='nearest',tolerance=4)
df_3 = pd.merge_asof(df2,df1,on='B',direction='nearest',tolerance=2)
df_4 = pd.merge_asof(df2.sort_values(by='C'),df1.sort_values(by='C'),on='C',direction='nearest',tolerance=4)
# df= pd.merge_asof(df3.sort_values(by='C_x'),df4.sort_values(by='C'),left_on='C_x',right_on='C',direction='nearest',tolerance=4).dropna()
df_12 = pd.merge(df_1,df_2,on='A_x').dropna()
df_34 = pd.merge(df_3,df_4,on='A_x').dropna()

print(df_12)
      A_x  B  C_x A_y_x    C_y  B_x    C A_y_y   B_y
0     my   1  100  drop  103.0    1  100  hold   4.0
1    fire  5  105  push  133.0    5  105  drop   1.0
4  monkey  9  140  pull  124.0    9  140  keep  10.0
print(df_34)
    A_x   B  C_x   A_y_x  C_y  B_x    C   A_y_y  B_y
0  drop   1  103     my   100    1  103    fire  5.0
1  hold   4  102    fire  105    4  102     my   1.0
4  keep  10  142  monkey  140   10  142  monkey  9.0
df = pd.merge(df_12,df_34,left_index=True,right_index=True)
print(df[['A_x_x','A_x_y']])
    A_x_x A_x_y
0     my   drop
1    fire  hold
4  monkey  keep
