Merge on two columns with two different tolerances, pandas
My input:
import pandas as pd
df1=pd.DataFrame(
{
'A':['my ','fire','water','earth','monkey'],
'B':[1,5,7,8,9],
'C':[100,105,110,182,140]
})
print(df1)
        A  B    C
0     my   1  100
1    fire  5  105
2   water  7  110
3   earth  8  182
4  monkey  9  140
df2=pd.DataFrame(
{
'A':['drop','hold','push','pull','keep'],
'B':[1,4,4,10,10],
'C':[103,102,133,124,142]
})
print(df2)
      A   B    C
0  drop   1  103
1  hold   4  102
2  push   4  133
3  pull  10  124
4  keep  10  142
I want to merge these two DataFrames (df1 and df2) using pd.merge_asof() or any other way.
I can merge them on one column with a single tolerance:
df= pd.merge_asof(df1,df2,on='B',direction='nearest',tolerance=2)
But I need to use two different tolerances, one for column B and one for column C: B_tol = 2, C_tol = 4
Expected output:
      A_x  B_x  C_x   A_y  B_y  C_y
0     my     1  100  drop    1  103
1    fire    5  105  hold    4  102
2  monkey    9  140  keep   10  142
The real output will have only one B column and one C column once I merge on those columns; the output above is only an example of how the matching should work.
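One way to apply a separate tolerance to each column, as a minimal sketch rather than a definitive answer: build every (df1, df2) row pair with a cross join and keep the pairs whose B and C differences are both within bounds. This assumes pandas 1.2 or later (for how='cross'), treats the bounds as inclusive, and is only practical when len(df1) * len(df2) fits in memory. The names pairs, within_tol and result are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    'A': ['my ', 'fire', 'water', 'earth', 'monkey'],
    'B': [1, 5, 7, 8, 9],
    'C': [100, 105, 110, 182, 140],
})
df2 = pd.DataFrame({
    'A': ['drop', 'hold', 'push', 'pull', 'keep'],
    'B': [1, 4, 4, 10, 10],
    'C': [103, 102, 133, 124, 142],
})

B_tol = 2
C_tol = 4

# Every df1 row paired with every df2 row; overlapping column
# names get the _x (df1) and _y (df2) suffixes.
pairs = df1.merge(df2, how='cross', suffixes=('_x', '_y'))

# Keep only pairs that satisfy both tolerances (inclusive bounds assumed).
within_tol = (
    ((pairs['B_x'] - pairs['B_y']).abs() <= B_tol)
    & ((pairs['C_x'] - pairs['C_y']).abs() <= C_tol)
)
result = pairs[within_tol].reset_index(drop=True)
print(result)
```

On the sample data this keeps the three pairs from the expected output (my/drop, fire/hold, monkey/keep). Unlike pd.merge_asof(), it returns every pair that qualifies, so a df1 row can show up more than once if several df2 rows are close enough.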
Can we discuss this solution? I don't want to paste it into the comments.
df1=pd.DataFrame(
{
'A':['my ','fire','water','earth','monkey'],
'B':[1,5,7,8,9],
'C':[100,105,110,182,140]
})
df2=pd.DataFrame(
{
'A':['drop','hold','push','pull','keep'],
'B':[1,4,4,10,10],
'C':[103,102,133,124,142]
})
df_1 = pd.merge_asof(df1,df2,on='B',direction='nearest',tolerance=2)
df_2 = pd.merge_asof(df1.sort_values(by='C'),df2.sort_values(by='C'),on='C',direction='nearest',tolerance=4)
df_3 = pd.merge_asof(df2,df1,on='B',direction='nearest',tolerance=2)
df_4 = pd.merge_asof(df2.sort_values(by='C'),df1.sort_values(by='C'),on='C',direction='nearest',tolerance=4)
# df= pd.merge_asof(df3.sort_values(by='C_x'),df4.sort_values(by='C'),left_on='C_x',right_on='C',direction='nearest',tolerance=4).dropna()
df_12 = pd.merge(df_1,df_2,on='A_x').dropna()
df_34 = pd.merge(df_3,df_4,on='A_x').dropna()
print(df_12)
      A_x  B  C_x A_y_x    C_y  B_x    C A_y_y   B_y
0     my   1  100  drop  103.0    1  100  hold   4.0
1    fire  5  105  push  133.0    5  105  drop   1.0
4  monkey  9  140  pull  124.0    9  140  keep  10.0
print(df_34)
    A_x   B  C_x   A_y_x  C_y  B_x    C   A_y_y  B_y
0  drop   1  103     my   100    1  103    fire  5.0
1  hold   4  102    fire  105    4  102     my   1.0
4  keep  10  142  monkey  140   10  142  monkey  9.0
df = pd.merge(df_12,df_34,left_index=True,right_index=True)
print(df[['A_x_x','A_x_y']])
    A_x_x A_x_y
0     my   drop
1    fire  hold
4  monkey  keep
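If at most one match per df1 row is wanted (closer to how pd.merge_asof() behaves), one possible sketch is to filter candidate pairs by every tolerance and then keep the closest candidate per left row. The helper name merge_two_tolerances and the ranking rule (sum of each absolute difference divided by its tolerance) are assumptions made for illustration, not part of pandas or of the question. It needs pandas 1.2 or later for how='cross' and assumes neither frame already has a column named 'index' or 'score'.

```python
import pandas as pd

def merge_two_tolerances(left, right, tolerances):
    """Hypothetical helper: pair rows that are within every tolerance,
    keeping at most one right row per left row."""
    # reset_index() adds an 'index' column that identifies the left row.
    pairs = left.reset_index().merge(right, how='cross', suffixes=('_x', '_y'))
    keep = pd.Series(True, index=pairs.index)
    score = pd.Series(0.0, index=pairs.index)
    for col, tol in tolerances.items():
        diff = (pairs[f'{col}_x'] - pairs[f'{col}_y']).abs()
        keep &= diff <= tol      # inclusive bound (assumption)
        score += diff / tol      # assumed ranking: normalised distance
    pairs = pairs[keep].assign(score=score[keep])
    # Lowest score first, then keep the first candidate per left row.
    best = pairs.sort_values(['index', 'score']).drop_duplicates('index')
    return best.drop(columns=['index', 'score']).reset_index(drop=True)

df1 = pd.DataFrame({
    'A': ['my ', 'fire', 'water', 'earth', 'monkey'],
    'B': [1, 5, 7, 8, 9],
    'C': [100, 105, 110, 182, 140],
})
df2 = pd.DataFrame({
    'A': ['drop', 'hold', 'push', 'pull', 'keep'],
    'B': [1, 4, 4, 10, 10],
    'C': [103, 102, 133, 124, 142],
})

matched = merge_two_tolerances(df1, df2, {'B': 2, 'C': 4})
print(matched[['A_x', 'A_y']])
```

On the sample frames each df1 row has at most one qualifying partner, so the ranking never has to break a tie; it only matters when several df2 rows pass both tolerances for the same df1 row.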