pandas non-equi join on a range
I need to do a non-equi join in pandas, where each row of the first table is joined to the row of the second table whose range contains its value.
first_table
EMPLOYEE_ID SALARY
100 3000.00
101 17000.00
102 17000.00
103 9000.00
104 6000.00
105 4800.00
106 4800.00
………….. …………
………………. …………
second_table
grade_id lowest_sal highest_sal grade_level
1 0 3500 GRADE-A
2 3501 7000 GRADE-B
3 7001 10000 GRADE-C
4 10000 20000 GRADE-D
Needed table (output):
EMPLOYEE_ID SALARY grade_level
115 3000 GRADE-A
116 17000 GRADE-D
117 17000 GRADE-D
118 9000 GRADE-C
119 6000 GRADE-B
125 4800 GRADE-B
126 4800 GRADE-B
The equivalent SQL query is:
SELECT f.EMPLOYEE_ID,
f.SALARY,
s.grade_level
FROM first_table f JOIN second_table s
ON f.SALARY BETWEEN s.lowest_sal AND s.highest_sal
I can't use 'pd.merge' to join the tables because they have no common column. Please help me find a method.
Thanks
If df1 is your first table and df2 is your second table, you could do, for example, this:
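import numpy as np

# 'split' gives {'index': grade levels, 'data': [[grade_id, lowest_sal, highest_sal], ...]}
d = df2.set_index('grade_level').to_dict('split')
# For each salary, take the first grade whose inclusive range contains it, else NaN
df1['GRADE'] = df1['SALARY'].apply(
    lambda x: next((c for i, c in enumerate(d['index']) if d['data'][i][1] <= x <= d['data'][i][2]), np.nan)
)
print(df1)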
Prints:
EMPLOYEE_ID SALARY GRADE
0 100 3000.0 GRADE-A
1 101 17000.0 GRADE-D
2 102 17000.0 GRADE-D
3 103 9000.0 GRADE-C
4 104 6000.0 GRADE-B
5 105 4800.0 GRADE-B
6 106 4800.0 GRADE-B
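To try the snippet above end to end, the two frames can be built from the sample rows in the question. This is only a minimal sketch: df1 and df2 follow the answer's naming, the data is copied from the question's tables, and the helper name lookup_grade is hypothetical. It spells out the same first-match range lookup as an explicit loop.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question's tables
df1 = pd.DataFrame({
    "EMPLOYEE_ID": [100, 101, 102, 103, 104, 105, 106],
    "SALARY": [3000.0, 17000.0, 17000.0, 9000.0, 6000.0, 4800.0, 4800.0],
})
df2 = pd.DataFrame({
    "grade_id": [1, 2, 3, 4],
    "lowest_sal": [0, 3501, 7001, 10000],
    "highest_sal": [3500, 7000, 10000, 20000],
    "grade_level": ["GRADE-A", "GRADE-B", "GRADE-C", "GRADE-D"],
})


def lookup_grade(salary, grades):
    """Hypothetical helper: first grade_level whose inclusive range contains salary, else NaN."""
    for row in grades.itertuples(index=False):
        if row.lowest_sal <= salary <= row.highest_sal:
            return row.grade_level
    return np.nan


# Row-by-row lookup; fine for small tables, slow for large ones
df1["GRADE"] = df1["SALARY"].apply(lambda s: lookup_grade(s, df2))
print(df1)
```

Because the lookup stops at the first matching range, each employee gets at most one grade, and a salary outside every range gets NaN.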
One option is conditional_join from pyjanitor, which avoids a Cartesian join (helpful for memory and performance, depending on the data size):
# pip install pyjanitor
import pandas as pd
import janitor  # importing janitor registers conditional_join on DataFrame

(first_table
    .astype({'SALARY': int})
    .conditional_join(
        second_table,
        ('SALARY', 'lowest_sal', '>='),
        ('SALARY', 'highest_sal', '<='))
    .loc[:, ['EMPLOYEE_ID', 'SALARY', 'grade_level']]
)
EMPLOYEE_ID SALARY grade_level
0 100 3000 GRADE-A
1 101 17000 GRADE-D
2 102 17000 GRADE-D
3 103 9000 GRADE-C
4 104 6000 GRADE-B
5 105 4800 GRADE-B
6 106 4800 GRADE-B
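For comparison, the Cartesian-join approach that conditional_join avoids can be sketched in plain pandas. This is an illustrative sketch, not part of the answer above: it assumes pandas 1.2 or later (for how="cross"), uses the sample rows from the question, and the names crossed, in_range, and result are made up for the example.

```python
import pandas as pd

# Sample rows copied from the question's tables
first_table = pd.DataFrame({
    "EMPLOYEE_ID": [100, 101, 102, 103, 104, 105, 106],
    "SALARY": [3000, 17000, 17000, 9000, 6000, 4800, 4800],
})
second_table = pd.DataFrame({
    "grade_id": [1, 2, 3, 4],
    "lowest_sal": [0, 3501, 7001, 10000],
    "highest_sal": [3500, 7000, 10000, 20000],
    "grade_level": ["GRADE-A", "GRADE-B", "GRADE-C", "GRADE-D"],
})

# Cartesian product: every employee row paired with every grade row
crossed = first_table.merge(second_table, how="cross")

# Keep only pairs where SALARY lies in the inclusive range, like SQL BETWEEN
in_range = crossed["SALARY"].between(crossed["lowest_sal"], crossed["highest_sal"])
result = (
    crossed.loc[in_range, ["EMPLOYEE_ID", "SALARY", "grade_level"]]
    .reset_index(drop=True)
)
print(result)
```

The intermediate frame has len(first_table) * len(second_table) rows, so this only suits small tables. Like the SQL query, it returns every matching range: a salary of exactly 10000 would match both GRADE-C and GRADE-D, because the ranges in second_table overlap at that value.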