
Using LIKE in JOIN query if values for join are in specific set

I have two sets:

  1. a dictionary of names
  2. "names" extracted from emails

I want to find out whether the "names" from set 2 are similar to, or contain, any of the names in set 1. For example:

  • johnsmith contains john, so it counts as a match.

How can I do this in Hive SQL?

select a.*, b.name as name_from_email
from set2 a left join 
     set1 b 
     on a.email_name rlike concat('%',b.name,'%')

I got this error: " Both left and right aliases encountered in JOIN ''%'':28:27, org.apache.hive.service.cli.operation.Operation:toSQLException:Operation.java:323, "

You can use instr, like the following:

on(1=1) where instr(a.email_name, b.name) != 0

I know this is painful, but it will solve the problem.

Cheers!!
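For reference, the instr fragment above could be written out as a full query along these lines. This is only a sketch: the table and column names (set1, set2, name, email_name) are taken from the question, and spelling the `on(1=1)` join as an explicit cross join is an assumption about what the fragment intends.

```sql
-- Sketch only: pair every extracted name with every dictionary name,
-- then keep the pairs where the dictionary name occurs inside the email name.
-- instr(str, substr) returns the 1-based position of substr in str, or 0 if absent.
select a.*, b.name as name_from_email
from set2 a
cross join set1 b
where instr(a.email_name, b.name) > 0;
```

Note that this returns matched rows only, so it behaves like an inner join rather than the left join in the question, and the intermediate result has one row per (set2 row, set1 row) pair before filtering.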

You can, rather painfully, do this to get the matches:

select a.*, b.name as name_from_email
from set2 a cross join 
     set1 b 
where a.email_name like concat('%', b.name, '%');
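One detail worth checking when adapting this is which operator goes with which pattern syntax. In Hive, LIKE uses SQL wildcards (`%`, `_`), while RLIKE treats the pattern as a Java regular expression, where `%` is just a literal character. A small sketch using literal values from the question's example:

```sql
-- Sketch: compare the two operators on the example from the question.
select
  'johnsmith' like  '%john%' as like_with_wildcards,   -- true: % matches any run of characters
  'johnsmith' rlike 'john'   as rlike_plain_substring, -- true: regex matches any substring
  'johnsmith' rlike '%john%' as rlike_with_percent;    -- false: looks for a literal '%john%'
```

If RLIKE is used with a dictionary value as the pattern, any regex metacharacters in the name (such as `.`) would be interpreted as regex syntax, so LIKE with `concat('%', b.name, '%')` or instr is usually the safer choice for a plain substring test.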

Getting the additional non-matched rows in is a pain. I think this should work:

select a.*, b.name as name_from_email
from set2 a cross join 
     set1 b 
where a.email_name like concat('%', b.name, '%')
union all
select a.*, null as name_from_email
from set2 a
where not exists (select 1
                  from set1 b
                  where a.email_name like concat('%', b.name, '%')
                 );

Note: You may find that this compiles but does not actually run.
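If the not exists version does not run, one possible workaround is to compute the matches first and then attach them with an ordinary equality left join, which is the kind of join condition Hive accepts. This is a sketch, not a tested solution: it assumes email_name is enough to tie a match back to its row in set2, and the aliases a2 and m are made up for illustration.

```sql
-- Sketch only: find (email_name, name) matches via a cross join,
-- then left join them back so unmatched set2 rows are kept with a null name.
select a.*, m.name as name_from_email
from set2 a
left join (
  -- distinct avoids duplicate pairs when set2 contains the same email_name twice
  select distinct a2.email_name, b.name
  from set2 a2
  cross join set1 b
  where instr(a2.email_name, b.name) > 0
) m
on a.email_name = m.email_name;
```

The cross join inside the subquery still compares every pair of rows, so this is only practical when at least one of the two sets is reasonably small.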
