Using LIKE in a JOIN query when the joined value is contained in a specific set

Question

I have two sets:

1. a dictionary of names
2. "names" extracted from emails

I want to find out whether the "names" in set 2 are similar to, or contain, any of the names in set 1. For example:

johnsmith contains john, so it is OK.

How can I do this in HIVE SQL?

select a.*, b.name as name_from_email
from set2 a left join 
     set1 b 
     on a.email_name rlike concat('%',b.name,'%')

I got this error: " Both left and right aliases encountered in JOIN ''%'':28:27, org.apache.hive.service.cli.operation.Operation:toSQLException:Operation.java:323, "

Answer 1

You can use instr as follows:

on(1=1) where instr(a.email_name, b.name) != 0

I know this is painful, but it will solve the problem.
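
For completeness, here is how that fragment might fit into a full query. It is only a sketch that reuses the table and column names from the question. Note that the where filter discards email names with no match, so the result behaves like an inner join, and depending on its configuration Hive may refuse the Cartesian product in strict mode.

```sql
-- Sketch only: set1, set2, name and email_name are the names used in the question.
-- The join condition is always true, so every pair of rows is produced;
-- the matching logic lives in the where clause instead of the on clause.
select a.*, b.name as name_from_email
from set2 a
left join set1 b on (1 = 1)
-- instr returns the 1-based position of b.name inside a.email_name, or 0 if absent
where instr(a.email_name, b.name) != 0;
```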

Cheers!!

Answer 2

You can rather painfully do this to get matches:

-- '%' is a LIKE wildcard, so use like here; rlike expects a regular expression
select a.*, b.name as name_from_email
from set2 a cross join
     set1 b
where a.email_name like concat('%', b.name, '%');
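
One detail worth checking when writing the pattern: `%` is a wildcard only for like. rlike takes a Java regular expression, in which `%` is an ordinary character, so a pattern wrapped in `%` only matches strings that actually contain percent signs. A quick way to compare the two, using literal strings rather than the real tables:

```sql
-- Sketch with literal values; run it to compare the three results side by side.
select 'johnsmith' like  concat('%', 'john', '%') as like_with_wildcards,
       'johnsmith' rlike 'john'                   as rlike_plain_pattern,
       'johnsmith' rlike concat('%', 'john', '%') as rlike_with_percent;
```

With rlike the bare name is enough, because the regular expression may match anywhere inside the string.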

Getting the additional non-matched rows in is a pain. I think this should work:

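-- matched rows: one output row per (email name, dictionary name) match
select a.*, b.name as name_from_email
from set2 a cross join
     set1 b
where a.email_name like concat('%', b.name, '%')
union all
-- unmatched rows: email names that contain no dictionary name
select a.*, null
from set2 a
where not exists (select 1
                  from set1 b
                  where a.email_name like concat('%', b.name, '%')
                 );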

Note: You may find that this compiles but does not actually run.
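
If the not exists version does not run, one possible workaround is to compute the matches first and then attach them with an ordinary equality join, which Hive accepts in an on clause. This is an untested sketch: the CTE name matches is made up for illustration, and it assumes email_name is enough to identify a row of set2.

```sql
-- Sketch only: "matches" is a hypothetical CTE name; tables and columns come from the question.
with matches as (
  -- every distinct (email name, dictionary name) pair where the name occurs in the email name
  select distinct a.email_name, b.name
  from set2 a cross join
       set1 b
  where instr(a.email_name, b.name) > 0
)
-- the left join on equality keeps unmatched set2 rows, with null as name_from_email
select a.*, m.name as name_from_email
from set2 a
left join matches m
  on a.email_name = m.email_name;
```

The Cartesian product is still there, so this does not make the query cheaper; it only moves the non-equality condition out of the join and the subquery.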

Answer 1
0 accepted 2019-08-26 12:03:05

Answer 2
0 2019-08-26 12:05:54