Using LIKE in JOIN query if values for join are in specific set

I have two sets:

  1. dictionary of names
  2. "names" extracted from emails

I want to find out whether any "name" from set 2 contains (or is similar to) any name from set 1. For example:

  • johnsmith contains john, so it should match.

How can I do this in Hive SQL?

select a.*, b.name as name_from_email
from set2 a left join 
     set1 b 
     on a.email_name rlike concat('%',b.name,'%')

I got this error: "Both left and right aliases encountered in JOIN ''%'':28:27, org.apache.hive.service.cli.operation.Operation:toSQLException:Operation.java:323, "

You can use instr as follows:

on(1=1) where instr(a.email_name, b.name) != 0

I know this is painful, because on(1=1) pairs every row of one table with every row of the other before the filter is applied, but it will solve the problem.

Cheers!!
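
Combined with the select list from the question, the fragment above might look like the following. This is only a sketch, assuming the table and column names from the question (set1.name, set2.email_name). instr returns the 1-based position of the first occurrence of its second argument and 0 when there is none, so != 0 keeps the pairs where the email name contains the dictionary name.

```sql
-- Sketch only: table and column names are taken from the question.
select a.*, b.name as name_from_email
from set2 a
join set1 b
  on (1 = 1)                               -- no usable equality key, so every pair is compared
where instr(a.email_name, b.name) != 0;    -- keep pairs where email_name contains name
```

Because the filter sits in the where clause, rows of set2 that match no dictionary name are dropped from the result, even if the join is written as a left join.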

You can rather painfully do this to get matches:

select a.*, b.name as name_from_email
from set2 a cross join 
     set1 b 
where a.email_name like concat('%', b.name, '%');

Including the additional non-matched rows is a pain. I think this should work:

select a.*, b.name as name_from_email
from set2 a cross join 
     set1 b 
where a.email_name like concat('%', b.name, '%')
union all
select a.*, cast(null as string) as name_from_email
from set2 a
where not exists (select 1
                  from set1 b
                  where a.email_name like concat('%', b.name, '%')
                 );

Note: You may find that this compiles but does not actually run.
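
One detail worth checking in queries like the ones above is the difference between like and rlike. In Hive, like uses % as a wildcard, while rlike takes a Java regular expression that may match anywhere in the string, so % is treated there as a literal percent sign. The statements below are a small illustration with hard-coded literals; they assume a Hive version that allows select without a from clause.

```sql
-- like: % matches any sequence of characters
select 'johnsmith' like concat('%', 'john', '%');    -- true

-- rlike: the pattern is a regex and can match any substring
select 'johnsmith' rlike 'john';                     -- true

-- rlike with % requires literal percent signs around john
select 'johnsmith' rlike concat('%', 'john', '%');   -- false
```

So a containment test can be written either as like concat('%', b.name, '%') or as rlike b.name. The rlike form interprets any regex metacharacters that happen to occur in b.name.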
