简体   繁体   中英

How do I perform a JOIN if one of multiple conditions is true?

My end goal... SELECT all fields from qry1 on rows where either the home phone, cell phone, OR work phone match tbl2

This is my current SQL syntax that's "failing." By failing I mean it hasn't finished executing after 10-15 minutes while the individual joins each (independently) run in a minute or two a piece.

SELECT qry1.* FROM qry1 INNER JOIN tbl2 ON ((qry1.CellPhone = tbl2.CellPhone) OR (qry1.HomePhone = tbl2.HomePhone) OR (qry1.WorkPhone = tbl2.WorkPhone));

Questions: Are there errors in my SQL syntax? Is there a better way to complete my task? Would it make more sense for me to simply run the 3 (Home, work, cell) JOIN queries independently, UNION them, and then dedupe if necessary?

If the individual joins work fast, the composite OR condition is probably slow because it cannot use a single index whereas the individual conditions can each use a single index for the three joining conditions. Because it can't use one index, it is probably doing non-indexed sequential table scans. (You should study the query plan so you understand what the optimizer is actually doing.)

Given that the individual queries work reasonably fast, therefore, you should get dramatically better performance using UNION (unless the optimizer in your DBMS has a blind spot):

SELECT qry1.* FROM qry1 INNER JOIN tbl2 ON qry1.CellPhone = tbl2.CellPhone
UNION
SELECT qry1.* FROM qry1 INNER JOIN tbl2 ON qry1.HomePhone = tbl2.HomePhone
UNION
SELECT qry1.* FROM qry1 INNER JOIN tbl2 ON qry1.WorkPhone = tbl2.WorkPhone

That should give you a result approximately as quickly as the 3 individual queries. It won't be quite as fast because UNION does duplicate elimination (which the individual queries do not, of course). You could use UNION ALL, but if there are many rows in the two tables where 2 or 3 of the pairs of fields match, that could lead to a lot of repetition in the results.

or can be rather difficult for SQL optimizers. I would suggest 3 indexes and the folllowing query:

SELECT qry1.*
FROM qry1
WHERE EXISTS (SELECT 1 FROM tbl2 WHERE qry1.CellPhone = tbl2.CellPhone) OR
      EXISTS (SELECT 1 FROM tbl2 WHERE qry1.HomePhone = tbl2.HomePhone) OR
      EXISTS (SELECT 1 FROM tbl2 WHERE qry1.WorkPhone = tbl2.WorkPhone);

The three indexes are tbl2(CellPhone) , tbl2(HomePhone) , and tbl2(WorkPhone) .

If this is essentially a self join, where the same people exist in both tbl1 and qry1, then you will probably not be interested where people match themselves, as well as duplicate permutations of the results. What I mean, is self joins usually have two records where Bob and Jane have the same phone number, and thus your result has joined records of:

Bob, Jane
Jane, Bob
Jane, Jane
Bob, Bob

Out of those you probably only need Bob, Jane since it tells you that those two people have a matching phone number. If you have some unique ID then adding where qry1.SSN < tbl2.SNN will eliminate many of these duplicates and reduce your result set. Preferably this column would have an index.

The following is similar to https://stackoverflow.com/a/27696729/4350148 but uses disjoint union. May be more efficient

 SELECT qry1.* FROM qry1 INNER JOIN tbl2 ON qry1.CellPhone = tbl2.CellPhone
 UNION distinct
 SELECT qry1.* FROM qry1 INNER JOIN tbl2 ON  
    qry1.CellPhone != tbl2.CellPhone 
    and qry1.HomePhone = tbl2.HomePhone
 UNION distinct
 SELECT qry1.* FROM qry1 INNER JOIN tbl2 ON 
    qry1.CellPhone != tbl2.CellPhone 
    and qry1.HomePhone != tbl2.HomePhone
    and qry1.WorkPhone = tbl2.WorkPhone

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM