How do you perform a join to a table with “OR” conditions?

I have a SQL statement that I can't seem to solve for... I'm not sure how to perform an "OR" on my join. In fact, I'm not sure if I should even be doing a join at all... Here is what I have so far:

SELECT o.* FROM dbo.Orders o
    INNER JOIN dbo.Transactions t1 ON t1.OrderId = o.OrderId
                                  AND t1.Code = 'TX33'
    INNER JOIN dbo.Transactions t2 ON t2.OrderId = o.OrderId
                                  AND t2.Code = 'TX34'
WHERE o.PurchaseDate IS NOT NULL

I haven't run this yet, but I assume it will get me all orders that have a purchase date and also have BOTH TX33 and TX34 transactions. Any orders without both of those transactions won't show up (due to the INNER JOINs). The part I'm stuck on is this:

I also need to ensure that the order contains at least one of the following:

  • TX35 AND TX36
  • TX37
  • TX38 AND TX39

Only one of those additional conditions is necessary. I know I can't simply INNER JOIN each of them, because that would require all of them to be there. If I do a regular JOIN I could possibly make it work if one of the 'OR' conditions wasn't itself an 'AND' condition (I'm not sure how to do TX35 AND TX36 as one JOIN condition, nor TX38 AND TX39).

The selection logic needs to be in the WHERE clause. Perhaps something like this:

SELECT o.* FROM dbo.Orders AS o, dbo.Transactions AS t1, dbo.Transactions AS t2
WHERE t1.OrderId = o.OrderId AND t2.OrderId = o.OrderId
AND o.PurchaseDate IS NOT NULL
AND (
  (t1.Code = 'TX33' AND t2.Code = 'TX34') OR 
  (t1.Code = 'TX35' AND t2.Code = 'TX36') OR 
  (t1.Code = 'TX37') OR 
  (t1.Code = 'TX38' AND t2.Code = 'TX39') 
)

In the case where you need four independent selection criteria, you will need to JOIN the table four times, for example:

SELECT o.* FROM dbo.Orders AS o, dbo.Transactions AS t1, dbo.Transactions AS t2,
     dbo.Transactions AS t3, dbo.Transactions AS t4
WHERE t1.OrderId = o.OrderId AND t2.OrderId = o.OrderId 
AND t3.OrderId = o.OrderId AND t4.OrderId = o.OrderId
AND o.PurchaseDate IS NOT NULL
AND (t1.Code = 'TX33' AND t2.Code = 'TX34')
AND ( 
  (t3.Code = 'TX35' AND t4.Code = 'TX36') OR 
  (t3.Code = 'TX37') OR 
  (t3.Code = 'TX38' AND t4.Code = 'TX39') 
)
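One thing to watch with the four-way join: when the TX37 branch matches, t4 is left unconstrained, so the same order comes back once per transaction it has. The sketch below is a minimal illustration of that effect, not a definitive implementation. It uses Python's built-in sqlite3 as a stand-in for SQL Server, and the table layout, sample rows and helper names (build_db, matching_orders) are assumptions made up for the example.

```python
import sqlite3

# Hypothetical sample data: OrderId -> transaction codes.
# Order 8 is given a NULL PurchaseDate so the date filter has something to reject.
TRANSACTIONS = {
    3: ["TX33", "TX34"],
    4: ["TX33", "TX34", "TX37"],
    5: ["TX33", "TX34", "TX35", "TX36"],
    7: ["TX35", "TX36", "TX37", "TX38", "TX39"],
    8: ["TX33", "TX34", "TX37"],
}

# Same logic as the four-alias query, written with explicit JOIN syntax.
JOIN_QUERY = """
SELECT {distinct} o.OrderId
FROM Orders AS o
JOIN Transactions AS t1 ON t1.OrderId = o.OrderId AND t1.Code = 'TX33'
JOIN Transactions AS t2 ON t2.OrderId = o.OrderId AND t2.Code = 'TX34'
JOIN Transactions AS t3 ON t3.OrderId = o.OrderId
JOIN Transactions AS t4 ON t4.OrderId = o.OrderId
WHERE o.PurchaseDate IS NOT NULL
  AND (   (t3.Code = 'TX35' AND t4.Code = 'TX36')
       OR  t3.Code = 'TX37'
       OR (t3.Code = 'TX38' AND t4.Code = 'TX39') )
ORDER BY o.OrderId
"""


def build_db():
    """Create an in-memory database holding the sample orders."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Orders (OrderId INTEGER PRIMARY KEY, PurchaseDate TEXT)")
    conn.execute("CREATE TABLE Transactions (OrderId INTEGER, Code TEXT)")
    for order_id, codes in TRANSACTIONS.items():
        purchase_date = None if order_id == 8 else "2024-01-01"
        conn.execute("INSERT INTO Orders VALUES (?, ?)", (order_id, purchase_date))
        conn.executemany(
            "INSERT INTO Transactions VALUES (?, ?)",
            [(order_id, code) for code in codes],
        )
    return conn


def matching_orders(conn, distinct=True):
    """Return the OrderIds produced by the join, with or without DISTINCT."""
    sql = JOIN_QUERY.format(distinct="DISTINCT" if distinct else "")
    return [row[0] for row in conn.execute(sql)]


if __name__ == "__main__":
    conn = build_db()
    print(matching_orders(conn, distinct=False))  # order 4 is repeated
    print(matching_orders(conn))                  # one row per order
```

In this sample, order 4 qualifies only through TX37 and has three transactions, so the plain join returns it three times. Adding DISTINCT (or switching to EXISTS) collapses that to one row per order.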

You can have a complex condition in an ON clause. Using a LEFT OUTER JOIN allows you to handle the odd case (TX37) in the WHERE clause.

Note that references to R in the WHERE clause must handle NULLs to avoid converting the outer join to an inner join.

select L.*
  from dbo.Orders as L left outer join
    dbo.Orders as R on R.OrderId = L.OrderId and (
      ( L.Code = 'TX33' and R.Code = 'TX34' ) or
      ( L.Code = 'TX35' and R.Code = 'TX36' ) or
      ( L.Code = 'TX38' and R.Code = 'TX39' ) )
  where L.PurchaseDate is not NULL and ( L.Code = 'TX37' or R.Code is not NULL )

If you really want only orders that contain TX33, TX34 and one or more of the other patterns then it is a little more complicated. Using group by L.OrderId with a count( L.OrderId ) lets you find orders that have, say, two or more matches among the patterns. It begins to approach something like this:

declare @Orders as Table ( Id Int Identity, OrderId Int, Code VarChar(4), PurchaseDate Date )
insert into @Orders ( OrderId, Code, PurchaseDate ) values
  ( 1, 'TX37', GetDate() ),
  ( 2, 'TX37', GetDate() ), ( 2, 'FOO', GetDate() ),
  ( 3, 'TX33', GetDate() ), ( 3, 'TX34', GetDate() ),
  ( 4, 'TX33', GetDate() ), ( 4, 'TX34', GetDate() ), ( 4, 'TX37', GetDate() ),
  ( 5, 'TX33', GetDate() ), ( 5, 'TX34', GetDate() ), ( 5, 'TX35', GetDate() ),
    ( 5, 'TX36', GetDate() ),
  ( 6, 'TX33', GetDate() ), ( 6, 'TX34', GetDate() ), ( 6, 'TX35', GetDate() ),
    ( 6, 'TX36', GetDate() ), ( 6, 'TX37', GetDate() ),
  ( 7, 'TX38', GetDate() ), ( 7, 'TX39', GetDate() ), ( 7, 'TX35', GetDate() ),
    ( 7, 'TX36', GetDate() ), ( 7, 'TX37', GetDate() )

select * from (
  select L.OrderId,
    Max( case when L.Code = 'TX33' and R.Code = 'TX34' then 1 else 0 end ) as Mandatory,
    Count( L.OrderId ) as Matches
    from @Orders as L left outer join
      @Orders as R on R.OrderId = L.OrderId and (
        ( L.Code = 'TX33' and R.Code = 'TX34' ) or
        ( L.Code = 'TX35' and R.Code = 'TX36' ) or
        ( L.Code = 'TX38' and R.Code = 'TX39' ) )
    where L.PurchaseDate is not NULL and ( L.Code = 'TX37' or R.Code is not NULL )
    group by L.OrderId ) as Arnold
  where Mandatory = 1 and Matches > 1

SELECT o.* 
FROM dbo.Orders o
WHERE EXISTS ( SELECT *   FROM dbo.Transactions t1 
               WHERE t1.OrderId = o.OrderId   AND t1.Code = 'TX33'
             )
  AND EXISTS ( SELECT *   FROM dbo.Transactions t2 
               WHERE t2.OrderId = o.OrderId   AND t2.Code = 'TX34'
             )
  AND
    (     EXISTS ( SELECT *   FROM dbo.Transactions t1 
                   WHERE t1.OrderId = o.OrderId   AND t1.Code = 'TX35'
                 )
      AND EXISTS ( SELECT *   FROM dbo.Transactions t2 
                   WHERE t2.OrderId = o.OrderId   AND t2.Code = 'TX36'
                 )

    OR  EXISTS ( SELECT *   FROM dbo.Transactions t 
                 WHERE t.OrderId = o.OrderId    AND t.Code = 'TX37'
               )

    OR    EXISTS ( SELECT *   FROM dbo.Transactions t1 
                   WHERE t1.OrderId = o.OrderId   AND t1.Code = 'TX38'
                 )
      AND EXISTS ( SELECT *   FROM dbo.Transactions t2 
                   WHERE t2.OrderId = o.OrderId   AND t2.Code = 'TX39'
                 )
    ) ;
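To try the EXISTS logic against sample data, here is a rough sketch using Python's built-in sqlite3 rather than SQL Server. The sample orders mirror the test rows from the table-variable answer above. The simplified schema and the has and orders_matching helpers are assumptions for illustration only. Each AND pair is parenthesised explicitly so the result does not depend on operator precedence.

```python
import sqlite3

# Sample orders taken from the table-variable example: OrderId -> codes.
SAMPLE = {
    1: ["TX37"],
    2: ["TX37", "FOO"],
    3: ["TX33", "TX34"],
    4: ["TX33", "TX34", "TX37"],
    5: ["TX33", "TX34", "TX35", "TX36"],
    6: ["TX33", "TX34", "TX35", "TX36", "TX37"],
    7: ["TX38", "TX39", "TX35", "TX36", "TX37"],
}


def has(code):
    """Build a correlated EXISTS test for one hard-coded transaction code."""
    return (
        "EXISTS (SELECT 1 FROM Transactions AS t "
        f"WHERE t.OrderId = o.OrderId AND t.Code = '{code}')"
    )


# TX33 and TX34 are mandatory; at least one of the three groups must also hold.
EXISTS_QUERY = f"""
SELECT o.OrderId
FROM Orders AS o
WHERE {has('TX33')}
  AND {has('TX34')}
  AND (   ({has('TX35')} AND {has('TX36')})
       OR  {has('TX37')}
       OR ({has('TX38')} AND {has('TX39')}) )
ORDER BY o.OrderId
"""


def orders_matching(sample):
    """Load the sample into an in-memory database and run the EXISTS query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Orders (OrderId INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE Transactions (OrderId INTEGER, Code TEXT)")
    for order_id, codes in sample.items():
        conn.execute("INSERT INTO Orders VALUES (?)", (order_id,))
        conn.executemany(
            "INSERT INTO Transactions VALUES (?, ?)",
            [(order_id, code) for code in codes],
        )
    return [row[0] for row in conn.execute(EXISTS_QUERY)]


if __name__ == "__main__":
    print(orders_matching(SAMPLE))  # prints [4, 5, 6]
```

Because EXISTS only asks whether a matching row is present, each order appears at most once no matter how many transactions it has.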

You could also write it like this:

SELECT o.* 
FROM dbo.Orders o
  JOIN
    ( SELECT OrderId
      FROM dbo.Transactions
      WHERE Code IN ('TX33', 'TX34', 'TX35', 'TX36', 'TX37', 'TX38', 'TX39')
      GROUP BY OrderId
      HAVING COUNT(DISTINCT CASE WHEN Code = 'TX33' THEN Code END) = 1
         AND COUNT(DISTINCT CASE WHEN Code = 'TX34' THEN Code END) = 1
         AND ( COUNT(DISTINCT 
                     CASE WHEN Code IN ('TX35', 'TX36') THEN Code END) = 2
            OR COUNT(DISTINCT CASE WHEN Code = 'TX37' THEN Code END) = 1
            OR COUNT(DISTINCT 
                     CASE WHEN Code IN ('TX38', 'TX39') THEN Code END) = 2
             ) 
    ) t
    ON t.OrderId = o.OrderId ;
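The COUNT(DISTINCT CASE ...) pattern can be checked in isolation. The following is a minimal sketch, assuming SQLite (via Python's sqlite3) behaves like SQL Server for this construct. The sample rows and the qualifying_orders helper are made up for the example. Order 1 deliberately contains TX33 twice, to show that DISTINCT keeps duplicate transactions from inflating the counts.

```python
import sqlite3

# Hypothetical (OrderId, Code) rows; order 1 has a duplicate TX33.
ROWS = [
    (1, "TX33"), (1, "TX33"), (1, "TX34"), (1, "TX37"),
    (2, "TX33"), (2, "TX34"), (2, "TX35"),            # TX35 without TX36
    (3, "TX33"), (3, "TX34"), (3, "TX38"), (3, "TX39"),
    (4, "TX35"), (4, "TX36"), (4, "TX37"),            # mandatory codes missing
]

# CASE yields NULL for non-matching rows, and COUNT ignores NULLs,
# so each COUNT(DISTINCT ...) counts how many codes of a group are present.
HAVING_QUERY = """
SELECT OrderId
FROM Transactions
WHERE Code IN ('TX33', 'TX34', 'TX35', 'TX36', 'TX37', 'TX38', 'TX39')
GROUP BY OrderId
HAVING COUNT(DISTINCT CASE WHEN Code IN ('TX33', 'TX34') THEN Code END) = 2
   AND (   COUNT(DISTINCT CASE WHEN Code IN ('TX35', 'TX36') THEN Code END) = 2
        OR COUNT(DISTINCT CASE WHEN Code = 'TX37' THEN Code END) = 1
        OR COUNT(DISTINCT CASE WHEN Code IN ('TX38', 'TX39') THEN Code END) = 2 )
ORDER BY OrderId
"""


def qualifying_orders(rows):
    """Return the OrderIds whose transactions satisfy the HAVING conditions."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Transactions (OrderId INTEGER, Code TEXT)")
    conn.executemany("INSERT INTO Transactions VALUES (?, ?)", rows)
    return [row[0] for row in conn.execute(HAVING_QUERY)]


if __name__ == "__main__":
    print(qualifying_orders(ROWS))  # prints [1, 3]
```

This variant reads the Transactions table once and groups it, instead of probing it once per code.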

After fiddling with it for a while, I think I have achieved your goal with the following query:

SELECT * FROM (
    SELECT o.*, t3.Code as t3 FROM dbo.Orders o
        INNER JOIN dbo.Transactions t1 ON t1.OrderId = o.OrderId
                                      AND t1.Code = 'TX33'
        INNER JOIN dbo.Transactions t2 ON t2.OrderId = o.OrderId
                                      AND t2.Code = 'TX34'
        INNER JOIN dbo.Transactions t3 ON t3.OrderId = o.OrderId
                                      AND (
                                           t3.Code = 'TX35' OR 
                                           t3.Code = 'TX37' OR
                                           t3.Code = 'TX38'
                                          )
    ) AS x
    WHERE x.t3 = 'TX37'
        OR (x.t3 = 'TX35' AND EXISTS (SELECT t.Code FROM dbo.Transactions t WHERE t.OrderId = x.OrderId AND t.Code = 'TX36'))
        OR (x.t3 = 'TX38' AND EXISTS (SELECT t.Code FROM dbo.Transactions t WHERE t.OrderId = x.OrderId AND t.Code = 'TX39'))

The inner SELECT should return only Orders linked to Transactions with codes TX33, TX34, and at least one of TX35, TX37 or TX38. We keep a copy of this last code in the results.

Then we have to further narrow down the list, by keeping Orders whose 3rd code was either TX37 (no further conditions needed) or orders which have the remaining code associated to them.

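I think this approach should perform better than joining the Transactions table four times without filtering it first. Assuming a naive nested-loop evaluation with no indexes, where O is the number of orders and T the number of transactions, it should require about O*(T+T+T)*(T+T) = 6*O*T^2 iterations, whereas the four-unfiltered-joins approach would require O*T*T*T*T = O*T^4 iterations. A real query optimizer may well behave differently.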
