I have a SQL statement that I can't seem to solve for... I'm not sure how to perform an "OR" on my join. In fact, I'm not sure if I should even be doing a join at all... Here is what I have so far:
SELECT o.* FROM dbo.Orders o
INNER JOIN dbo.Transactions t1 ON t1.OrderId = o.OrderId
AND t1.Code = 'TX33'
INNER JOIN dbo.Transactions t2 ON t2.OrderId = o.OrderId
AND t2.Code = 'TX34'
WHERE o.PurchaseDate IS NOT NULL
I haven't run this yet, but I assume it will get me all orders that have a purchase date and that also have BOTH TX33 and TX34 transactions. Any orders without both of those transactions won't show up (due to the INNER JOINs). The part I'm stuck at is this:
I also need to ensure that the order contains at least one of the following:
TX35 AND TX36
TX37
TX38 AND TX39
Only one of those additional conditions is necessary. I know I can't simply INNER JOIN because that means it's required to be there. If I do a regular JOIN I could possibly make it work if one of the 'OR' conditions wasn't itself an 'AND' condition (I'm not sure how to do TX35 AND TX36 as one JOIN condition, nor TX38 AND TX39).
The selection logic needs to be in the WHERE clause. Perhaps something like this:
SELECT o.* FROM dbo.Orders AS o, dbo.Transactions AS t1, dbo.Transactions AS t2
WHERE t1.OrderId = o.OrderId AND t2.OrderId = o.OrderId
AND o.PurchaseDate IS NOT NULL
AND (
(t1.Code = 'TX33' AND t2.Code = 'TX34') OR
(t1.Code = 'TX35' AND t2.Code = 'TX36') OR
(t1.Code = 'TX37') OR
(t1.Code = 'TX38' AND t2.Code = 'TX39')
)
In the case where you need four independent selection criteria, you will need to JOIN the table four times, for example:
SELECT o.* FROM dbo.Orders AS o, dbo.Transactions AS t1, dbo.Transactions AS t2,
dbo.Transactions AS t3, dbo.Transactions AS t4
WHERE t1.OrderId = o.OrderId AND t2.OrderId = o.OrderId
AND t3.OrderId = o.OrderId AND t4.OrderId = o.OrderId
AND o.PurchaseDate IS NOT NULL
AND (t1.Code = 'TX33' AND t2.Code = 'TX34')
AND (
(t3.Code = 'TX35' AND t4.Code = 'TX36') OR
(t3.Code = 'TX37') OR
(t3.Code = 'TX38' AND t4.Code = 'TX39')
)
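One thing worth checking with a multi-way join like this is how many rows come back per order. Here is a rough sketch that runs the same logic against an in-memory SQLite database from Python. SQLite is only a stand-in for SQL Server here (so there is no dbo schema), and the sample rows and the helper names build_db and rows_per_order are made up for illustration.

```python
import sqlite3

# Hypothetical sample data: order id -> transaction codes on that order.
SAMPLE = {
    1: ["TX37"],
    3: ["TX33", "TX34"],
    4: ["TX33", "TX34", "TX37"],
    5: ["TX33", "TX34", "TX35", "TX36"],
    6: ["TX33", "TX34", "TX35", "TX36", "TX37"],
    7: ["TX38", "TX39", "TX35", "TX36", "TX37"],
}

# Same selection logic as the four-way join above, but counting the
# joined rows per order instead of returning o.*.
FOUR_WAY_JOIN = """
SELECT o.OrderId, COUNT(*) AS JoinedRows
FROM Orders AS o, Transactions AS t1, Transactions AS t2,
     Transactions AS t3, Transactions AS t4
WHERE t1.OrderId = o.OrderId AND t2.OrderId = o.OrderId
  AND t3.OrderId = o.OrderId AND t4.OrderId = o.OrderId
  AND o.PurchaseDate IS NOT NULL
  AND (t1.Code = 'TX33' AND t2.Code = 'TX34')
  AND (
        (t3.Code = 'TX35' AND t4.Code = 'TX36') OR
        (t3.Code = 'TX37') OR
        (t3.Code = 'TX38' AND t4.Code = 'TX39')
      )
GROUP BY o.OrderId
ORDER BY o.OrderId
"""

def build_db(sample):
    """Create Orders/Transactions in memory and load the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Orders (OrderId INTEGER PRIMARY KEY, PurchaseDate TEXT)")
    conn.execute("CREATE TABLE Transactions (OrderId INTEGER, Code TEXT)")
    for order_id, codes in sample.items():
        conn.execute("INSERT INTO Orders VALUES (?, '2020-01-01')", (order_id,))
        conn.executemany(
            "INSERT INTO Transactions VALUES (?, ?)",
            [(order_id, code) for code in codes],
        )
    return conn

def rows_per_order(conn):
    """Return (OrderId, number of joined rows) for every qualifying order."""
    return conn.execute(FOUR_WAY_JOIN).fetchall()

if __name__ == "__main__":
    print(rows_per_order(build_db(SAMPLE)))
```

On this sample, the qualifying orders are 4, 5 and 6, but orders 4 and 6 are matched by more than one row, because in the TX37 branch t4 is unconstrained and pairs with every transaction on the order. If you only want each order once, SELECT DISTINCT (or one of the EXISTS / GROUP BY formulations) is probably the safer choice.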
You can have a complex condition in an ON clause. Using a LEFT OUTER JOIN allows you to handle the odd case (TX37) in the WHERE clause.
Note that references to R in the WHERE clause must handle NULLs to avoid converting the outer join to an inner join.
select L.*
from dbo.Orders as L left outer join
dbo.Orders as R on R.OrderId = L.OrderId and (
( L.Code = 'TX33' and R.Code = 'TX34' ) or
( L.Code = 'TX35' and R.Code = 'TX36' ) or
( L.Code = 'TX38' and R.Code = 'TX39' ) )
where L.PurchaseDate is not NULL and ( L.Code = 'TX37' or R.Code is not NULL )
If you really want only orders that contain TX33, TX34 and one or more of the other patterns then it is a little more complicated. Using group by L.OrderId with a count( L.OrderId ) lets you find orders that have, say, two or more matches among the patterns. It begins to approach something like this:
declare @Orders as Table ( Id Int Identity, OrderId Int, Code VarChar(4), PurchaseDate Date )
insert into @Orders ( OrderId, Code, PurchaseDate ) values
( 1, 'TX37', GetDate() ),
( 2, 'TX37', GetDate() ), ( 2, 'FOO', GetDate() ),
( 3, 'TX33', GetDate() ), ( 3, 'TX34', GetDate() ),
( 4, 'TX33', GetDate() ), ( 4, 'TX34', GetDate() ), ( 4, 'TX37', GetDate() ),
( 5, 'TX33', GetDate() ), ( 5, 'TX34', GetDate() ), ( 5, 'TX35', GetDate() ),
( 5, 'TX36', GetDate() ),
( 6, 'TX33', GetDate() ), ( 6, 'TX34', GetDate() ), ( 6, 'TX35', GetDate() ),
( 6, 'TX36', GetDate() ), ( 6, 'TX37', GetDate() ),
( 7, 'TX38', GetDate() ), ( 7, 'TX39', GetDate() ), ( 7, 'TX35', GetDate() ),
( 7, 'TX36', GetDate() ), ( 7, 'TX37', GetDate() )
select * from (
select L.OrderId,
Max( case when L.Code = 'TX33' and R.Code = 'TX34' then 1 else 0 end ) as Mandatory,
Count( L.OrderId ) as Matches
from @Orders as L left outer join
@Orders as R on R.OrderId = L.OrderId and (
( L.Code = 'TX33' and R.Code = 'TX34' ) or
( L.Code = 'TX35' and R.Code = 'TX36' ) or
( L.Code = 'TX38' and R.Code = 'TX39' ) )
where L.PurchaseDate is not NULL and ( L.Code = 'TX37' or R.Code is not NULL )
group by L.OrderId ) as Arnold
where Mandatory = 1 and Matches > 1
SELECT o.*
FROM dbo.Orders o
WHERE EXISTS ( SELECT * FROM dbo.Transactions t1
WHERE t1.OrderId = o.OrderId AND t1.Code = 'TX33'
)
AND EXISTS ( SELECT * FROM dbo.Transactions t2
WHERE t2.OrderId = o.OrderId AND t2.Code = 'TX34'
)
AND
( EXISTS ( SELECT * FROM dbo.Transactions t1
WHERE t1.OrderId = o.OrderId AND t1.Code = 'TX35'
)
AND EXISTS ( SELECT * FROM dbo.Transactions t2
WHERE t2.OrderId = o.OrderId AND t2.Code = 'TX36'
)
OR EXISTS ( SELECT * FROM dbo.Transactions t
WHERE t.OrderId = o.OrderId AND t.Code = 'TX37'
)
OR EXISTS ( SELECT * FROM dbo.Transactions t1
WHERE t1.OrderId = o.OrderId AND t1.Code = 'TX38'
)
AND EXISTS ( SELECT * FROM dbo.Transactions t2
WHERE t2.OrderId = o.OrderId AND t2.Code = 'TX39'
)
) ;
You could also write it like this:
SELECT o.*
FROM dbo.Orders o
JOIN
( SELECT OrderId
FROM dbo.Transactions
WHERE Code IN ('TX33', 'TX34', 'TX35', 'TX36', 'TX37', 'TX38', 'TX39')
GROUP BY OrderId
HAVING COUNT(DISTINCT CASE WHEN Code = 'TX33' THEN Code END) = 1
AND COUNT(DISTINCT CASE WHEN Code = 'TX34' THEN Code END) = 1
AND ( COUNT(DISTINCT
CASE WHEN Code IN ('TX35', 'TX36') THEN Code END) = 2
OR COUNT(DISTINCT CASE WHEN Code = 'TX37' THEN Code END) = 1
OR COUNT(DISTINCT
CASE WHEN Code IN ('TX38', 'TX39') THEN Code END) = 2
)
) t
ON t.OrderId = o.OrderId ;
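If you want to sanity-check the GROUP BY / HAVING version before running it on real data, a small harness like the one below may help. It is only a sketch: it uses an in-memory SQLite database from Python as a stand-in for SQL Server (so no dbo schema), the sample rows are invented, the helper names are hypothetical, and the PurchaseDate filter is added here because the question asks for it.

```python
import sqlite3

# Hypothetical sample data: order id -> transaction codes on that order.
SAMPLE = {
    1: ["TX37"],
    2: ["TX37", "FOO"],
    3: ["TX33", "TX34"],
    4: ["TX33", "TX34", "TX37"],
    5: ["TX33", "TX34", "TX35", "TX36"],
    6: ["TX33", "TX34", "TX35", "TX36", "TX37"],
    7: ["TX38", "TX39", "TX35", "TX36", "TX37"],
}

HAVING_QUERY = """
SELECT o.OrderId
FROM Orders o
JOIN (
    SELECT OrderId
    FROM Transactions
    WHERE Code IN ('TX33', 'TX34', 'TX35', 'TX36', 'TX37', 'TX38', 'TX39')
    GROUP BY OrderId
    HAVING COUNT(DISTINCT CASE WHEN Code = 'TX33' THEN Code END) = 1
       AND COUNT(DISTINCT CASE WHEN Code = 'TX34' THEN Code END) = 1
       AND (   COUNT(DISTINCT CASE WHEN Code IN ('TX35', 'TX36') THEN Code END) = 2
            OR COUNT(DISTINCT CASE WHEN Code = 'TX37' THEN Code END) = 1
            OR COUNT(DISTINCT CASE WHEN Code IN ('TX38', 'TX39') THEN Code END) = 2 )
) t ON t.OrderId = o.OrderId
WHERE o.PurchaseDate IS NOT NULL   -- added here; the question requires a purchase date
ORDER BY o.OrderId
"""

def build_db(sample):
    """Create Orders/Transactions in memory and load the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Orders (OrderId INTEGER PRIMARY KEY, PurchaseDate TEXT)")
    conn.execute("CREATE TABLE Transactions (OrderId INTEGER, Code TEXT)")
    for order_id, codes in sample.items():
        conn.execute("INSERT INTO Orders VALUES (?, '2020-01-01')", (order_id,))
        conn.executemany(
            "INSERT INTO Transactions VALUES (?, ?)",
            [(order_id, code) for code in codes],
        )
    return conn

def matching_orders(conn):
    """Return the ids of orders that satisfy the HAVING conditions."""
    return [row[0] for row in conn.execute(HAVING_QUERY)]

if __name__ == "__main__":
    print(matching_orders(build_db(SAMPLE)))
```

With this sample, only orders 4, 5 and 6 come back: order 3 has the mandatory pair but none of the optional patterns, and order 7 has optional patterns but not TX33 and TX34. Because the subquery groups by OrderId, each order appears at most once.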
After fiddling with it for a while, I think I have achieved your goal with the following query:
SELECT * FROM (
SELECT o.*, t3.Code as t3 FROM dbo.Orders o
INNER JOIN dbo.Transactions t1 ON t1.OrderId = o.OrderId
AND t1.Code = 'TX33'
INNER JOIN dbo.Transactions t2 ON t2.OrderId = o.OrderId
AND t2.Code = 'TX34'
INNER JOIN dbo.Transactions t3 ON t3.OrderId = o.OrderId
AND (
t3.Code = 'TX35' OR
t3.Code = 'TX37' OR
t3.Code = 'TX38'
)
) AS x
WHERE x.t3 = 'TX37'
OR (x.t3 = 'TX35' AND EXISTS (SELECT t.Code FROM dbo.Transactions t WHERE t.OrderId = x.OrderId AND t.Code = 'TX36'))
OR (x.t3 = 'TX38' AND EXISTS (SELECT t.Code FROM dbo.Transactions t WHERE t.OrderId = x.OrderId AND t.Code = 'TX39'))
The inner SELECT should return only Orders linked to Transactions with codes TX33, TX34, and one of TX35, TX37 or TX38. We keep a copy of this last code in the results.
Then we have to further narrow down the list, by keeping Orders whose 3rd code was either TX37 (no further conditions needed) or orders which have the remaining code associated to them.
I think this approach should perform better than joining in the Transactions table four times without filtering it first: it should require O*(T+T+T)*(T+T) = 6*O*T^2 iterations, whereas the four-unfiltered-joins approach would require O*T*T*T*T = O*T^4 iterations.