
SQL statement to select group containing all of a set of values

In SQL Server 2005, I have an order details table with an order id and a product id. I want to write a SQL statement that finds all orders that have all the items within a particular order. So, if order 5 has items 1, 2, and 3, I would want all other orders that also have 1, 2, and 3. Also, if order 5 had 2 twice and 3 once, I'd want all other orders with two 2s and a 3.

My preference is that it return orders that match exactly, but orders that are a superset are acceptable if that's much easier / performs much better.

I tried a self-join like the following, but that found orders with any of the items rather than all of the items.

SELECT * FROM [Order] O1
JOIN [Order] O2 ON (O1.ProductId = O2.ProductId)
WHERE O2.OrderId = 5

This also gave me duplicates if order 5 contained the same item twice.

If the OrderDetails table contains a unique constraint on OrderId and ProductId, then you can do something like this:

Select ...
From Orders As O
Where Exists    (
                Select 1
                From OrderDetails As OD1
                Where OD1.ProductId In(1,2,3)
                    And OD1.OrderId = O.Id
                Group By OD1.OrderId
                Having Count(*) = 3
                )

If it is possible to have the same ProductId on the same Order multiple times, then you could change the Having clause to Count(Distinct OD1.ProductId) = 3, so that repeated rows for the same product are counted only once.
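As a minimal sketch of that variant, the script below runs the query against hypothetical sample data. The table variables @Orders and @OrderDetails and their rows are assumptions made for illustration; they stand in for the real Orders and OrderDetails tables. The inserts use Union All because SQL Server 2005 does not accept multi-row Values lists.

```sql
-- Hypothetical sample data; table variables stand in for Orders and OrderDetails.
Declare @Orders Table (Id int);
Declare @OrderDetails Table (OrderId int, ProductId int);

Insert Into @Orders (Id)
Select 5 Union All Select 6 Union All Select 7;

Insert Into @OrderDetails (OrderId, ProductId)
Select 5, 1 Union All Select 5, 2 Union All Select 5, 2 Union All Select 5, 3  -- product 2 twice
Union All Select 6, 1 Union All Select 6, 2                                   -- product 3 missing
Union All Select 7, 1 Union All Select 7, 2 Union All Select 7, 3 Union All Select 7, 4;  -- superset

-- Orders that have at least one row for each of products 1, 2 and 3.
Select O.Id
From @Orders As O
Where Exists    (
                Select 1
                From @OrderDetails As OD1
                Where OD1.ProductId In (1,2,3)
                    And OD1.OrderId = O.Id
                Group By OD1.OrderId
                Having Count(Distinct OD1.ProductId) = 3
                );
-- Orders 5 and 7 qualify. With Count(*) = 3, order 5 would be missed,
-- because it has four matching rows.
```

Note that order 7 is returned even though it also contains product 4: only rows for products 1, 2 and 3 are counted, so this form finds supersets rather than exact matches.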

Now, given the above, if you want the situation where each order has the same signature, including duplicate product entries, that is trickier. To do that you would need the signature of the order in question over its products, and then query for other orders with that same signature:

With OrderSignatures As
    (
    Select O1.Id
        ,   (
            Select '|' + Cast(OD1.ProductId As varchar(10))
            From OrderDetails As OD1
            Where OD1.OrderId = O1.Id
            Order By OD1.ProductId
            For Xml Path('')
            ) As Signature
    From Orders As O1
    )
Select ...
From OrderSignatures As O
    Join OrderSignatures As O2
        On O2.Signature = O.Signature
            And O2.Id <> O.Id
Where O.Id = 5
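To see what the signatures look like, here is a small sketch that builds them for hypothetical sample data. The table variable @OrderDetails and its rows are assumptions for illustration only, and the list of orders is derived from the detail rows rather than from an Orders table.

```sql
-- Hypothetical sample data: orders 5 and 7 both hold products 1, 2, 2, 3;
-- order 6 holds 1, 2, 3 (only one 2).
Declare @OrderDetails Table (OrderId int, ProductId int);

Insert Into @OrderDetails (OrderId, ProductId)
Select 5, 1 Union All Select 5, 2 Union All Select 5, 2 Union All Select 5, 3
Union All Select 6, 1 Union All Select 6, 2 Union All Select 6, 3
Union All Select 7, 3 Union All Select 7, 2 Union All Select 7, 1 Union All Select 7, 2;

-- One signature per order: its product ids, sorted and concatenated.
Select O.OrderId
    ,   (
        Select '|' + Cast(OD.ProductId As varchar(10))
        From @OrderDetails As OD
        Where OD.OrderId = O.OrderId
        Order By OD.ProductId
        For Xml Path('')
        ) As Signature
From (Select Distinct OrderId From @OrderDetails) As O
Order By O.OrderId;
-- Orders 5 and 7 both get the signature |1|2|2|3; order 6 gets |1|2|3.
```

Because every detail row contributes one entry and the entries are sorted, two orders share a signature only when they contain the same products the same number of times, regardless of the order in which the rows were inserted.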

This sort of thing is very difficult to do in SQL, because SQL is designed to build its result set by, at the most basic level, comparing column values on a single row to other values. What you're trying to do is compare a column value (or set of column values) across multiple rows to another set of multiple rows.

In order to do this, you'll have to create some kind of order signature. SQL Server 2005 has no built-in string aggregation function, so one way to build the signature is with procedural T-SQL:

declare @Orders table 
(
    idx int identity(1, 1), 
    OrderID int, 
    Signature varchar(MAX)
)
declare @Items table 
(
    idx int identity(1, 1), 
    ItemID int, 
    Quantity int
)

insert into @Orders (OrderID) select OrderID from [Order]

declare @i int
declare @cnt int

declare @j int
declare @cnt2 int

select @i = 0, @cnt = max(idx) from @Orders

while @i < @cnt
begin
    select @i = @i + 1

    declare @temp varchar(MAX)

    delete @Items

    insert into @Items (ItemID, Quantity)
    select 
        ItemID, 
        Count(ItemID) 

    from OrderItem oi    

    join @Orders o on o.idx = @i and o.OrderID = oi.OrderID

    group by oi.ItemID

    order by oi.ItemID

    select @j = min(idx) - 1, @cnt2 = max(idx) from @Items

    while @j < @cnt2
    begin
        select @j = @j + 1

        select @temp = isnull(@temp + ', ','') + 
            '(' + 
            convert(varchar,i.ItemID) + 
            ',' + 
            convert(varchar, i.Quantity) + 
            ')'
        from @Items i where idx = @j
    end

    update @Orders set Signature = @temp where idx = @i

    select @temp = null
end

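-- @OrderID is the order to match against (5 is the example from the question)
declare @OrderID int
set @OrderID = 5

select 
    o_other.OrderID 

from @Orders o

join @Orders o_other on 
        o_other.Signature = o.Signature
    and o_other.OrderID <> o.OrderID

where o.OrderID = @OrderID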

This assumes (based on the wording of your question) that ordering several of the same item in one order results in multiple rows, rather than a single row with a Quantity column. If the latter is the case, just remove the group by from the @Items population query and replace Count(ItemID) with Quantity.
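For comparison, the same (ItemID, quantity) matching can also be sketched as a single set-based query. This is an alternative technique, not the loop-based script above: it counts rows per order and item in a CTE, joins the reference order's counts to every other order's counts, and keeps the orders where every pair matches on both sides. The table variable @OrderItem and its rows are hypothetical sample data.

```sql
-- Hypothetical sample data, one row per item ordered (no Quantity column).
Declare @OrderItem Table (OrderID int, ItemID int);

Insert Into @OrderItem (OrderID, ItemID)
Select 5, 1 Union All Select 5, 2 Union All Select 5, 2 Union All Select 5, 3
Union All Select 6, 1 Union All Select 6, 2 Union All Select 6, 3
Union All Select 7, 3 Union All Select 7, 2 Union All Select 7, 1 Union All Select 7, 2;

Declare @OrderID int;
Set @OrderID = 5;

With ItemCounts As
    (
    -- One row per order and item, with how often the item occurs.
    Select OrderID, ItemID, Count(*) As Quantity
    From @OrderItem
    Group By OrderID, ItemID
    )
Select o.OrderID
From ItemCounts As r
    Join ItemCounts As o
        On o.ItemID = r.ItemID
            And o.Quantity = r.Quantity
            And o.OrderID <> r.OrderID
Where r.OrderID = @OrderID
Group By o.OrderID
-- every (item, quantity) pair of the reference order was matched ...
Having Count(*) = (Select Count(*) From ItemCounts Where OrderID = @OrderID)
-- ... and the other order has no further items
    And Count(*) = (Select Count(*) From ItemCounts As c Where c.OrderID = o.OrderID);
-- With this data only order 7 is returned; order 6 has a single 2.
```

If the table already stores a Quantity column, the CTE could select that column directly instead of grouping and counting.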

I think this should work. I'm using 108 as an example OrderID, so you'll have to replace that twice below or use a variable.

WITH TempProducts(ProductID) AS
(
   SELECT DISTINCT ProductID FROM CompMarket
   WHERE OrderID = 108
)
SELECT OrderID  FROM CompMarket 
WHERE ProductID IN (SELECT ProductID FROM TempProducts) 
AND OrderID != 108
GROUP BY OrderID
HAVING COUNT(DISTINCT ProductID) >= (SELECT COUNT(ProductID) FROM TempProducts)

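This uses a CTE to get the distinct list of an order's products, then selects the rows of all other orders whose product is in that list. To make sure each returned order has all of those products, it compares the number of distinct matching products per order with the number of products in the CTE. Orders that also contain extra products are still returned, and repeated products are counted only once.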
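If only orders with exactly the same set of products are wanted, one possible tightening is sketched below. It is an assumption-laden variant rather than part of the answer above: it left-joins each order's rows to the CTE and requires both that every CTE product is matched and that the order has no other distinct products. Duplicate quantities are still ignored. The table variable @CompMarket and its rows are hypothetical sample data.

```sql
-- Hypothetical sample data; @CompMarket stands in for the CompMarket table.
DECLARE @CompMarket TABLE (OrderID int, ProductID int);

INSERT INTO @CompMarket (OrderID, ProductID)
SELECT 108, 1 UNION ALL SELECT 108, 2 UNION ALL SELECT 108, 3
UNION ALL SELECT 109, 1 UNION ALL SELECT 109, 2 UNION ALL SELECT 109, 3   -- same products
UNION ALL SELECT 110, 1 UNION ALL SELECT 110, 2 UNION ALL SELECT 110, 3
UNION ALL SELECT 110, 4                                                   -- superset
UNION ALL SELECT 111, 1 UNION ALL SELECT 111, 2;                          -- subset

WITH TempProducts(ProductID) AS
(
   SELECT DISTINCT ProductID FROM @CompMarket
   WHERE OrderID = 108
)
SELECT cm.OrderID
FROM @CompMarket AS cm
   LEFT JOIN TempProducts AS tp ON tp.ProductID = cm.ProductID
WHERE cm.OrderID != 108
GROUP BY cm.OrderID
-- every product of order 108 is present ...
HAVING COUNT(DISTINCT tp.ProductID) = (SELECT COUNT(*) FROM TempProducts)
-- ... and the order contains no other products
   AND COUNT(DISTINCT cm.ProductID) = (SELECT COUNT(*) FROM TempProducts);
-- With this data only order 109 is returned.
```

The join is used instead of an IN subquery inside the aggregate because SQL Server does not allow a subquery within an aggregate function's argument.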
