SQL - “NOT IN” in WHERE clause using INNER JOIN not working

I need to filter a table based on data in a sub table.

I'll use hypothetical data to make this easier to explain:

  • Master table: Cars
  • Sub table: Attributes (like color, car type, accessories)

Each attribute has an id (idOption) and a selected value (idList).

For example, I need to find all the cars whose color (idOption = 10) is yellow (idList = 45). I can't filter on this directly because the search needs to take the other options into account (types, accessories, and so on).

When I use NOT IN against just one table, it works. But when I join the 2 tables with INNER JOIN, it does not work.

In summary, I need to filter on up to 3 idOption values (each one only when it is not NULL), each with a given value, and the result needs to be reflected in the main table, grouped by product.

Table Cars:

idProduct   Description
-----------------------
1           Product A
2           Product B
3           Product C

Table Attributes:

idRow   idProduct   idOption    idList
---------------------------------------
1       1           10          45
2       2           10          46
3       3           10          47
4       1           11          10
5       2           11          98
6       1           14          56
7       3           16          28
8       2           20          55

This is the stored procedure that I created which is not working:

ALTER PROCEDURE [dbo].[SP_GET_TestSearch]
    (@Param1 BIGINT = NULL,
     @PValue1 BIGINT = NULL,
     @Param2 BIGINT = NULL,
     @PValue2 BIGINT = NULL,
     @Param3 BIGINT = NULL,
     @PValue3 BIGINT = NULL)
AS
    SET NOCOUNT ON;

    SELECT
        Cars.idProduct,
        Cars.[Description]
    FROM 
        Cars
    INNER JOIN 
        Attributes ON Cars.idProduct = Attributes.idProduct
    WHERE
        ((@Param1 IS NULL OR (idOption NOT IN (@Param1)))
        AND
        (@Param2 IS NULL OR (idOption NOT IN (@Param2)))
        AND
        (@Param3 IS NULL OR (idOption NOT IN (@Param3))))
        OR
        (idOption = ISNULL(@Param1, NULL) 
         AND idList = ISNULL(@PValue1, NULL))
        OR
        (idOption = ISNULL(@Param2, NULL) 
         AND idList = ISNULL(@PValue2, NULL))
        OR
        (idOption = ISNULL(@Param3, NULL) 
         AND idList = ISNULL(@PValue3, NULL))
    GROUP BY 
        Cars.idProduct, Cars.[Description]

A few things here.

Firstly, this kind of catch-all procedure is a bit of an anti-pattern for all sorts of reasons; see here for a full explanation: https://sqlinthewild.co.za/index.php/2018/03/13/revisiting-catch-all-queries/

Secondly, you need to be very careful when using NOT IN with a list that can contain NULL values: http://www.sqlbadpractices.com/using-not-in-operator-with-null-values/
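
To make the NOT IN warning concrete, here is a minimal sketch. The table variable @Excluded and its values are made up purely for illustration; they are not part of your schema. It shows that a single NULL in the list stops NOT IN from ever evaluating to TRUE, whereas NOT EXISTS is not affected in the same way.

```sql
-- Illustrative only: @Excluded and its contents are assumptions for this demo.
DECLARE @Excluded TABLE (idOption INT NULL);
INSERT INTO @Excluded (idOption) VALUES (11), (NULL);

-- Returns no rows: 10 <> NULL evaluates to UNKNOWN,
-- so the NOT IN predicate as a whole is never TRUE.
SELECT 'kept by NOT IN' AS Result
WHERE 10 NOT IN (SELECT idOption FROM @Excluded);

-- Returns one row: NOT EXISTS only checks whether a matching row was found.
SELECT 'kept by NOT EXISTS' AS Result
WHERE NOT EXISTS (SELECT 1 FROM @Excluded AS E WHERE E.idOption = 10);
```

In your procedure the @ParamN IS NULL checks guard against this particular case, but it is worth keeping in mind if the list ever comes from a nullable column.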

I've added the DDL for the tables:-

IF OBJECT_ID('Attributes') IS NOT NULL
    DROP TABLE Attributes;

IF OBJECT_ID('Cars') IS NOT NULL
    DROP TABLE Cars;

IF OBJECT_ID('SP_GET_TestSearch') IS NOT NULL
    DROP PROCEDURE SP_GET_TestSearch

CREATE TABLE Cars
(idProduct INT PRIMARY KEY
, Description VARCHAR(20) NOT NULL);

CREATE TABLE Attributes
(idRow INT PRIMARY KEY
, idProduct INT NOT NULL FOREIGN KEY REFERENCES dbo.Cars(idProduct)
, idOption INT NOT NULL
, idList INT NOT NULL);

INSERT INTO dbo.Cars
VALUES
(1, 'Product A')
,(2 , 'Product B')
,(3, 'Product C');

INSERT INTO dbo.Attributes
(
    idRow,
    idProduct,
    idOption,
    idList
)
VALUES (1,1,10,45)
,(2,2,10,46)
,(3,3,10,47)
,(4,1,11,10)
,(5,2,11,98)
,(6,1,14,56)
,(7,3,16,28)
,(8,2,20,55);
GO

The issue with your query is that the first part of the block always evaluates to TRUE for any idOption that you don't specify:-

((@Param1 IS NULL OR (idOption NOT IN (@Param1)))
AND
(@Param2 IS NULL OR (idOption NOT IN (@Param2)))
AND
(@Param3 IS NULL OR (idOption NOT IN (@Param3))))

To explain; if I pass in the following:-

DECLARE @Param1 BIGINT
, @Param2 BIGINT
, @Param3 BIGINT
, @PValue1 BIGINT
, @PValue2 BIGINT
, @PValue3 BIGINT;

SET @Param1 = 11;
SET @PValue1 = 42;
SET @Param2 = 11;
SET @PValue2 = 10;
SET @Param3 = 14;
SET @PValue3 = 56;

EXEC dbo.SP_GET_TestSearch @Param1, @PValue1, @Param2, @PValue2, @Param3, @PValue3;

Then you effectively have WHERE idOption NOT IN (11,14) as the evaluation for the first part of the clause, so all other rows are returned.
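
You can see this by running just that part of the predicate against the sample Attributes table created above. This is only a diagnostic sketch with the parameter values hard-coded:

```sql
-- First part of the WHERE clause in isolation, with @Param1..3 = 11, 11, 14.
SELECT A.idRow, A.idProduct, A.idOption, A.idList
FROM dbo.Attributes AS A
WHERE A.idOption NOT IN (11, 14)
ORDER BY A.idRow;
-- With the sample data this returns idRow 1, 2, 3, 7 and 8.
```

Those rows belong to products 1, 2 and 3, so after the join and GROUP BY every car comes back, whatever values you passed in @PValue1 to @PValue3.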

I suspect you really want the WHERE clause to be:-

WHERE
    (@Param1 IS NULL AND @Param2 IS NULL AND @Param3 IS NULL)
    OR
    (idOption = @Param1
     AND idList = @PValue1)
    OR
    (idOption = @Param2
     AND idList = @PValue2)
    OR
    (idOption = @Param3
     AND idList = @PValue3)
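
Put together with the rest of your procedure, that WHERE clause might look like the sketch below. It assumes the sample tables from the DDL above and keeps your procedure and parameter names; the table aliases C and A are mine. Note that it has OR semantics: a car is returned if it matches any one of the supplied option/value pairs.

```sql
-- Sketch only: run in its own batch, after the DDL above has dropped the old procedure.
CREATE PROCEDURE dbo.SP_GET_TestSearch
    (@Param1 BIGINT = NULL,
     @PValue1 BIGINT = NULL,
     @Param2 BIGINT = NULL,
     @PValue2 BIGINT = NULL,
     @Param3 BIGINT = NULL,
     @PValue3 BIGINT = NULL)
AS
BEGIN
    SET NOCOUNT ON;

    SELECT C.idProduct, C.[Description]
    FROM dbo.Cars AS C
    INNER JOIN dbo.Attributes AS A ON C.idProduct = A.idProduct
    WHERE
        -- No filters supplied: return every car that has attributes.
        (@Param1 IS NULL AND @Param2 IS NULL AND @Param3 IS NULL)
        -- A NULL parameter makes its comparison UNKNOWN, so that branch is skipped.
        OR (A.idOption = @Param1 AND A.idList = @PValue1)
        OR (A.idOption = @Param2 AND A.idList = @PValue2)
        OR (A.idOption = @Param3 AND A.idList = @PValue3)
    GROUP BY C.idProduct, C.[Description];
END;
GO

-- Cars whose color (idOption = 10) is yellow (idList = 45).
-- With the sample data this returns only Product A.
EXEC dbo.SP_GET_TestSearch @Param1 = 10, @PValue1 = 45;
```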

The following code demonstrates how to implement the logic of excluding vehicles from query results if they have any "bad" property values. The rejection is handled by ... where not exists ... which is used to check each car against the "bad" property values.

Rather than using an assortment of (hopefully) paired parameters to pass the undesirable properties, the values are passed in a table. The stored procedure to implement this ought to use a table-valued parameter (TVP) to pass the table.

-- Sample data.
declare @Cars as Table ( CarId Int Identity, Description VarChar(16) );
insert into @Cars ( Description ) values
  ( 'Esplanade' ), ( 'Tankigator' ), ( 'Land Yacht' );
select * from @Cars;

declare @Properties as Table ( PropertyId Int Identity, Description VarChar(16) );
insert into @Properties ( Description ) values
  ( 'Turbochargers' ), ( 'Superchargers' ), ( 'Hyperchargers' ), ( 'Color' ), ( 'Spare Tires' );
select * from @Properties;

declare @CarProperties as Table ( CarId Int, PropertyId Int, PropertyValue Int );
insert into @CarProperties ( CarId, PropertyId, PropertyValue ) values
  ( 1, 1, 1 ), ( 1, 4, 24 ), ( 1, 4, 42 ), -- Two tone!
  ( 2, 2, 1 ), ( 2, 4, 7 ),
  ( 3, 1, 2 ), ( 3, 4, 0 ), ( 3, 5, 6 );
select C.CarId, C.Description as CarDescription,
  P.PropertyId, P.Description as PropertyDescription,
  CP.PropertyValue
  from @Cars as C inner join
    @CarProperties as CP on CP.CarId = C.CarId inner join
    @Properties as P on P.PropertyId = CP.PropertyId
  order by C.CarId, P.PropertyId;

-- Test data: Avoid vehicles that have _any_ of these property values.
--   This should be passed to the stored procedure as a table-value parameter (TVP).
declare @BadProperties as Table ( PropertyId Int, PropertyValue Int );
insert into @BadProperties ( PropertyId, PropertyValue ) values
  ( 2, 1 ), ( 2, 2 ), ( 2, 4 ),
  ( 4, 62 ), ( 4, 666 );
select BP.PropertyId, BP.PropertyValue, P.Description
  from @BadProperties as BP inner join
    @Properties as P on P.PropertyId = BP.PropertyId;

-- Query the data.
select C.CarId, C.Description as CarDescription
  from @Cars as C
  where not exists ( 
    select 42
      from @CarProperties as CP inner join
        @BadProperties as BP on BP.PropertyId = CP.PropertyId and BP.PropertyValue = CP.PropertyValue
      where CP.CarId = C.CarId )
  order by C.CarId;
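
The table-valued parameter suggested above could be sketched roughly as follows. To keep it runnable it uses the question's dbo.Cars and dbo.Attributes tables rather than the table variables in this answer. The type name dbo.BadAttributeList and the procedure name dbo.GetCarsWithoutBadAttributes are hypothetical, chosen only for this example.

```sql
-- Hypothetical table type that carries the unwanted option/value pairs.
CREATE TYPE dbo.BadAttributeList AS TABLE
    (idOption INT NOT NULL,
     idList   INT NOT NULL,
     PRIMARY KEY (idOption, idList));
GO

-- Hypothetical procedure; table-valued parameters must be declared READONLY.
CREATE PROCEDURE dbo.GetCarsWithoutBadAttributes
    @BadAttributes dbo.BadAttributeList READONLY
AS
BEGIN
    SET NOCOUNT ON;

    SELECT C.idProduct, C.[Description]
    FROM dbo.Cars AS C
    WHERE NOT EXISTS (
        SELECT 1
        FROM dbo.Attributes AS A
        INNER JOIN @BadAttributes AS B
            ON B.idOption = A.idOption AND B.idList = A.idList
        WHERE A.idProduct = C.idProduct)
    ORDER BY C.idProduct;
END;
GO

-- Usage: reject cars with color 45 or with option 11 set to 98.
DECLARE @Bad dbo.BadAttributeList;
INSERT INTO @Bad (idOption, idList) VALUES (10, 45), (11, 98);
EXEC dbo.GetCarsWithoutBadAttributes @BadAttributes = @Bad;
-- With the question's sample data only Product C remains.
```

Passing the pairs as rows removes the need to keep @ParamN and @PValueN in step, and it places no fixed limit on the number of filters.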
