简体   繁体   中英

return records that only match *all* of given IN values

The tables in question:

Persons
-------------------
PersonID (PK)
FirstName
LastName


Descriptors
-------------------
DescriptorID (PK)
Description


PersonDescriptors
-------------------
PersonID (PK)(FK)
DescriptorID (PK)(FK)

Hello,

I have three tables: Persons, Descriptors, and PersonDescriptors. I'm trying to figure out a query that:

  • takes in a list of possible person DescriptorIDs (ex: 22, 11, 9, etc)

  • reads the PersonDescriptors table of actual person-descriptors records (ex row: PersonID: 107, DescriptorID: 22)

  • return the matching PersonIDs only if ALL supplied DescriptorIDs are matched

For example, PersonID 107 has blond hair (DescriptorID: 22) so has a row in PersonDescritors of 107/22. PersonID 107 also has a beard (DescriptorID: 11) so has another row in PersonDescriptors of 107/11. If I give the query these two DescriptorIDs (22, 11), it should return PersonID 107. But if I give the query 22, 11 & 9, it should not return PersonID 107 because there's no 107/9 row in PersonDescriptors.

There are about a million rows in PersonDescriptors.

How would you code this?

One idea was to use WHERE NOT EXISTS statements, but my attempt is too slow (20+ seconds on a local SQL Server 2014 instance):

SELECT DISTINCT pd1.PersonID FROM PersonDescriptors pd1

WHERE NOT EXISTS (

    SELECT 1 FROM PersonDescriptors pd2 WHERE NOT EXISTS (

        SELECT 1 FROM PersonDescriptors pd3 
        WHERE 
                pd3.PersonID = pd1.PersonID
            AND pd3.DescriptorID = pd2.DescriptorID

    ) 
    AND pd2.DescriptorID IN (1, 71) --two descriptors. more even slower
);

...Another was to take in XML of the target DescriptorIDs and load them into a temp table, one row per DescriptorID, and try some JOIN magic, but I couldn't figure it out.

Another idea was to employ HAVING COUNT of the number of IN criteria, but I couldn't pull that off either.

Ideas I'm missing, or help? Thanks!

Use group by and having :

select pd.PersonId
from PersonDescriptors pd
where pd.DescriptorID IN (1, 71)
group by pd.PersonId
having count(distinct pd.DescriptorID) = 2;

You can easily generalize this to lists of any length. Remember to change the constant in the having clause.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM