
SQL to identify duplicates in a tree-like structure

I am looking for a solution to the following problem (MS SQL Server 2008, by the way):

ID  | ParentID | Feature_1 | Feature_2
----+----------+-----------+----------
1   | NULL     | A         | B
2   | 1        | A         | B
3   | 1        | A         | C
4   | 2        | A         | C

Whenever a child (a record with a ParentID) has the same set of features (Feature_1 and Feature_2) as its parent, I want to ignore it, that is, leave it out of the result of my SELECT *.

So the result set should be

ID  | ParentID | Feature_1 | Feature_2
----+----------+-----------+----------
1   | NULL     | A         | B
3   | 1        | A         | C
4   | 2        | A         | C

Note that ID=2 is dropped, but ID=4 is displayed because its set of features differs from that of its parent (ID=2).

Any help would be much appreciated!

SELECT
    Child.ID,
    Child.ParentID,
    Child.Feature_1,
    Child.Feature_2
FROM
    MyTable AS Child
    LEFT OUTER JOIN MyTable AS Parent
        ON Child.ParentID = Parent.ID
WHERE
    Parent.Feature_1 <> Child.Feature_1
    OR Parent.Feature_2 <> Child.Feature_2
    OR Child.ParentID IS NULL
ORDER BY
    Child.ID
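
-- Alternative: keep root rows, plus children for which no parent row
-- with identical features exists. MyTable replaces the placeholder
-- name "table", which is a reserved word and cannot be used unquoted.
SELECT a.*
FROM MyTable AS a
WHERE a.ParentID IS NULL
   OR NOT EXISTS (SELECT 1
                  FROM MyTable AS b
                  WHERE a.ParentID = b.ID
                    AND a.Feature_1 = b.Feature_1
                    AND a.Feature_2 = b.Feature_2)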
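
To try this out without creating a permanent table, here is a minimal sketch that loads the four sample rows from the question into a table variable and runs the NOT EXISTS form against it. The name @MyTable and the column types are assumptions made for illustration; the question does not state the real schema.

```sql
-- Hypothetical stand-in for the real table; column types are assumed.
DECLARE @MyTable TABLE (
    ID        INT PRIMARY KEY,
    ParentID  INT NULL,
    Feature_1 CHAR(1),
    Feature_2 CHAR(1)
);

-- The four sample rows from the question.
INSERT INTO @MyTable (ID, ParentID, Feature_1, Feature_2)
VALUES (1, NULL, 'A', 'B'),
       (2, 1,    'A', 'B'),
       (3, 1,    'A', 'C'),
       (4, 2,    'A', 'C');

-- Keep roots, and children whose features differ from their parent's.
SELECT Child.ID, Child.ParentID, Child.Feature_1, Child.Feature_2
FROM @MyTable AS Child
WHERE Child.ParentID IS NULL
   OR NOT EXISTS (SELECT 1
                  FROM @MyTable AS Parent
                  WHERE Parent.ID = Child.ParentID
                    AND Parent.Feature_1 = Child.Feature_1
                    AND Parent.Feature_2 = Child.Feature_2)
ORDER BY Child.ID;
-- Returns the rows with ID 1, 3 and 4; ID 2 is filtered out.
```

On this sample data the LEFT OUTER JOIN query and the NOT EXISTS query return the same rows. They can differ if a feature column contains NULL or if a ParentID points at a row that does not exist: a `<>` comparison involving NULL evaluates to unknown, so the join version drops such rows, while the NOT EXISTS version keeps them.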
