
Remove duplicates after UNION in SQL

I have two tables (T1 and T2).

- First, I select V1, V2, V3, and V4 from T1 and remove duplicates based on the V1 and V2 columns using the row_number() function.

- Second, I select V1, V2, V3, and V4 from T2 and remove duplicates based on the V1 and V2 columns using the row_number() function.

- Third, I use UNION to stack these two results.

(WITH cte1 AS (
    SELECT V1, V2, V3, V4,
           ROW_NUMBER() OVER (PARTITION BY V1, V2 ORDER BY V1) AS rn
    FROM T1)
 SELECT V1, V2, V3, V4
 FROM cte1 WHERE rn = 1)
UNION
(WITH cte2 AS (
    SELECT V1, V2, V3, V4,
           ROW_NUMBER() OVER (PARTITION BY V1, V2 ORDER BY V1) AS rn
    FROM T2)
 SELECT V1, V2, V3, V4
 FROM cte2 WHERE rn = 1)

Now my question is: how can I remove duplicates from the final stacked table above using columns V1, V2, and V3?

NOTE: If there are duplicates in the final stacked table, I need to remove the records where V4 is NULL. However, if no duplicates exist in the final stacked table, I still need to keep the records where V4 is NULL.

You can use the same process of removing duplicates that you already used for both tables. It would look something like this:

WITH cteUnion AS
(   -- number the rows within each (V1, V2, V3) group of the stacked result
    SELECT u.*, ROW_NUMBER() OVER (PARTITION BY V1, V2, V3 ORDER BY V1) AS rn
    FROM (
        (WITH cte1 AS (
            SELECT V1, V2, V3, V4,
                   ROW_NUMBER() OVER (PARTITION BY V1, V2 ORDER BY V1) AS rn
            FROM T1)
         SELECT V1, V2, V3, V4
         FROM cte1 WHERE rn = 1)
        UNION
        (WITH cte2 AS (
            SELECT V1, V2, V3, V4,
                   ROW_NUMBER() OVER (PARTITION BY V1, V2 ORDER BY V1) AS rn
            FROM T2)
         SELECT V1, V2, V3, V4
         FROM cte2 WHERE rn = 1)
    ) AS u  -- UNION is a reserved word, so it cannot be used as the alias
)
-- keep one row per (V1, V2, V3) group
SELECT V1, V2, V3, V4
FROM cteUnion
WHERE rn = 1
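
The ORDER BY V1 inside a partition that already contains V1 does not decide which row receives rn = 1, so the query above does not by itself guarantee the NOTE in the question (drop the V4 NULL row only when a duplicate exists). A minimal sketch of one way to get that behaviour is to sort the non-NULL V4 values first within each group. The sketch runs the SQL through Python's sqlite3 module purely so it can be executed as-is; the sample rows are hypothetical, the SQL dialect of the original question is not stated, and the CTEs are moved to the top level because SQLite does not accept parenthesized SELECTs around UNION.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (V1, V2, V3, V4);
    CREATE TABLE T2 (V1, V2, V3, V4);
    -- hypothetical sample rows, not taken from the question
    INSERT INTO T1 VALUES (1, 'a', 'x', NULL), (2, 'b', 'y', NULL);
    INSERT INTO T2 VALUES (1, 'a', 'x', 10), (3, 'c', 'z', 30);
""")

query = """
WITH cte1 AS (
    SELECT V1, V2, V3, V4,
           ROW_NUMBER() OVER (PARTITION BY V1, V2 ORDER BY V1) AS rn
    FROM T1
),
cte2 AS (
    SELECT V1, V2, V3, V4,
           ROW_NUMBER() OVER (PARTITION BY V1, V2 ORDER BY V1) AS rn
    FROM T2
),
stacked AS (
    SELECT V1, V2, V3, V4 FROM cte1 WHERE rn = 1
    UNION
    SELECT V1, V2, V3, V4 FROM cte2 WHERE rn = 1
),
ranked AS (
    SELECT V1, V2, V3, V4,
           ROW_NUMBER() OVER (
               PARTITION BY V1, V2, V3
               -- rows with a non-NULL V4 sort first, so a NULL row only
               -- gets rn = 1 when it is the only row in its group
               ORDER BY CASE WHEN V4 IS NULL THEN 1 ELSE 0 END
           ) AS rn
    FROM stacked
)
SELECT V1, V2, V3, V4
FROM ranked
WHERE rn = 1
ORDER BY V1
"""

rows = conn.execute(query).fetchall()
print(rows)
# [(1, 'a', 'x', 10), (2, 'b', 'y', None), (3, 'c', 'z', 30)]
```

The group (1, 'a', 'x') appears in both tables, so its NULL row is dropped in favour of the row with V4 = 10, while (2, 'b', 'y') has no duplicate and keeps its NULL. If a group holds several different non-NULL V4 values, this sketch still keeps only one of them, chosen arbitrarily.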

Or you can use DISTINCT if you only need the columns V1, V2, and V3:

SELECT DISTINCT V1, V2, V3
FROM (
    (WITH cte1 AS (
        SELECT V1, V2, V3, V4,
               ROW_NUMBER() OVER (PARTITION BY V1, V2 ORDER BY V1) AS rn
        FROM T1)
     SELECT V1, V2, V3, V4
     FROM cte1 WHERE rn = 1)
    UNION
    (WITH cte2 AS (
        SELECT V1, V2, V3, V4,
               ROW_NUMBER() OVER (PARTITION BY V1, V2 ORDER BY V1) AS rn
        FROM T2)
     SELECT V1, V2, V3, V4
     FROM cte2 WHERE rn = 1)
) AS u  -- UNION is a reserved word, so it cannot be used as the alias

UNION removes duplicate rows from the combined result, much like applying SELECT DISTINCT to it, while UNION ALL simply returns every row from both queries.
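
To make the difference concrete, here is a small sketch that runs both operators on hypothetical two-column tables through Python's sqlite3 module (SQLite is an assumption made only so the example is runnable). UNION compares entire rows, which is why rows in the question that agree on V1, V2, and V3 but differ in V4 survive it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (V1, V2);
    CREATE TABLE T2 (V1, V2);
    -- hypothetical sample rows: (1, 'a') repeats inside T1,
    -- and (2, 'b') appears in both tables
    INSERT INTO T1 VALUES (1, 'a'), (1, 'a'), (2, 'b');
    INSERT INTO T2 VALUES (2, 'b'), (3, 'c');
""")

# UNION: duplicate rows are removed from the combined result
union_rows = conn.execute(
    "SELECT V1, V2 FROM T1 UNION SELECT V1, V2 FROM T2 ORDER BY V1"
).fetchall()

# UNION ALL: every row from both queries is kept
union_all_rows = conn.execute(
    "SELECT V1, V2 FROM T1 UNION ALL SELECT V1, V2 FROM T2 ORDER BY V1"
).fetchall()

# SELECT DISTINCT over UNION ALL gives the same rows as UNION
distinct_rows = conn.execute(
    "SELECT DISTINCT V1, V2 "
    "FROM (SELECT V1, V2 FROM T1 UNION ALL SELECT V1, V2 FROM T2) AS u "
    "ORDER BY V1"
).fetchall()

print(union_rows)      # [(1, 'a'), (2, 'b'), (3, 'c')]
print(len(union_all_rows))  # 5
print(distinct_rows == union_rows)  # True
```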
