Remove duplicates after UNION in SQL
I have two tables (T1 and T2).
- First, I select V1, V2, V3, and V4 from T1 and remove duplicates based on the V1 and V2 columns using the row_number() function.
- Second, I select V1, V2, V3, and V4 from T2 and remove duplicates based on the V1 and V2 columns using the row_number() function.
- Third, I use UNION to stack these two tables.
(WITH cte1 AS (
    SELECT V1, V2, V3, V4,
           ROW_NUMBER() OVER (PARTITION BY V1, V2 ORDER BY V1) AS rn
    FROM T1)
 SELECT V1, V2, V3, V4
 FROM cte1 WHERE rn = 1)
UNION
(WITH cte2 AS (
    SELECT V1, V2, V3, V4,
           ROW_NUMBER() OVER (PARTITION BY V1, V2 ORDER BY V1) AS rn
    FROM T2)
 SELECT V1, V2, V3, V4
 FROM cte2 WHERE rn = 1)
Now my question is: how can I remove duplicates from the final stacked table above using columns V1, V2, and V3?
NOTE: If there are duplicates in the final stacked table, I need to remove the records where V4 is NULL. However, if no duplicates exist in the final stacked table, I still need to keep the records where V4 is NULL.
You can use the same process for removing duplicates that you used for both tables. It would look something like this:
WITH cteUnion AS (
    -- ORDER BY V1 inside a partition on V1 leaves the surviving row arbitrary
    SELECT *, ROW_NUMBER() OVER (PARTITION BY V1, V2, V3 ORDER BY V1) AS rn
    FROM (
        (WITH cte1 AS (
            SELECT V1, V2, V3, V4,
                   ROW_NUMBER() OVER (PARTITION BY V1, V2 ORDER BY V1) AS rn
            FROM T1)
         SELECT V1, V2, V3, V4
         FROM cte1 WHERE rn = 1)
        UNION
        (WITH cte2 AS (
            SELECT V1, V2, V3, V4,
                   ROW_NUMBER() OVER (PARTITION BY V1, V2 ORDER BY V1) AS rn
            FROM T2)
         SELECT V1, V2, V3, V4
         FROM cte2 WHERE rn = 1)
    ) AS stacked  -- UNION is a reserved word, so it cannot be used as the alias
)
SELECT *
FROM cteUnion
WHERE rn = 1
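Ordering each partition by V1 does not by itself decide which duplicate survives, so a row with a NULL V4 could be the one that is kept. As a minimal sketch of one way to meet the NOTE in the question, the outer ROW_NUMBER() can sort rows with a NULL V4 last inside each (V1, V2, V3) group. The sample rows, the CTE names stacked and ranked, and the use of Python's sqlite3 module (SQLite 3.25 or later is assumed for window functions) are illustrative assumptions, not part of the original post; the CTEs are written at the top level, a form SQLite accepts.

```python
import sqlite3

# In-memory database with hypothetical sample data (not from the original post).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (V1 INTEGER, V2 INTEGER, V3 INTEGER, V4 TEXT);
    CREATE TABLE T2 (V1 INTEGER, V2 INTEGER, V3 INTEGER, V4 TEXT);
    INSERT INTO T1 VALUES (1, 1, 1, NULL), (2, 2, 2, NULL);
    INSERT INTO T2 VALUES (1, 1, 1, 'x'),  (3, 3, 3, 'y');
""")

QUERY = """
WITH cte1 AS (
    SELECT V1, V2, V3, V4,
           ROW_NUMBER() OVER (PARTITION BY V1, V2 ORDER BY V1) AS rn
    FROM T1
),
cte2 AS (
    SELECT V1, V2, V3, V4,
           ROW_NUMBER() OVER (PARTITION BY V1, V2 ORDER BY V1) AS rn
    FROM T2
),
stacked AS (
    SELECT V1, V2, V3, V4 FROM cte1 WHERE rn = 1
    UNION
    SELECT V1, V2, V3, V4 FROM cte2 WHERE rn = 1
),
ranked AS (
    SELECT V1, V2, V3, V4,
           ROW_NUMBER() OVER (
               PARTITION BY V1, V2, V3
               -- non-NULL V4 sorts first, so a NULL row survives only when it is alone
               ORDER BY CASE WHEN V4 IS NULL THEN 1 ELSE 0 END
           ) AS rn
    FROM stacked
)
SELECT V1, V2, V3, V4
FROM ranked
WHERE rn = 1
ORDER BY V1, V2, V3
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

With this sample data, the group (1, 1, 1) keeps the row whose V4 is 'x', while (2, 2, 2) keeps its only row even though V4 is NULL. If a group can hold several different non-NULL V4 values, this sketch keeps just one of them, chosen arbitrarily; whether that is acceptable depends on the data.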
Or you can use DISTINCT if you just want columns V1, V2, and V3:
SELECT DISTINCT V1, V2, V3
FROM (
    (WITH cte1 AS (
        SELECT V1, V2, V3, V4,
               ROW_NUMBER() OVER (PARTITION BY V1, V2 ORDER BY V1) AS rn
        FROM T1)
     SELECT V1, V2, V3, V4
     FROM cte1 WHERE rn = 1)
    UNION
    (WITH cte2 AS (
        SELECT V1, V2, V3, V4,
               ROW_NUMBER() OVER (PARTITION BY V1, V2 ORDER BY V1) AS rn
        FROM T2)
     SELECT V1, V2, V3, V4
     FROM cte2 WHERE rn = 1)
) AS stacked  -- UNION is a reserved word, so it cannot be used as the alias
UNION removes duplicate rows from the combined result, much as SELECT DISTINCT does, while UNION ALL simply returns every row from both queries, duplicates included.
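A small sketch of that difference, again using Python's sqlite3 module with two hypothetical single-column tables a and b (these names and values are assumptions chosen only for illustration):

```python
import sqlite3

# Two throwaway tables with overlapping values (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (x INTEGER);
    CREATE TABLE b (x INTEGER);
    INSERT INTO a VALUES (1), (1), (2);
    INSERT INTO b VALUES (2), (3);
""")

# UNION drops duplicate rows, both within and across the two inputs.
union_rows = conn.execute(
    "SELECT x FROM a UNION SELECT x FROM b ORDER BY x"
).fetchall()

# UNION ALL keeps every row from both inputs.
union_all_rows = conn.execute(
    "SELECT x FROM a UNION ALL SELECT x FROM b ORDER BY x"
).fetchall()

print(union_rows)      # [(1,), (2,), (3,)]
print(union_all_rows)  # [(1,), (1,), (2,), (2,), (3,)]
```

Note that UNION compares whole rows, so in the stacked query two rows that agree on V1, V2, and V3 but differ in V4 are both kept.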