
Oracle SQL - Select duplicates based on two columns

I need to select duplicate rows based on two columns in a join, and I can't figure out how to do it.

Currently I have this:

SELECT s.name,administrative_site_id as adm_id,s.external_code,si.identifier_value 
FROM suppliers s
INNER JOIN suppliers_identifier si
ON s.id = si.supplier_id

And the output looks something like this:

| Name       | adm_id      | external_code |identifier_value  |
|:-----------|------------:|:------------: |:----------------:|
| Warlob     |     66323   |    ext531     |    id444         |
| Ozzy       |     53123   |    ext632     |    id333         |
| Motorhead  |     521     |    ext733     |    id222         |
| Perez      |     123     |    ext833     |    id111         |
| Starlight  |     521     |    ext934     |    id222         |
| Aligned    |     123     |    ext235     |    id111         |

What I am looking for is how to select just these 4 rows, since they are duplicates based on the columns adm_id and identifier_value:

| Name       | adm_id      | external_code |identifier_value  |
|:-----------|------------:|:------------: |:----------------:|
| Motorhead  |     521     |    ext733     |    id222         |
| Perez      |     123     |    ext833     |    id111         |
| Starlight  |     521     |    ext934     |    id222         |
| Aligned    |     123     |    ext235     |    id111         |

First group by ADMINISTRATIVE_SITE_ID and IDENTIFIER_VALUE and find the groups that contain more than one row. Then select all rows whose pair of values matches one of those groups:

SELECT S.NAME
      ,ADMINISTRATIVE_SITE_ID AS ADM_ID
      ,S.EXTERNAL_CODE
      ,SI.IDENTIFIER_VALUE
  FROM SUPPLIERS S INNER JOIN SUPPLIERS_IDENTIFIER SI ON S.ID = SI.SUPPLIER_ID
 WHERE (ADMINISTRATIVE_SITE_ID, SI.IDENTIFIER_VALUE) IN (SELECT ADMINISTRATIVE_SITE_ID, SI.IDENTIFIER_VALUE
                                                           FROM SUPPLIERS S INNER JOIN SUPPLIERS_IDENTIFIER SI ON S.ID = SI.SUPPLIER_ID
                                                          -- Group by the underlying columns: most Oracle versions reject a SELECT alias (ADM_ID) in GROUP BY
                                                          GROUP BY ADMINISTRATIVE_SITE_ID, SI.IDENTIFIER_VALUE
                                                         HAVING COUNT(*) > 1)
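
To sanity-check the grouping approach against the sample rows from the question, here is a minimal sketch that runs the same query shape in an in-memory SQLite database from Python. SQLite stands in for Oracle here, and the table layout (in particular, that administrative_site_id lives in suppliers, and the numeric id values) is an assumption, since the question does not show the schema.

```python
import sqlite3

# Assumed schema: the question does not show which table owns administrative_site_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE suppliers (id INTEGER PRIMARY KEY, name TEXT,
                        administrative_site_id INTEGER, external_code TEXT);
CREATE TABLE suppliers_identifier (supplier_id INTEGER, identifier_value TEXT);
INSERT INTO suppliers VALUES
  (1, 'Warlob', 66323, 'ext531'), (2, 'Ozzy', 53123, 'ext632'),
  (3, 'Motorhead', 521, 'ext733'), (4, 'Perez', 123, 'ext833'),
  (5, 'Starlight', 521, 'ext934'), (6, 'Aligned', 123, 'ext235');
INSERT INTO suppliers_identifier VALUES
  (1, 'id444'), (2, 'id333'), (3, 'id222'),
  (4, 'id111'), (5, 'id222'), (6, 'id111');
""")

# Step 1: the (adm_id, identifier_value) pairs that occur more than once.
dup_pairs = conn.execute("""
SELECT s.administrative_site_id, si.identifier_value
  FROM suppliers s JOIN suppliers_identifier si ON s.id = si.supplier_id
 GROUP BY s.administrative_site_id, si.identifier_value
HAVING COUNT(*) > 1
 ORDER BY s.administrative_site_id
""").fetchall()

# Step 2: every joined row whose pair appears in that set.
dup_rows = conn.execute("""
SELECT s.name, s.administrative_site_id AS adm_id, s.external_code, si.identifier_value
  FROM suppliers s JOIN suppliers_identifier si ON s.id = si.supplier_id
 WHERE (s.administrative_site_id, si.identifier_value) IN (
         SELECT s2.administrative_site_id, si2.identifier_value
           FROM suppliers s2 JOIN suppliers_identifier si2 ON s2.id = si2.supplier_id
          GROUP BY s2.administrative_site_id, si2.identifier_value
         HAVING COUNT(*) > 1)
 ORDER BY s.id
""").fetchall()

print(dup_pairs)
for row in dup_rows:
    print(row)
```

The first query isolates the duplicated pairs, and the second uses them as a filter, which mirrors the two steps described above.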

Or an alternative that may perform better on large datasets, since the join is evaluated only once:

with t as (
SELECT s.name,administrative_site_id as adm_id,s.external_code,si.identifier_value,
COUNT(*) OVER (PARTITION BY administrative_site_id, identifier_value) AS cnt
FROM suppliers s
INNER JOIN suppliers_identifier si
ON s.id = si.supplier_id)
select name, adm_id, external_code, identifier_value 
from t
where cnt > 1
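
The window-function variant can be tried the same way. The sketch below again uses an in-memory SQLite database from Python as a stand-in for Oracle (window functions need SQLite 3.25 or later), with an assumed schema in which administrative_site_id belongs to suppliers. The two NoSite rows are hypothetical additions, not part of the question's data; they are there to show one behavioral difference between the two approaches when a key column is NULL.

```python
import sqlite3

# Assumed schema and ids; rows 7 and 8 are hypothetical and have a NULL site id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE suppliers (id INTEGER PRIMARY KEY, name TEXT,
                        administrative_site_id INTEGER, external_code TEXT);
CREATE TABLE suppliers_identifier (supplier_id INTEGER, identifier_value TEXT);
INSERT INTO suppliers VALUES
  (1, 'Warlob', 66323, 'ext531'), (2, 'Ozzy', 53123, 'ext632'),
  (3, 'Motorhead', 521, 'ext733'), (4, 'Perez', 123, 'ext833'),
  (5, 'Starlight', 521, 'ext934'), (6, 'Aligned', 123, 'ext235'),
  (7, 'NoSiteA', NULL, 'ext001'), (8, 'NoSiteB', NULL, 'ext002');
INSERT INTO suppliers_identifier VALUES
  (1, 'id444'), (2, 'id333'), (3, 'id222'), (4, 'id111'),
  (5, 'id222'), (6, 'id111'), (7, 'id999'), (8, 'id999');
""")

# Window version: count the rows in each (site, identifier) partition, keep cnt > 1.
window_names = [r[0] for r in conn.execute("""
WITH t AS (
  SELECT s.id, s.name,
         COUNT(*) OVER (PARTITION BY s.administrative_site_id, si.identifier_value) AS cnt
    FROM suppliers s JOIN suppliers_identifier si ON s.id = si.supplier_id)
SELECT name FROM t WHERE cnt > 1 ORDER BY id
""")]

# IN version for comparison: a pair containing NULL never compares as equal.
in_names = [r[0] for r in conn.execute("""
SELECT s.name
  FROM suppliers s JOIN suppliers_identifier si ON s.id = si.supplier_id
 WHERE (s.administrative_site_id, si.identifier_value) IN (
         SELECT s2.administrative_site_id, si2.identifier_value
           FROM suppliers s2 JOIN suppliers_identifier si2 ON s2.id = si2.supplier_id
          GROUP BY s2.administrative_site_id, si2.identifier_value
         HAVING COUNT(*) > 1)
 ORDER BY s.id
""")]

print(window_names)
print(in_names)
```

On this data the window version also returns the two NULL-site rows, because PARTITION BY puts NULLs in the same partition, while the IN version skips them. If the key columns are declared NOT NULL, the two forms should return the same rows.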
