简体   繁体   中英

Identify duplicates within columns using sql (oracle)

I have the following dataset:

FLD_NB|RGN_CD
1     |NC
2     |SC
1     |MA
3     |GA
3     |MA

I am trying to identify all records which are available in more than 1 RGN_CD , so eg the scenario above, FLD_NB=1 is available in both RGN_CD='NC' and RGN_CD='MA'

What might be the best way to identify the rows which has multiple instances of FLD_NB across RGN_CD ?

You can use group by and having :

select fld_nb
from mytable
group by fld_nb
having count(*) > 1

This gives you all fld_nb s that appear more than once. Or, if you want fld_nb s that have more than one distinct rgn_cd , you can change the having clause to:

having count(distinct rgn_cd) > 1

Probably this is what you need:

select *
from (
     select t.*
           ,count(*)over(partition by FLD_NB) cnt
     from t
     )
where cnt>1;

Full tests case with results:

with t (FLD_NB,RGN_CD) as (
select 1, 'NC' from dual union all
select 2, 'SC' from dual union all
select 1, 'MA' from dual union all
select 3, 'GA' from dual union all
select 3, 'MA' from dual 
)
select *
from (
     select t.*
           ,count(*)over(partition by FLD_NB) cnt
     from t
     )
where cnt>1;

Results:

    FLD_NB RG        CNT
---------- -- ----------
         1 NC          2
         1 MA          2
         3 MA          2
         3 GA          2

In case if you need to count just distinct values:

select *
from (
     select t.*
           ,count(distinct RGN_CD)over(partition by FLD_NB) cnt
     from t
     )
where cnt>1;

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM