Advanced SQL query to identify missing data

I am currently dealing with a SQL Server table 'suburb', which has a suburb_id column and an adjacent_suburb_ids column. The adjacent_suburb_ids column is a comma-separated string of other suburb_ids.

I have found that some of the records are not reciprocal.

For example, "SuburbA" has the id of "SuburbB" in its adjacent_suburb_ids, but "SuburbB" does not have the id of "SuburbA" in its adjacent_suburb_ids.

I need to identify all the suburbs that do not reciprocate their adjacent suburbs. Can I do this with a SQL query?

Please do not comment on the data/table structure as it is not in my control and I can't change it.

Assuming I'm understanding your question correctly, you can join the table to itself using the LIKE and NOT LIKE operators:

-- Each row returned: s2 lists s as adjacent, but s does not list s2 back.
select s.suburb_id, s2.suburb_id as s2id
from suburb s
  join suburb s2 on
    s.suburb_id <> s2.suburb_id
    -- s2's list contains s's id
    and ',' + s2.adjacent_suburb_ids + ',' like
        '%,' + cast(s.suburb_id as varchar(10)) + ',%'
    -- s's list does not contain s2's id
    and ',' + s.adjacent_suburb_ids + ',' not like
        '%,' + cast(s2.suburb_id as varchar(10)) + ',%'

You need to concatenate a comma before and after both the list and the id so that the search matches whole ids within the set; otherwise id 1 would also match inside 11 or 21. And yes, if you ever get the chance, you should consider normalizing the data.
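
To try the idea out without touching the real table, here is a minimal sketch that runs the same join against a tiny in-memory SQLite database from Python. The sample rows are made up for illustration, and the SQL is adapted to SQLite syntax (`||` for concatenation and `cast(... as text)` instead of `+` and `varchar(10)`), so treat it as a check of the logic rather than of SQL Server behaviour. It also assumes the lists contain no spaces around the commas.

```python
import sqlite3

# Hypothetical sample data: suburb 1 lists 2 and 11,
# but only 11 lists 1 back (2 lists just 3).
rows = [
    (1, "2,11"),
    (2, "3"),
    (3, "2"),
    (11, "1"),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table suburb (suburb_id integer, adjacent_suburb_ids text)")
conn.executemany("insert into suburb values (?, ?)", rows)

# Same join logic as the answer's query, written in SQLite syntax.
query = """
select s.suburb_id, s2.suburb_id as s2id
from suburb s
  join suburb s2 on
    s.suburb_id <> s2.suburb_id
    and ',' || s2.adjacent_suburb_ids || ',' like
        '%,' || cast(s.suburb_id as text) || ',%'
    and ',' || s.adjacent_suburb_ids || ',' not like
        '%,' || cast(s2.suburb_id as text) || ',%'
order by s.suburb_id, s2.suburb_id
"""
missing = conn.execute(query).fetchall()
print(missing)  # suburb 2 is listed by suburb 1 but does not list it back

# Why the commas matter: a bare substring search finds "1" inside "11".
naive = conn.execute("select '2,11' like '%1%'").fetchone()[0]
wrapped = conn.execute("select ',2,11,' like '%,1,%'").fetchone()[0]
print(naive, wrapped)
```

With this data the join returns the single pair (2, 1): suburb 1 lists suburb 2, but suburb 2 does not list suburb 1. The last two statements show that the unwrapped pattern reports a match for id 1 in the list "2,11", while the comma-wrapped pattern does not.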
