
SQL Compressing Table - Removing Like Items

I have a table with the columns ID, IDLicense, Brand, and ExtraBrands.

I am trying to merge all records that share an IDLicense into a single row: keep the original record with its Brand, add the Brand of each duplicate to the original's ExtraBrands, and then delete the duplicates.

So far I have been able to select every IDLicense that has duplicates, using a temp table to store the extra info:

INSERT INTO #TempTable (ID, IDLicense, Brand, ExtraBrands) 
SELECT ID, IDLicense, Brand, ExtraBrands FROM BrandOrders
WHERE IDLicense IN (SELECT IDLicense FROM BrandOrders GROUP BY IDLicense HAVING COUNT(*) > 1)

Is there a simple way to do this without a temp table: take the brands from the duplicate rows, add them to ExtraBrands on the original row, and then delete the duplicates?

Data Examples:

Table Below:

 1. IdLicense = 1, Brand="BlueBird", ExtraBrands is null
 2. IdLicense = 1, Brand="RedBird", ExtraBrands is null
 3. IdLicense = 1, Brand="YellowBird", ExtraBrands is null
 4. IdLicense = 2, Brand="BlueBird", ExtraBrands is null
 5. IdLicense = 2, Brand="RedBird", ExtraBrands is null

At the end it should all be compressed to

 1. IdLicense = 1, Brand="BlueBird", ExtraBrands = "RedBird YellowBird"
 2. IdLicense = 2, Brand="BlueBird", ExtraBrands = "RedBird"

You can do what you want using the code below, but I would advise against this kind of denormalization of the database. Storing multiple discrete values in a single column breaks the relational model and often leads to various problems.

Instead, I'd advise you to normalize your tables and use a schema like the one below, where a junction table connects License entities with Brand entities:

CREATE TABLE BrandOrders (IdLicense int primary key);
CREATE TABLE Brands (BrandID int primary key, Brand varchar(20));
CREATE TABLE LicenseBrands (
    IdLicense int foreign key references BrandOrders, 
    BrandID int foreign key references Brands, 
    MainBrand bit,
    PRIMARY KEY (IdLicense, BrandId)
);

This would ensure data integrity, save space, and be a lot easier to work with.
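As a rough sketch of how the normalized schema could still produce the compressed view the question asks for, the query below builds the ExtraBrands list at read time instead of storing it. The sample rows and the convention that MainBrand = 1 marks the primary brand are assumptions made for illustration; they are not part of the original schema description.

```sql
-- Sample rows for the schema above (values mirror the question's data; IDs are assumed).
INSERT INTO BrandOrders (IdLicense) VALUES (1), (2);
INSERT INTO Brands (BrandID, Brand) VALUES (1, 'BlueBird'), (2, 'RedBird'), (3, 'YellowBird');
INSERT INTO LicenseBrands (IdLicense, BrandID, MainBrand) VALUES
    (1, 1, 1), (1, 2, 0), (1, 3, 0),
    (2, 1, 1), (2, 2, 0);

-- One row per license: the main brand plus a comma-separated list of the other brands.
SELECT
    lb.IdLicense,
    b.Brand,
    ExtraBrands = STUFF((
        SELECT ',' + b2.Brand
        FROM LicenseBrands AS lb2
        JOIN Brands AS b2 ON b2.BrandID = lb2.BrandID
        WHERE lb2.IdLicense = lb.IdLicense
          AND lb2.MainBrand = 0          -- assumption: 0 marks a non-primary brand
        ORDER BY lb2.BrandID
        FOR XML PATH('')
    ), 1, 1, '')                         -- STUFF removes the leading comma
FROM LicenseBrands AS lb
JOIN Brands AS b ON b.BrandID = lb.BrandID
WHERE lb.MainBrand = 1
ORDER BY lb.IdLicense;
```

Because the list is computed on demand, adding or removing a brand is a single-row change in LicenseBrands rather than a string edit.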


Having said that, here are the queries (update, then delete) to "fix" your data:

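-- Number the rows within each duplicated idlicense; r = 1 is the row to keep.
;with cte as (
    select *, r=row_number() over (partition by idlicense order by id) 
    from brandorders
    where idlicense in (
       select idlicense from brandorders group by idlicense having count(*) > 1
    )
)
-- On each kept row, store the brands of its duplicates as a comma-separated list.
update extern
set extrabrands = left(c , len(c)-1)  -- drop the trailing comma
from cte extern
cross apply
(
    select intern.brand + ','
    from cte as intern
    where extern.idlicense = intern.idlicense and intern.r > 1
    order by intern.r
    for xml path('')
) extrabrands (c)
where extern.r = 1;

-- Remove the duplicates: rows in duplicated groups that the update did not touch.
-- This relies on extrabrands being null on every row beforehand, as in the sample data.
delete from brandorders 
where idlicense in (
    select idlicense from brandorders group by idlicense having count(*) > 1
    ) 
  and extrabrands is null;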

After executing both statements, your data would look like this:

ID  IdLicense   Brand       ExtraBrands
1   1           BlueBird    RedBird,YellowBird
4   2           BlueBird    RedBird
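Since the update and delete are destructive, it may be worth previewing the merged rows with a read-only query first. The following is a minimal sketch that reuses the same row_number and for xml path approach against the question's BrandOrders table; the CTE name and aliases (numbered, keep, dup) are made up for this example.

```sql
-- Read-only preview: one row per IDLicense with the brands of its duplicates
-- collected into ExtraBrands. Nothing is updated or deleted.
WITH numbered AS (
    SELECT ID, IDLicense, Brand,
           r = ROW_NUMBER() OVER (PARTITION BY IDLicense ORDER BY ID)
    FROM BrandOrders
)
SELECT
    keep.ID,
    keep.IDLicense,
    keep.Brand,
    ExtraBrands = STUFF((
        SELECT ',' + dup.Brand
        FROM numbered AS dup
        WHERE dup.IDLicense = keep.IDLicense
          AND dup.r > 1                  -- every row except the one being kept
        ORDER BY dup.r
        FOR XML PATH('')
    ), 1, 1, '')                         -- STUFF removes the leading comma
FROM numbered AS keep
WHERE keep.r = 1
ORDER BY keep.ID;
```

A license with no duplicates comes back with ExtraBrands as NULL. Note that for xml path('') entity-encodes characters such as & and <, so brand names containing them would need extra handling in either version.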

