
SQL Compressing Table - Removing Like Items

I have a table with the columns ID, IDLicense, Brand, and ExtraBrands.

I am trying to merge all records that share an IDLicense: keep the original record for each IDLicense, add the Brand of every duplicate to the original's ExtraBrands, and then delete the duplicates.

So far I have been able to select every IDLicense that has duplicates, using a temp table to store all the extra info.

INSERT INTO #TempTable (ID, IDLicense, Brand, ExtraBrands) 
SELECT ID, IDLicense, Brand, ExtraBrands FROM BrandOrders
WHERE IDLicense IN (SELECT IDLicense FROM BrandOrders GROUP BY IDLicense HAVING COUNT(*) > 1)

Is there a simple way to do this without the temp table: take the brands from the copies, add them to ExtraBrands on the original, and afterwards delete the duplicates?

Data Examples:

Table Below:

 1. IdLicense = 1, Brand="BlueBird", ExtraBrands is null
 2. IdLicense = 1, Brand="RedBird", ExtraBrands is null
 3. IdLicense = 1, Brand="YellowBird", ExtraBrands is null
 4. IdLicense = 2, Brand="BlueBird", ExtraBrands is null
 5. IdLicense = 2, Brand="RedBird", ExtraBrands is null

At the end it should all be compressed to:

 1. IdLicense = 1, Brand="BlueBird", ExtraBrands = "RedBird YellowBird"
 2. IdLicense = 2, Brand="BlueBird", ExtraBrands = "RedBird"
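
To reproduce the example, the sample rows above could be loaded with a script along these lines. This is only a sketch: the column types and lengths are assumptions, since the question names the columns but not their definitions.

```sql
-- Assumed column types and lengths; only the column names come from the question.
CREATE TABLE BrandOrders (
    ID          int PRIMARY KEY,
    IDLicense   int NOT NULL,
    Brand       varchar(20) NOT NULL,
    ExtraBrands varchar(200) NULL
);

-- The five sample rows; ExtraBrands is left NULL as in the example.
INSERT INTO BrandOrders (ID, IDLicense, Brand)
VALUES (1, 1, 'BlueBird'),
       (2, 1, 'RedBird'),
       (3, 1, 'YellowBird'),
       (4, 2, 'BlueBird'),
       (5, 2, 'RedBird');
```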

You can do what you want using the code below, but I would advise against this kind of denormalization of the database. Storing multiple discrete values in a single column breaks the relational model and often leads to various problems.

Instead, I'd advise you to normalize your tables and use a schema like the one below, where a junction table connects License entities with Brand entities:

CREATE TABLE BrandOrders (IdLicense int primary key);
CREATE TABLE Brands (BrandID int primary key, Brand varchar(20));
CREATE TABLE LicenseBrands (
    IdLicense int foreign key references BrandOrders, 
    BrandID int foreign key references Brands, 
    MainBrand bit,
    PRIMARY KEY (IdLicense, BrandId)
);

This would ensure data integrity, save you space, and be a lot easier to work with.
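
With the normalized tables, the compressed one-row-per-license shape can still be produced on demand rather than stored. A minimal sketch, assuming SQL Server 2017 or later (for STRING_AGG) and assuming MainBrand = 1 marks the main brand and 0 marks an extra brand:

```sql
-- Builds the compressed view from the normalized tables without storing it.
-- Assumptions: SQL Server 2017+ (STRING_AGG); MainBrand = 1 is the main brand.
SELECT lb.IdLicense,
       MAX(CASE WHEN lb.MainBrand = 1 THEN b.Brand END) AS Brand,
       -- STRING_AGG skips NULLs, so only the non-main brands are concatenated.
       STRING_AGG(CASE WHEN lb.MainBrand = 0 THEN b.Brand END, ',') AS ExtraBrands
FROM LicenseBrands AS lb
JOIN Brands AS b ON b.BrandID = lb.BrandID
GROUP BY lb.IdLicense;
```

The order of the brands inside ExtraBrands is not specified here; a WITHIN GROUP (ORDER BY ...) clause on STRING_AGG would be needed if a particular order matters.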


Having said that, here are the queries (update, then delete) to "fix" your data:

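;with cte as (
    -- number the rows of each duplicated idlicense; r = 1 is the row that is kept
    select *, r=row_number() over (partition by idlicense order by id) 
    from brandorders
    where idlicense in (
       select idlicense from brandorders group by idlicense having count(*) > 1
    )
)

-- write the brands of rows 2..n into extrabrands on the kept row
update extern
set extrabrands = left(c , len(c)-1) -- strip the trailing comma
from cte extern
cross apply
(
    select brand + ','
    from cte as intern
    where extern.idlicense = intern.idlicense and intern.r > 1
    for xml path('')
) extrabrands (c)
where extern.r = 1;

-- remove the duplicates: within a duplicated idlicense, only the kept row
-- has extrabrands filled in (this relies on extrabrands being null beforehand)
delete from brandorders 
where idlicense in (
    select idlicense from brandorders group by idlicense having count(*) > 1
    ) 
  and extrabrands is null;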

After executing them, your data would look like this:

ID  IdLicense   Brand       ExtraBrands
1   1           BlueBird    RedBird,YellowBird
4   2           BlueBird    RedBird

Sample SQL Fiddle
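
Since the update and delete above are destructive, it may be worth previewing the outcome first. The following is a read-only sketch using the same row_number and for xml path technique; the aliases keep and dup are made up for illustration, and unlike the update it also lists licenses that have no duplicates (with a null ExtraBrands).

```sql
-- Read-only preview: the row that would be kept per idlicense and the
-- extrabrands string that would be built. Nothing is modified.
;with cte as (
    select *, r = row_number() over (partition by idlicense order by id)
    from brandorders
)
select keep.id, keep.idlicense, keep.brand,
       -- stuff(..., 1, 1, '') removes the leading comma; it yields null
       -- when the license has no duplicate rows
       extrabrands = stuff((
           select ',' + dup.brand
           from cte as dup
           where dup.idlicense = keep.idlicense and dup.r > 1
           order by dup.r
           for xml path('')
       ), 1, 1, '')
from cte as keep
where keep.r = 1;
```

Note that for xml path('') entity-encodes characters such as & and <, so brand names containing them would need extra handling in both this preview and the update.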
