
SQL query to return one single record for each unique value in a column

I have a table in SQL Server 2000 that I am trying to query in a specific way. The best way to show this is with example data.

Behold, [Addresses]:

Name         Street                 City          State
--------------------------------------------------------
Bob          123 Fake Street        Peoria        IL
Bob          234 Other Street       Fargo         ND
Jim          345 Main Street        St Louis      MO

This is a simplified example of the structure of the actual table. The structure of the table is completely beyond my control. I need a query that will return a single address per name. It doesn't matter which address, just that there is only one. The result could be this:

Name         Street                 City          State
--------------------------------------------------------
Bob          123 Fake Street        Peoria        IL
Jim          345 Main Street        St Louis      MO

I found a similar question here, but none of the solutions given work in my case. I do not have access to CROSS APPLY, and calling MIN() on each column will mix different addresses together. Although I don't care which record is returned, it must be one intact row, not a mix of different rows.
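
To make the MIN() problem concrete, here is a small sketch against the sample data above. It is only an illustration of the failure, not a proposed solution, and it assumes the columns are plain character types.

```sql
-- Illustration only: taking MIN() of each column independently.
SELECT Name,
       MIN(Street) AS Street,
       MIN(City)   AS City,
       MIN(State)  AS State
FROM Addresses
GROUP BY Name

-- With the three sample rows above, Bob comes back as
--   Bob   123 Fake Street   Fargo   IL
-- which takes the street and state from one row and the city from the other.
```

Each MIN() is evaluated on its own column, so nothing ties the three values to the same source row.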

Recommendations to change the table structure will not help me. I agree that this table is terrible (it's worse than shown here), but it is part of a major ERP database that I cannot change.

There are about 3000 records in this table. There is no primary key.

Any ideas?

Well, this will give you pretty bad performance, but I think it'll work:

-- Build one sortable string per row, keep the smallest per name,
-- then join back to the row that produced it.
SELECT t.Name, t.Street, t.City, t.State
FROM Addresses t
INNER JOIN (
     SELECT m.Name, MIN(m.Street + ';' + m.City + ';' + m.State) AS comb
     FROM Addresses m
     GROUP BY m.Name
) x
   ON  x.Name = t.Name
   AND x.comb = t.Street + ';' + t.City + ';' + t.State

Use a temp table or table variable and select a distinct list of names into it. Then use that structure to select the top 1 record from the original table for each distinct name.
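
A minimal sketch of that idea for SQL Server 2000, using table variables and a WHILE loop rather than a cursor. The VARCHAR(50) column types and the names @Names, @Result and @Name are assumptions for illustration, so match them to the real table.

```sql
-- Assumed column types; adjust to the real Addresses definition.
DECLARE @Names TABLE (Name VARCHAR(50) NOT NULL PRIMARY KEY)
DECLARE @Result TABLE (Name VARCHAR(50), Street VARCHAR(50),
                       City VARCHAR(50), State VARCHAR(50))
DECLARE @Name VARCHAR(50)

-- Distinct list of names (NULL names are left out).
INSERT INTO @Names (Name)
SELECT DISTINCT Name FROM Addresses WHERE Name IS NOT NULL

-- Walk the names in order; MIN() over no rows yields NULL, which ends the loop.
SELECT @Name = MIN(Name) FROM @Names
WHILE @Name IS NOT NULL
BEGIN
    -- Copy one intact row for this name.
    INSERT INTO @Result (Name, Street, City, State)
    SELECT TOP 1 Name, Street, City, State
    FROM Addresses
    WHERE Name = @Name

    SELECT @Name = MIN(Name) FROM @Names WHERE Name > @Name
END

SELECT Name, Street, City, State FROM @Result
```

The TOP 1 has no ORDER BY, so which address you get per name is arbitrary. That matches the question, which only asks for one intact row.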

If you can use a temp table:

select * -- Create and populate temp table 
into #Addresses
from Addresses 

alter table #Addresses add PK int identity(1, 1) primary key

select Name, Street, City, State 
-- Explicitly name columns here to not return the PK
from #Addresses A
where not exists 
    (select *
    from #Addresses B
    where B.Name = A.Name
    and A.PK > B.PK)

This solution would not be advisable for much larger tables.

select distinct Name, street, city, state
from Addresses t1 where street =
(select min(street) from Addresses t2 where t2.name = t1.name)

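-- ROW_NUMBER() requires SQL Server 2005 or later.
SELECT Name, Street, City, State
FROM (SELECT Name, Street, City, State,
             ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Name) AS rn
      FROM Addresses) AS t
WHERE rn = 1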

A temporary table solution would be as follows:

CREATE Table #Addresses
(
    MyId int IDENTITY(1,1),
    [Name] NVARCHAR(50),
    Street NVARCHAR(50),
    City NVARCHAR(50),
    State NVARCHAR(50)
)

INSERT INTO #Addresses ([Name], Street, City, State) SELECT [Name], Street, City, State FROM Addresses

SELECT
    Addresses1.[Name],
    Addresses1.Street,
    Addresses1.City,
    Addresses1.State
FROM
    #Addresses Addresses1
WHERE
    Addresses1.MyId =
(
    SELECT
        MIN(MyId)
    FROM
        #Addresses Addresses2
    WHERE
        Addresses2.[Name] = Addresses1.[Name]
)

DROP TABLE #Addresses

This is ugly as hell, but it sounds like your predicament is ugly, too... so here goes...

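select  name,
    -- the same order by in every subquery keeps the three values on one row
    (select top 1 street from [Addresses] a1 where a1.name = a0.name order by street, city, state) as street,
    (select top 1 city from [Addresses] a2 where a2.name = a0.name order by street, city, state) as city,
    (select top 1 state from [Addresses] a3 where a3.name = a0.name order by street, city, state) as state
from    (select distinct name from [Addresses]) as a0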

I think this is a good candidate for a cursor-based solution. It's been so long since I've used a cursor that I won't attempt to write the T-SQL, but here's the idea:

  1. Create a temp table with the same schema as Addresses
  2. Select the distinct Names into a cursor
  3. Loop through the cursor, selecting the top 1 row from Addresses into the temp table for each distinct Name
  4. Return a select from the temp table
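
The four steps above could be sketched roughly as follows. This is an untested outline, not the answerer's code: the temp table name #OnePerName, the cursor name name_cursor and the VARCHAR(50) column types are all assumptions.

```sql
-- 1. Temp table with (assumed) the same columns as Addresses.
CREATE TABLE #OnePerName (
    Name   VARCHAR(50),
    Street VARCHAR(50),
    City   VARCHAR(50),
    State  VARCHAR(50)
)

DECLARE @Name VARCHAR(50)

-- 2. Cursor over the distinct names.
DECLARE name_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT DISTINCT Name FROM Addresses

OPEN name_cursor
FETCH NEXT FROM name_cursor INTO @Name

-- 3. For each name, copy one intact row into the temp table.
WHILE @@FETCH_STATUS = 0
BEGIN
    INSERT INTO #OnePerName (Name, Street, City, State)
    SELECT TOP 1 Name, Street, City, State
    FROM Addresses
    WHERE Name = @Name

    FETCH NEXT FROM name_cursor INTO @Name
END

CLOSE name_cursor
DEALLOCATE name_cursor

-- 4. Return the result and clean up.
SELECT Name, Street, City, State FROM #OnePerName
DROP TABLE #OnePerName
```

A NULL Name never satisfies `Name = @Name`, so rows with no name are silently left out of the result. With about 3000 rows the row-by-row loop should be tolerable, though it would scale poorly.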

-- Requires SQL Server 2005 or later (ROW_NUMBER).
select c.*, b.* from companies c left outer join
(SELECT *,
    ROW_NUMBER()
        OVER(PARTITION BY FKID ORDER BY PKId) AS Seq
 FROM Contacts) b on b.FKID = c.PKID and b.Seq = 1

A slight modification on the above should work.

-- Columns are qualified with t because Name and Street exist in both t and x.
SELECT t.Name, t.Street, t.City, t.State
FROM Addresses t
INNER JOIN (
     SELECT Name, MIN(Street) AS Street
     FROM Addresses m
     GROUP BY Name
) x
   ON x.Name = t.Name AND x.Street = t.Street

Now this won't work if two rows for a name have the same street but the other pieces of information differ (e.g. because of typos).

Or a more complete key would include all the fields (but you likely have too many for good performance):

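-- KEY is a reserved word in T-SQL, so the alias is AddressKey instead.
SELECT t.Name, t.Street, t.City, t.State
FROM Addresses t
INNER JOIN (
     SELECT Name, MIN(Street + '|' + City + '|' + State) AS AddressKey
     FROM Addresses m
     GROUP BY Name
) x
   ON  x.Name = t.Name
   AND x.AddressKey = t.Street + '|' + t.City + '|' + t.State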

I don't think that you can do that, given your constraints. You can pull out distinct combinations of those fields. But if someone spelled Bob and Bobb with the same address you'd end up with two records. [GIGO] You are correct that any grouping (short of grouping on all of the fields, which is equivalent to DISTINCT) will mix rows. It's too bad that you don't have a unique identifier for each customer.

You might be able to nest queries together in such a way as to select the top 1 for each name and join all of those together.

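SELECT a.name,
       -- A scalar subquery may return only one column, so use one per column.
       -- The shared ORDER BY makes all three values come from the same row.
       ( SELECT TOP 1 b.street FROM addresses b WHERE b.name = a.name
          ORDER BY b.street, b.city, b.state ) AS street,
       ( SELECT TOP 1 b.city FROM addresses b WHERE b.name = a.name
          ORDER BY b.street, b.city, b.state ) AS city,
       ( SELECT TOP 1 b.state FROM addresses b WHERE b.name = a.name
          ORDER BY b.street, b.city, b.state ) AS state
  FROM addresses a
 GROUP BY a.name

-- Requires SQL Server 2005 or later (DENSE_RANK).
-- DENSE_RANK keeps ties, so two rows sharing the same street both come back.
SELECT name, street, city, state
FROM
 (SELECT name, street, city, state,
  DENSE_RANK() OVER (PARTITION BY name ORDER BY street DESC) AS r
  FROM Addresses) AS t
WHERE r = 1;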

And still another way:

-- build a sample table  
DECLARE @T TABLE (Name VARCHAR(50),Street VARCHAR(50),City VARCHAR(50),State VARCHAR(50))  
INSERT INTO @T   
SELECT 'Bob','123 Fake Street','Peoria','IL' UNION  
SELECT 'Bob','234 Other Street','Fargo','ND' UNION  
SELECT 'Jim','345 Main Street','St Louis','MO' UNION  
SELECT 'Fred','234 Other Street','Fargo','ND'  

-- here is all you do to get the unique record  
SELECT * FROM @T a WHERE (SELECT COUNT(*) FROM @T b WHERE a.Name = b.name and a.street <= b.street) = 1
