Find matching pairs in SQL Server using each record only once

Question

I need to find matched pairs of records in SQL Server, but each record can only be included in one pair. Once a record has been matched into a pair, it should be removed from consideration for any future pairs.

I have tried solutions involving ROW_NUMBER() and LEAD(), but I just can't quite get there.

This will be used to pair financial accounts with similar accounts for review, based on multiple customer attributes such as credit score, income, etc.

Statement:

declare @test table (ID numeric, Color varchar(20))
insert into @test values
        (1,'Blue'),(2,'Red'),(3,'Blue'),(4,'Yellow'),(5,'Blue'),(6,'Red')

select *
from @test t1
join @test t2
    on t1.Color = t2.Color
    and t1.ID < t2.ID           -- removes reverse-pairs and self-pairs

Current results:

ID  Color   ID  Color
--- ------- --- --------
1   Blue    3   Blue
1   Blue    5   Blue        -----should not appear because 1 has already been paired
3   Blue    5   Blue        -----should not appear because 3 has already been paired
2   Red     6   Red

Needed results:

ID  Color   ID  Color
--- ------- --- --------
1   Blue    3   Blue
2   Red     6   Red

Answer 1

Edited to incorporate Max's comments.

Here is one way to get this done.

I first rank the records within each color by id, so the lowest id gets rnk=1, the next one rnk=2, and so on.

After that I join the ranked set to itself, taking each odd-ranked record (rnk=1, 3, ...) and joining it to the record with the next rank (rnk=2, 4, ...) in the same color.

declare @test table (ID numeric, Color varchar(20))
insert into @test values
        (1,'Blue'),(2,'Red'),(3,'Blue'),(4,'Yellow'),(5,'Blue'),(6,'Red'),(7,'Blue')

;with data
  as (select row_number() over(partition by color order by id asc) as rnk
            ,color
            ,id
       from @test
       )
select a.id,a.color,b.id,b.color
 from data a
 join data b
   on a.Color=b.Color
  and b.rnk=a.rnk+1
where a.rnk%2=1

I get the following output:

+----+-------+----+-------+
| id | color | id | color |
+----+-------+----+-------+
|  1 | Blue  |  3 | Blue  |
|  5 | Blue  |  7 | Blue  |
|  2 | Red   |  6 | Red   |
+----+-------+----+-------+

Answer 2

You could use row_number() and conditional aggregation:

select
    max(case when rn % 2 = 0 then id end) id1,
    max(case when rn % 2 = 0 then color end) color1,
    max(case when rn % 2 = 1 then id end) id2,
    max(case when rn % 2 = 1 then color end) color2
from (
    select
        t.*,
        row_number() over(partition by color order by id) - 1 rn
    from @test t
) t
group by color, rn / 2
having count(*) = 2

The subquery ranks records having the same color by increasing id, starting from 0. Then the outer query groups the records pairwise (rn / 2 is integer division, so ranks 0 and 1 fall into one group, 2 and 3 into the next) and keeps only the groups that contain two records.

Demo on DB Fiddle:

id1 | color1 | id2 | color2
:-- | :----- | :-- | :-----
1   | Blue   | 3   | Blue  
2   | Red    | 6   | Red
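
For anyone who wants to sanity-check the pairing rule outside the database, here is a minimal Python sketch of the same idea both answers rely on: order the records within each color by id, then take them two at a time and drop any leftover. The helper name pair_once is made up for illustration, and the rows are the sample data from the question. This is not a replacement for the SQL, only a way to confirm the expected pairs.

```python
from itertools import groupby

# Sample rows copied from the question: (ID, Color).
rows = [(1, "Blue"), (2, "Red"), (3, "Blue"), (4, "Yellow"), (5, "Blue"), (6, "Red")]


def pair_once(records):
    """Pair records that share a color, using each record at most once.

    Hypothetical helper: mirrors ROW_NUMBER() OVER (PARTITION BY color ORDER BY id)
    followed by grouping consecutive ranks two at a time.
    """
    pairs = []
    # Sort by color, then id, so each color forms one contiguous, ordered run.
    ordered = sorted(records, key=lambda r: (r[1], r[0]))
    for _, group in groupby(ordered, key=lambda r: r[1]):
        members = list(group)
        # Step through the run two rows at a time; a trailing odd row stays unpaired.
        for i in range(0, len(members) - 1, 2):
            pairs.append((members[i], members[i + 1]))
    return pairs


print(pair_once(rows))
```

With the six sample rows this yields the Blue pair (1, 3) and the Red pair (2, 6); adding (7, 'Blue') as in Answer 1 also produces (5, 7).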
