SQL: Remove duplicates in self-join
I have the following table (called t1):
| id | Name |
| 1 | Charlie |
| 2 | Bob |
| 3 | Alice |
I want to match the table with itself (self-join) but only choose a combination that has not already appeared. So far, I have the following:
select * from t1 a, t1 b
where a.id != b.id
which gives me this result:
| a.id | a.Name | b.id | b.Name |
| 2 | Bob | 1 | Charlie |
| 3 | Alice | 1 | Charlie |
| 1 | Charlie | 2 | Bob |
| 3 | Alice | 2 | Bob |
| 1 | Charlie | 3 | Alice |
| 2 | Bob | 3 | Alice |
I only want an id to appear once from table a, and once from table b. A desired outcome would be:
| a.id | a.Name | b.id | b.Name |
| 2 | Bob | 1 | Charlie |
| 3 | Alice | 2 | Bob |
| 1 | Charlie | 3 | Alice |
But I'm stumped as to how to guarantee this.
I am using SQL Server 2017.
Here's a fiddle with my test: DEMO
PS: I've checked this question, but the concept of the solution using a "less than" as a comparison operator isn't clear to me in my own example.
Edit: There are no rules as to which pair is chosen; the pairs could be (2,3), (3,1), (1,2) instead of the ones I presented above, because the only rules I am interested in are that each id appears only once from table a and once from table b, and that a.id != b.id.
Edit 2: There is no logic to match them; please think about it with this possible premise: I am matchmaking Alice, Bob and Charlie as if they are having a Secret Gift Exchange. Each of them can only offer a gift to one person, can only receive one gift, and cannot offer a gift to themselves. (I think this allows scalability)
Here is one option which uses a ROW_NUMBER trick to stagger each name with a different name:
WITH cte AS (
SELECT id, Name, ROW_NUMBER() OVER (ORDER BY id) rn
FROM t1
)
SELECT
t1.Name,
t2.Name
FROM cte t1
INNER JOIN cte t2
ON (t1.rn % (SELECT COUNT(*) FROM cte)) + 1 = t2.rn;
The logic is to just match row number 1 with 2, 2 with 3, and 3 with 1 (we use the modulus to wrap around at the edge case). This ensures that no name ever appears more than once in a given column.
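To see the wrap-around concretely, the join condition can be evaluated by hand for a three-row table. This is only a small sketch: the row numbers are hard-coded through a VALUES list, the count 3 is fixed to match the sample data, and the names v and partner_rn are illustrative, not part of the query above.

```sql
-- Illustrative only: the count 3 is hard-coded to match the sample table.
SELECT v.rn,
       (v.rn % 3) + 1 AS partner_rn
FROM (VALUES (1), (2), (3)) AS v(rn);
-- rn 1 -> partner_rn 2
-- rn 2 -> partner_rn 3
-- rn 3 -> partner_rn 1   (3 % 3 = 0, so the last row wraps back to the first)
```

Because the join runs on rn rather than on id, gaps in the id values should not matter here.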
The OP wants to assign each person a random partner. This solution is not completely random and only works if the IDs are contiguous (1 through n, with no gaps). However, that can be fixed by combining a random ordering with ORDER BY and ROW_NUMBER.
My lazy fix is:
select * from t1 a, t1 b
where a.id = b.id % ( select count(*) from t1 c) + 1
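The random/ORDER BY/ROW_NUMBER combination mentioned in this answer might look roughly like the following. It is a sketch under a few assumptions, not a tested solution: NEWID() is used as the random sort key, and the temp table #shuffled, the variable @n, and the aliases giver and receiver are names made up for illustration. The shuffled numbering is written to a temp table first because, as far as I know, SQL Server may evaluate a CTE once per reference, which could give the two sides of the self-join different random orders.

```sql
-- Sketch only: shuffle the rows once, number them, and store the result
-- so that both sides of the self-join see the same numbering.
SELECT id,
       Name,
       ROW_NUMBER() OVER (ORDER BY NEWID()) AS rn
INTO #shuffled
FROM t1;

-- Row count used for the wrap-around (hypothetical variable name).
DECLARE @n int = (SELECT COUNT(*) FROM #shuffled);

-- Rotate by one: row 1 gives to row 2, ..., the last row gives to row 1.
SELECT giver.id      AS giver_id,
       giver.Name    AS giver_name,
       receiver.id   AS receiver_id,
       receiver.Name AS receiver_name
FROM #shuffled AS giver
INNER JOIN #shuffled AS receiver
    ON receiver.rn = giver.rn % @n + 1;

DROP TABLE #shuffled;
```

This needs at least two rows in t1; with a single row the only possible match is the row itself. Since the join uses rn instead of id, it does not depend on the IDs being contiguous.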
Use row_number(), then do a self-join based on the row number.
select a.id, a.name, b.id, b.name from
(select row_number() over (order by id desc) rn, id, name from t1) a
join
(select row_number() over (order by id asc) rn, id, name from t1) b on a.rn= b.rn
Note that with an odd number of rows the middle row is paired with itself (with the sample data, Bob is matched with Bob), so this only satisfies a.id != b.id when the row count is even.
Here is another way to do this.
This partitions the data by a concatenated string built from the two ids, larger first (larger_id,'|',smaller_id), so both orderings of a pair fall into the same partition.
After that, I choose just one row per concatenated string by filtering on rnk=1.
with data as (
    select a.id as a_id, a.name as a_name, b.name as b_name, b.id as b_id,
           row_number() over (
               partition by case when a.id > b.id then concat(a.id, '|', b.id)
                                 else concat(b.id, '|', a.id) end
               order by b.id desc
           ) as rnk
    from t1 a
    join t1 b
      on a.id != b.id
)
select *
from data
where rnk = 1
https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=c3f82c8d21dc14899a263adacf1b31e6
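For comparison, the "less than" approach mentioned in the question keeps one row per unordered pair without a window function. The following is a sketch against the same t1; the column aliases are only illustrative.

```sql
-- Keep each unordered pair once by requiring the left id to be the smaller one.
SELECT a.id   AS a_id,
       a.Name AS a_name,
       b.id   AS b_id,
       b.Name AS b_name
FROM t1 AS a
INNER JOIN t1 AS b
    ON a.id < b.id;
```

On the sample data this should return the pairs (1,2), (1,3) and (2,3). Like the query above, it returns every unordered pair, so an id can still appear more than once in the same column.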