SQL - Efficient way to make sure both entities represented in a set of key pairs exist in final dataset
Question: How do I efficiently get a list of people (#People table below, one person per row) who are matched as a set (#Keys table below), while also making sure that the sets are unique?

Background: I'm working with sets of matches in a database (in the form of KeyId, PersonId1, PersonId2). We have an automatic method that flags people as duplicates and writes them to a Match table. There aren't too many, but we usually sit with about 100K match records and 200K people. The match process has also added duplicate records in the form of matching Person 1 with 2, and also matching 2 with 1. Also, we logically delete people records (IsDeleted = 1), so we do not want to return matches where one person has already been deleted.

We have an administration screen where people can look at the duplicates and either flag them as not being dupes or delete one of the pair. There has been a problem where, even if one person in the pair was deleted, the other one was still showing in the list. The SQL below is an attempt to make sure that only people who still exist as a complete set are returned.
Test Data Setup:
CREATE TABLE #Keys
(
    KeyId int PRIMARY KEY,
    PersonId1 int,
    PersonId2 int
)

CREATE TABLE #People
(
    PersonId int PRIMARY KEY,
    Name varchar(150),
    IsDeleted bit
)

INSERT INTO #People
VALUES (1, 'John',0),
       (2, 'Madeline',0),
       (3, 'Ralph',1),
       (4, 'Sarah',0),
       (5, 'Jack',0),
       (6, 'Olivia',0),
       (7, 'Ethan',0),
       (8, 'Sophia',0)

INSERT INTO #Keys
VALUES (1,1,2),
       (2,2,3),
       (3,1,3),
       (4,2,1),
       (5,4,8),
       (6,3,7),
       (7,6,1)

SELECT *
FROM #Keys k
JOIN #People p1
    ON k.PersonId1 = p1.PersonId
    AND p1.IsDeleted = 0
JOIN #People p2
    ON k.PersonId2 = p2.PersonId
    AND p2.IsDeleted = 0
Returns:

KeyId  PersonId1  PersonId2  PersonId  Name      IsDeleted  PersonId  Name      IsDeleted
1      1          2          1         John      0          2         Madeline  0
4      2          1          2         Madeline  0          1         John      0
5      4          8          4         Sarah     0          8         Sophia    0
7      6          1          6         Olivia    0          1         John      0
SELECT KeyId, p1.PersonId, p1.Name
INTO #Results
FROM #Keys k
JOIN #People p1
    ON k.PersonId1 = p1.PersonId
    AND p1.IsDeleted = 0
JOIN #People p2
    ON k.PersonId2 = p2.PersonId
    AND p2.IsDeleted = 0

INSERT INTO #Results
SELECT KeyId, p2.PersonId, p2.Name
FROM #Keys k
JOIN #People p1
    ON k.PersonId1 = p1.PersonId
    AND p1.IsDeleted = 0
JOIN #People p2
    ON k.PersonId2 = p2.PersonId
    AND p2.IsDeleted = 0

SELECT * from #Results
order by KeyId

DROP TABLE #People
DROP TABLE #Keys
DROP TABLE #Results
The final query returns this set:

KeyId  PersonId  Name
1      2         Madeline
1      1         John
4      2         Madeline
4      1         John
5      8         Sophia
5      4         Sarah
7      6         Olivia
7      1         John
But it has the problem that Keys 1 and 4 have the same people, just reversed in order. The set I'd like returned is:

KeyId  PersonId  Name
1      2         Madeline
1      1         John
5      4         Sarah
5      8         Sophia
7      1         John
7      6         Olivia
First I would make

    PersonId1 int,
    PersonId2 int

the PK on #Keys and drop KeyId.
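As a rough illustration of that suggestion, #Keys could be declared with the pair itself as a composite primary key. This is only a sketch in T-SQL: the NOT NULL markers are an assumption (primary key columns cannot be nullable), and the constraint is left unnamed to avoid name clashes in tempdb.

```sql
-- Sketch: #Keys without KeyId, keyed on the pair itself.
-- NOT NULL is an assumption added here; PK columns must not be nullable.
CREATE TABLE #Keys
(
    PersonId1 int NOT NULL,
    PersonId2 int NOT NULL,
    PRIMARY KEY (PersonId1, PersonId2)
);
```

Note that this key only rejects exact repeats such as a second (1,2) row. A mirrored row such as (2,1) is still a different key value, so it would still be accepted.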
A quick way to get the unique pairs:
-- Put the smaller id first in every row; UNION then removes the duplicates.
select PersonId1, PersonId2
from #Keys
where PersonId1 < PersonId2
union
select PersonId2, PersonId1
from #Keys
where PersonId2 < PersonId1
Clearly you would still need to add the joins to #People to filter out deleted people.
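Putting the two pieces together, a minimal sketch might wrap the unique-pair query in a derived table and join it to #People twice. It assumes the #Keys and #People test tables from the question; the alias u and the column names Name1 and Name2 are introduced here purely for illustration.

```sql
-- Sketch: unique pairs (smaller id first), keeping only pairs where
-- neither person has been logically deleted.
SELECT u.PersonId1, p1.Name AS Name1,
       u.PersonId2, p2.Name AS Name2
FROM (
    SELECT PersonId1, PersonId2 FROM #Keys WHERE PersonId1 < PersonId2
    UNION
    SELECT PersonId2, PersonId1 FROM #Keys WHERE PersonId2 < PersonId1
) AS u
JOIN #People p1
    ON p1.PersonId = u.PersonId1
    AND p1.IsDeleted = 0
JOIN #People p2
    ON p2.PersonId = u.PersonId2
    AND p2.IsDeleted = 0
ORDER BY u.PersonId1, u.PersonId2;
```

Against the question's test data this should leave three pairs: John/Madeline, John/Olivia and Sarah/Sophia. Every pair involving Ralph (PersonId 3, deleted) drops out.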
You could also put a constraint on #Keys that PersonId1 < PersonId2, so that a mirrored pair can never be stored in the first place.
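One way such a constraint might look is sketched below. Because the question's existing rows include pairs like (2,1) that would violate the check, the sketch builds a separate table and normalizes the rows on the way in. The table name #KeysOrdered is hypothetical, and it assumes the question's #Keys table still exists.

```sql
-- Sketch: a pair table that can only hold "smaller id first" rows.
-- #KeysOrdered is a hypothetical name used for illustration.
CREATE TABLE #KeysOrdered
(
    PersonId1 int NOT NULL,
    PersonId2 int NOT NULL,
    PRIMARY KEY (PersonId1, PersonId2),
    CHECK (PersonId1 < PersonId2)
);

-- Normalize each existing row so the smaller id comes first;
-- DISTINCT collapses mirrored rows such as (1,2) and (2,1).
INSERT INTO #KeysOrdered (PersonId1, PersonId2)
SELECT DISTINCT
    CASE WHEN PersonId1 < PersonId2 THEN PersonId1 ELSE PersonId2 END,
    CASE WHEN PersonId1 < PersonId2 THEN PersonId2 ELSE PersonId1 END
FROM #Keys
WHERE PersonId1 <> PersonId2;
```

With the check in place, the matching process would have to write the smaller id into PersonId1 itself, otherwise its inserts would be rejected.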
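An EXCEPT query looks like it should work too, but check it against your data first:

select PersonId1, PersonId2 from #Keys
except
select PersonId2, PersonId1 from #Keys

EXCEPT removes both rows of a mirrored pair: when (1,2) and (2,1) are both present, each one appears in the reversed set, so neither survives. The query therefore returns only pairs that have no reversed counterpart, not one row per pair.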
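For the shape the question actually asks for (a KeyId plus one person per row, with each pair listed once), one possible sketch keeps KeyId, groups mirrored rows together, and retains the lowest KeyId of each group. This is an alternative to the queries above rather than a restatement of them. The names Pairs, LowId and HighId are introduced here for illustration, and CROSS APPLY with a VALUES list is T-SQL specific.

```sql
-- Sketch: one KeyId per unordered pair, then one output row per person.
WITH Pairs AS
(
    SELECT MIN(k.KeyId) AS KeyId,
           CASE WHEN k.PersonId1 < k.PersonId2 THEN k.PersonId1 ELSE k.PersonId2 END AS LowId,
           CASE WHEN k.PersonId1 < k.PersonId2 THEN k.PersonId2 ELSE k.PersonId1 END AS HighId
    FROM #Keys k
    GROUP BY
           CASE WHEN k.PersonId1 < k.PersonId2 THEN k.PersonId1 ELSE k.PersonId2 END,
           CASE WHEN k.PersonId1 < k.PersonId2 THEN k.PersonId2 ELSE k.PersonId1 END
)
SELECT pr.KeyId, p.PersonId, p.Name
FROM Pairs pr
JOIN #People p1
    ON p1.PersonId = pr.LowId
    AND p1.IsDeleted = 0
JOIN #People p2
    ON p2.PersonId = pr.HighId
    AND p2.IsDeleted = 0
-- Turn the two people of each surviving pair into two rows.
CROSS APPLY (VALUES (p1.PersonId, p1.Name),
                    (p2.PersonId, p2.Name)) AS p (PersonId, Name)
ORDER BY pr.KeyId, p.PersonId;
```

On the question's test data this should return KeyIds 1, 5 and 7 with two people each, matching the desired set apart from row order within a key. If this runs in a batch after other statements, the preceding statement needs a terminating semicolon before WITH.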