SQL - Efficient way to make sure both entities represented in a set of key pairs exist in final dataset

Question: How do I efficiently get a list of people (#People table below, one person per row) who are matched as a set (#Keys table below), while also making sure that the sets are unique?

Background: I'm working with sets of matches in a database (in the form of KeyId, PersonId1, PersonId2). We have an automatic method that flags people as duplicates and writes them to a Match table. There aren't too many, but we usually sit with about 100K match records and 200K people. The match process has also added duplicate records by matching Person 1 with 2 and also matching 2 with 1. In addition, we logically delete people records (IsDeleted = 1), so we do not want to return matches where one person has already been deleted.

We have an administration screen where people can look at the duplicates and either flag them as not dupes or delete one of the pair. There has been a problem where, even if one person in the pair was deleted, the other one was still showing in the list. The SQL below is an attempt to make sure that only people who exist as a set are returned.

Test Data Setup:

CREATE TABLE #Keys
(
    KeyId int PRIMARY KEY,
    PersonId1 int,
    PersonId2 int
)

CREATE TABLE #People
(
    PersonId int PRIMARY KEY,
    Name varchar(150),
    IsDeleted bit
)

INSERT INTO #People
VALUES  (1, 'John',0),
        (2, 'Madeline',0),
        (3, 'Ralph',1),
        (4, 'Sarah',0),
        (5, 'Jack',0),
        (6, 'Olivia',0),
        (7, 'Ethan',0),
        (8, 'Sophia',0)

INSERT INTO #Keys
VALUES  (1,1,2),
        (2,2,3),
        (3,1,3),
        (4,2,1),
        (5,4,8),
        (6,3,7),
        (7,6,1)

SELECT *
FROM #Keys k
 JOIN #People p1
    ON k.PersonId1 = p1.PersonId
    AND p1.IsDeleted = 0
 JOIN #People p2
    ON k.PersonId2 = p2.PersonId
    AND p2.IsDeleted = 0

Returns:

KeyId  PersonId1  PersonId2  PersonId  Name      IsDeleted  PersonId  Name      IsDeleted
1      1          2          1         John      0          2         Madeline  0
4      2          1          2         Madeline  0          1         John      0
5      4          8          4         Sarah     0          8         Sophia    0
7      6          1          6         Olivia    0          1         John      0



SELECT KeyId, p1.PersonId, p1.Name
INTO #Results
FROM #Keys k
 JOIN #People p1
    ON k.PersonId1 = p1.PersonId
    AND p1.IsDeleted = 0
 JOIN #People p2
    ON k.PersonId2 = p2.PersonId
    AND p2.IsDeleted = 0

INSERT INTO #Results
SELECT KeyId, p2.PersonId, p2.Name
FROM #Keys k
 JOIN #People p1
    ON k.PersonId1 = p1.PersonId
    AND p1.IsDeleted = 0
 JOIN #People p2
    ON k.PersonId2 = p2.PersonId
    AND p2.IsDeleted = 0

SELECT * from #Results
order by KeyId

DROP TABLE #People
DROP TABLE #Keys
DROP TABLE #Results

The final query returns this set:

KeyId  PersonId  Name
1      2         Madeline
1      1         John
4      2         Madeline
4      1         John
5      8         Sophia
5      4         Sarah
7      6         Olivia
7      1         John

But it has the problem that Keys 1 and 4 have the same people, just reversed in order. The set I'd like returned is:

KeyId  PersonId  Name
1      2         Madeline
1      1         John
5      4         Sarah
5      8         Sophia
7      1         John
7      6         Olivia

First, I would drop KeyId and make these two columns the primary key on #Keys:

PersonId1 int,
PersonId2 int

A quick way to get the unique pairs:

select PersonId1, PersonId2 
from #Keys 
where PersonId1 < PersonId2 
union 
select PersonId2, PersonId1 
from #Keys 
where PersonId2 < PersonId1 

Clearly, you would also need to add the join to #People to filter out deleted people.
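As a rough sketch of what that could look like, the union above can be wrapped in a CTE and then joined to #People twice, once per side of the pair. This assumes the question's #Keys and #People temp tables still exist (that is, it runs before the DROP TABLE statements); the CTE name Pairs and the column aliases Name1 and Name2 are made up for illustration.

```sql
-- Normalise every pair so the smaller id comes first; UNION removes duplicates.
WITH Pairs AS
(
    SELECT PersonId1, PersonId2
    FROM #Keys
    WHERE PersonId1 < PersonId2
    UNION
    SELECT PersonId2, PersonId1
    FROM #Keys
    WHERE PersonId2 < PersonId1
)
-- Keep a pair only when both people still exist (neither is logically deleted).
SELECT pr.PersonId1, p1.Name AS Name1,
       pr.PersonId2, p2.Name AS Name2
FROM Pairs pr
 JOIN #People p1
    ON pr.PersonId1 = p1.PersonId
    AND p1.IsDeleted = 0
 JOIN #People p2
    ON pr.PersonId2 = p2.PersonId
    AND p2.IsDeleted = 0
```

With the sample data this should leave three pairs: John/Madeline, Sarah/Sophia and John/Olivia. Note that the pair 6,1 comes back as 1,6, because the union always puts the smaller id first.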

You could also put a constraint on #Keys requiring PersonId1 < PersonId2.
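A minimal sketch of that table design, combining the composite primary key with the ordering constraint. The table name #OrderedKeys is hypothetical, and the load assumes the question's #Keys table is still present.

```sql
-- Hypothetical replacement for #Keys: no KeyId, the pair itself is the key,
-- and the CHECK forces the smaller id into PersonId1.
CREATE TABLE #OrderedKeys
(
    PersonId1 int NOT NULL,
    PersonId2 int NOT NULL,
    PRIMARY KEY (PersonId1, PersonId2),
    CHECK (PersonId1 < PersonId2)
)

-- Load the existing matches, swapping the ids where needed.
-- DISTINCT collapses 1,2 and 2,1 into a single row.
INSERT INTO #OrderedKeys (PersonId1, PersonId2)
SELECT DISTINCT
    CASE WHEN PersonId1 < PersonId2 THEN PersonId1 ELSE PersonId2 END,
    CASE WHEN PersonId1 < PersonId2 THEN PersonId2 ELSE PersonId1 END
FROM #Keys
WHERE PersonId1 <> PersonId2
```

With both constraints in place, the match process can no longer store a pair in both directions: a reversed row violates the CHECK, and a repeated row violates the primary key. The writer has to order the ids before inserting.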

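I think an EXCEPT will also work, as long as only the rows with PersonId1 < PersonId2 are reversed on the right-hand side. Reversing every row would remove both directions of a reciprocal pair, so the pair would disappear entirely.

select PersonId1, PersonId2 from #Keys 
except 
select PersonId2, PersonId1 from #Keys where PersonId1 < PersonId2 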
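If KeyId has to stay, one possible way to produce the result set shown in the question is to skip any key whose reversed pair already appears under a lower KeyId, and then unpivot each surviving pair into one row per person. This is only a sketch against the question's temp tables, not something from the original answer; the alias names r and p are illustrative.

```sql
SELECT k.KeyId, p.PersonId, p.Name
FROM #Keys k
 JOIN #People p1
    ON k.PersonId1 = p1.PersonId
    AND p1.IsDeleted = 0
 JOIN #People p2
    ON k.PersonId2 = p2.PersonId
    AND p2.IsDeleted = 0
 -- Turn the two sides of the pair into two rows.
 CROSS APPLY (VALUES (p1.PersonId, p1.Name),
                     (p2.PersonId, p2.Name)) AS p (PersonId, Name)
-- Drop a key when the same two people, reversed, already exist under a lower KeyId.
WHERE NOT EXISTS (SELECT 1
                  FROM #Keys r
                  WHERE r.PersonId1 = k.PersonId2
                    AND r.PersonId2 = k.PersonId1
                    AND r.KeyId < k.KeyId)
ORDER BY k.KeyId, p.PersonId
```

For the sample data this should keep KeyIds 1, 5 and 7 and drop KeyId 4. It only handles reversed duplicates; two rows storing the same pair in the same order would both survive and would need an extra condition.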
