
Comparing a SQL table to itself (self-join)

I'm trying to find duplicate rows based on mixed columns. This is an example of what I have:

CREATE TABLE Test
(
   id INT PRIMARY KEY,
   test1 varchar(124),
   test2 varchar(124)
)

INSERT INTO TEST ( id, test1, test2 ) VALUES ( 1, 'A', 'B' )
INSERT INTO TEST ( id, test1, test2 ) VALUES ( 2, 'B', 'C' )

Now if I run this query:

SELECT [LEFT].[ID] 
FROM [TEST] AS [LEFT] 
   INNER JOIN [TEST] AS [RIGHT] 
   ON [LEFT].[ID] != [RIGHT].[ID] 
WHERE [LEFT].[TEST1] = [RIGHT].[TEST2]

I would expect to get back both ids (1 and 2); however, I only ever get back one row.

My thought was that it should compare each row against every other row, but I guess this is not correct? To fix this I changed my query to:

SELECT [LEFT].[ID] 
FROM [TEST] AS [LEFT] 
   INNER JOIN [TEST] AS [RIGHT] 
   ON [LEFT].[ID] != [RIGHT].[ID] 
WHERE [LEFT].[TEST1] = [RIGHT].[TEST2] 
OR [LEFT].[TEST2] = [RIGHT].[TEST1]

This gives me both rows, but the performance degrades extremely quickly as the number of rows grows.

The final solution I came up with, for both performance and results, was to use a union:

SELECT [LEFT].[ID] 
FROM [TEST] AS [LEFT] 
   INNER JOIN [TEST] AS [RIGHT] 
   ON [LEFT].[ID] != [RIGHT].[ID] 
WHERE [LEFT].[TEST1] = [RIGHT].[TEST2] 
UNION
SELECT [LEFT].[ID] 
FROM [TEST] AS [LEFT] 
   INNER JOIN [TEST] AS [RIGHT] 
   ON [LEFT].[ID] != [RIGHT].[ID] 
WHERE [LEFT].[TEST2] = [RIGHT].[TEST1]

But overall, I'm obviously missing an understanding of why the first query does not work, which means that I'm probably doing something wrong. Could someone point me in the proper direction?

Do not JOIN on an inequality; it seems that the JOIN and WHERE conditions are inverted.

SELECT t1.id
FROM Test t1
INNER JOIN Test t2
ON ((t1.test1 = t2.test2) OR (t1.test2 = t2.test1))
WHERE t1.id <> t2.id

This should work fine.
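As a quick check of the query above, here is a minimal sketch that loads the question's two rows and runs it. It assumes an in-memory SQLite database driven from Python as a stand-in for the asker's server, so it only illustrates the result set, not the original engine's behaviour or performance.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the asker's database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Test (id INT PRIMARY KEY, test1 varchar(124), test2 varchar(124))"
)
conn.executemany(
    "INSERT INTO Test (id, test1, test2) VALUES (?, ?, ?)",
    [(1, "A", "B"), (2, "B", "C")],  # the two rows from the question
)

# Equality conditions in ON, the id inequality in WHERE.
rows = conn.execute(
    """
    SELECT t1.id
    FROM Test t1
    INNER JOIN Test t2
        ON ((t1.test1 = t2.test2) OR (t1.test2 = t2.test1))
    WHERE t1.id <> t2.id
    ORDER BY t1.id
    """
).fetchall()

ids = [row[0] for row in rows]
print(ids)  # [1, 2]
```

Row 1 matches through its test2 column and row 2 through its test1 column, so both ids come back.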

You only get back both ids if you select them:

SELECT [LEFT].[ID], [RIGHT].[ID] 
FROM [TEST] AS [LEFT] 
   INNER JOIN [TEST] AS [RIGHT] 
   ON [LEFT].[ID] != [RIGHT].[ID] 
WHERE [LEFT].[TEST1] = [RIGHT].[TEST2]

The reason you only get one row is that only one row (namely row #2) has a TEST1 that is equal to another row's TEST2.
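To make that concrete, the sketch below runs the two-column version of the query against the question's sample data. It assumes an in-memory SQLite database (which accepts the bracket-quoted identifiers) rather than the asker's actual server.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the asker's database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Test (id INT PRIMARY KEY, test1 varchar(124), test2 varchar(124))"
)
conn.executemany(
    "INSERT INTO Test VALUES (?, ?, ?)",
    [(1, "A", "B"), (2, "B", "C")],  # the two rows from the question
)

# Select both sides of the join so the matched pair is visible.
pairs = conn.execute(
    """
    SELECT [LEFT].[ID], [RIGHT].[ID]
    FROM [TEST] AS [LEFT]
       INNER JOIN [TEST] AS [RIGHT]
       ON [LEFT].[ID] != [RIGHT].[ID]
    WHERE [LEFT].[TEST1] = [RIGHT].[TEST2]
    """
).fetchall()

print(pairs)  # [(2, 1)]
```

Only the pair (LEFT = 2, RIGHT = 1) satisfies TEST1 = TEST2: row 2's TEST1 is 'B', which is row 1's TEST2. Row 1's TEST1 is 'A', which is nobody's TEST2, so it never appears on the left.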

It looks like you're working very quickly toward a Cartesian join. Normally, if you're looking to return duplicates, you need to run something like:

SELECT [LEFT].*
FROM [TEST]  AS [LEFT]
INNER JOIN [TEST] AS [RIGHT]
    ON [LEFT].[test1] = [RIGHT].[test1]
        AND [LEFT].[test2] = [RIGHT].[test2]
        AND [LEFT].[id] <> [RIGHT].[id]
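
A minimal sketch of what this query returns, assuming an in-memory SQLite database and one extra row that is not in the question: row 3 is a hypothetical exact duplicate of row 1, added only so there is something to find.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the asker's database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Test (id INT PRIMARY KEY, test1 varchar(124), test2 varchar(124))"
)
conn.executemany(
    "INSERT INTO Test VALUES (?, ?, ?)",
    [
        (1, "A", "B"),  # from the question
        (2, "B", "C"),  # from the question
        (3, "A", "B"),  # hypothetical: same test1/test2 as row 1
    ],
)

# Same-column comparison: a row is a duplicate if another id has equal values.
dupes = conn.execute(
    """
    SELECT [LEFT].*
    FROM [TEST] AS [LEFT]
    INNER JOIN [TEST] AS [RIGHT]
        ON [LEFT].[test1] = [RIGHT].[test1]
            AND [LEFT].[test2] = [RIGHT].[test2]
            AND [LEFT].[id] <> [RIGHT].[id]
    ORDER BY [LEFT].[id]
    """
).fetchall()

print(dupes)  # [(1, 'A', 'B'), (3, 'A', 'B')]
```

Note that a row with more than one duplicate partner would be returned once per partner, so a DISTINCT may be needed on larger data.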

If you need to mix the columns, then mix the needed conditions, but do something like:

SELECT [LEFT].*
FROM [TEST] AS [LEFT]
INNER JOIN [TEST] AS [RIGHT]
    ON (
        [LEFT].[test1] = [RIGHT].[test2]
            OR [LEFT].[test2] = [RIGHT].[test1]
       )
        AND [LEFT].[id] <> [RIGHT].[id]

Using that, you compare the right to the left and the left to the right in each join, eliminating the need for the WHERE altogether.
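
As a rough comparison, the sketch below runs the OR-in-ON query next to the UNION query from the question and checks that they identify the same ids. It assumes an in-memory SQLite database, and row 3 is a hypothetical addition so that one id matches through both columns.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the asker's database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Test (id INT PRIMARY KEY, test1 varchar(124), test2 varchar(124))"
)
conn.executemany(
    "INSERT INTO Test VALUES (?, ?, ?)",
    [(1, "A", "B"), (2, "B", "C"), (3, "C", "A")],  # row 3 is hypothetical
)

# Mixed-column comparison done entirely in the ON clause.
or_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT [LEFT].[id]
        FROM [TEST] AS [LEFT]
        INNER JOIN [TEST] AS [RIGHT]
            ON ([LEFT].[test1] = [RIGHT].[test2]
                OR [LEFT].[test2] = [RIGHT].[test1])
            AND [LEFT].[id] <> [RIGHT].[id]
        ORDER BY [LEFT].[id]
        """
    )
]

# The UNION form from the question; UNION removes repeated ids itself.
union_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT [LEFT].[ID] FROM [TEST] AS [LEFT]
           INNER JOIN [TEST] AS [RIGHT] ON [LEFT].[ID] != [RIGHT].[ID]
        WHERE [LEFT].[TEST1] = [RIGHT].[TEST2]
        UNION
        SELECT [LEFT].[ID] FROM [TEST] AS [LEFT]
           INNER JOIN [TEST] AS [RIGHT] ON [LEFT].[ID] != [RIGHT].[ID]
        WHERE [LEFT].[TEST2] = [RIGHT].[TEST1]
        ORDER BY 1
        """
    )
]

print(or_ids)     # [1, 1, 2, 2, 3, 3]
print(union_ids)  # [1, 2, 3]
```

The OR form returns one row per matching pair, so an id can repeat; the set of ids is the same as the UNION result.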

However, the execution time of this style of query grows roughly quadratically with the number of rows in the table, since you're comparing each row to every other row.
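
To get a feel for how the work scales, this sketch counts how many row pairs a self-join on id inequality has to consider for a few table sizes. It assumes an in-memory SQLite database and made-up column values; `candidate_pairs` is a hypothetical helper, and the count is a proxy for work, not a timing measurement (a real optimizer with suitable indexes may do better).

```python
import sqlite3


def candidate_pairs(n):
    """Count the (left, right) pairs with different ids in an n-row table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Test (id INT PRIMARY KEY, test1 varchar(124), test2 varchar(124))"
    )
    # Hypothetical filler values; only the number of rows matters here.
    conn.executemany(
        "INSERT INTO Test VALUES (?, ?, ?)",
        [(i, f"v{i}", f"w{i}") for i in range(1, n + 1)],
    )
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM Test AS l INNER JOIN Test AS r ON l.id <> r.id"
    ).fetchone()
    conn.close()
    return count


counts = {n: candidate_pairs(n) for n in (10, 100, 200)}
print(counts)  # {10: 90, 100: 9900, 200: 39800}
```

Each count is n * (n - 1): doubling the table from 100 to 200 rows roughly quadruples the number of pairs.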

This can be done without the INNER JOIN keyword, if I am not mistaken. This is my first time answering a MySQL kind of question, but I am just answering to get more points here on Stack Overflow. The comma is very important so that MySQL does not complain.

SELECT [LEFT].[ID] FROM [TEST] AS [LEFT], [TEST] AS [RIGHT] 
WHERE [LEFT].[ID] != [RIGHT].[ID] 
AND [LEFT].[TEST1] = [RIGHT].[TEST2];
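
The comma-separated FROM list is an implicit join, so it should behave like the original INNER JOIN query rather than change the result. A minimal sketch of that comparison, assuming an in-memory SQLite database in place of the asker's server:

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the asker's database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Test (id INT PRIMARY KEY, test1 varchar(124), test2 varchar(124))"
)
conn.executemany(
    "INSERT INTO Test VALUES (?, ?, ?)",
    [(1, "A", "B"), (2, "B", "C")],  # the two rows from the question
)

# Implicit join: tables separated by a comma, all conditions in WHERE.
comma_rows = conn.execute(
    """
    SELECT [LEFT].[ID] FROM [TEST] AS [LEFT], [TEST] AS [RIGHT]
    WHERE [LEFT].[ID] != [RIGHT].[ID]
    AND [LEFT].[TEST1] = [RIGHT].[TEST2]
    """
).fetchall()

# Explicit join: the first query from the question.
join_rows = conn.execute(
    """
    SELECT [LEFT].[ID]
    FROM [TEST] AS [LEFT]
       INNER JOIN [TEST] AS [RIGHT]
       ON [LEFT].[ID] != [RIGHT].[ID]
    WHERE [LEFT].[TEST1] = [RIGHT].[TEST2]
    """
).fetchall()

print(comma_rows)  # [(2,)]
print(join_rows)   # [(2,)]
```

Both forms return only id 2 on the sample data, so this rewrite alone does not bring back row 1; that still needs the second comparison (TEST2 against TEST1).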
