
Recursive SQL query to find all matching identifiers

I have a table with the following structure:

CREATE TABLE Source
(
     [ID1] INT, 
     [ID2] INT
);

INSERT INTO Source ([ID1], [ID2]) 
VALUES (1, 2), (2, 3), (4, 5),
       (2, 5), (6, 7)

[Image: example of Source and Result tables]

The Source table stores which ID matches which other ID. From the diagram it can be seen that 1, 2, 3, 4 and 5 are identical, and that 6 and 7 are identical. I need a SQL query that produces a Result table with all matches between IDs.

I found this item on the site, "Recursive query in SQL Server", which is similar to my task but produces a different result.

I tried to adapt the code to my task, but it does not work. It fails with: "The statement terminated. The maximum recursion 100 has been exhausted before statement completion."

;WITH CTE
AS
(
    SELECT DISTINCT
        M1.ID1,
        M1.ID1 as ID2
    FROM Source M1
        LEFT JOIN Source M2
            ON M1.ID1 = M2.ID2
    WHERE M2.ID2 IS NULL
    UNION ALL
    SELECT
        C.ID2,
        M.ID1
    FROM CTE C
        JOIN Source M
            ON C.ID1 = M.ID1
)
SELECT * FROM CTE ORDER BY ID1

Thanks a lot for the help!

This is a challenging question. You are trying to walk through a graph in both directions. There are two key ideas:

  • Add "reverse" edges, so the graph behaves like a digraph with edges in both directions.
  • Keep a list of the nodes that have already been visited on each path. In SQL Server, a delimited string is one way to do this.

So:

with s as (
      select id1, id2 from source
      union  -- on purpose
      select id2, id1 from source
     ),
     cte as (
      select s.id1, s.id2, ',' + cast(s.id1 as varchar(max)) + ',' + cast(s.id2 as varchar(max)) + ',' as ids
      from s
      union all
      select cte.id1, s.id2, ids + cast(s.id2 as varchar(max)) + ','
      from cte join
           s
           on cte.id2 = s.id1
      where cte.ids not like '%,' + cast(s.id2 as varchar(max)) + ',%'
     )
select *
from cte
order by 1, 2;

Here is a db<>fiddle.
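The query above returns one row per path, so when the graph contains a cycle the same pair can show up more than once. A minimal sketch of collapsing that into the Result table the question asks for, assuming the same Source table and SQL Server. The `option (maxrecursion 0)` hint is an addition not present in the original answer; it lifts the default limit of 100 recursion levels, which is safe here only because the visited-list check already guarantees termination.

```sql
with s as (
      select id1, id2 from source
      union  -- removes duplicate edges
      select id2, id1 from source
     ),
     cte as (
      -- anchor: every edge starts a path, ids holds the nodes visited so far
      select s.id1, s.id2,
             ',' + cast(s.id1 as varchar(max)) + ',' + cast(s.id2 as varchar(max)) + ',' as ids
      from s
      union all
      -- step: extend the path by one edge, skipping nodes already on it
      select cte.id1, s.id2, cte.ids + cast(s.id2 as varchar(max)) + ','
      from cte join
           s
           on cte.id2 = s.id1
      where cte.ids not like '%,' + cast(s.id2 as varchar(max)) + ',%'
     )
select distinct id1, id2   -- one row per pair, regardless of how many paths connect it
from cte
order by id1, id2
option (maxrecursion 0);   -- assumption: paths may be longer than 100 edges
```

The sample data forms a tree, so every pair is connected by exactly one path and `distinct` changes nothing there. It only matters once the data contains cycles.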

  1. Since all node connections are bidirectional, add the reversed relations to the original list.
  2. Find all possible paths from each node. This is nearly an ordinary recursion; the only difference is that we need to keep the root id1.
  3. Avoid cycles. We have to guard against them because the edges have no direction.

source:

;with src as(
  select id1, id2 from source
  union 
  -- reversed connections
  select id2, id1 from source
), rec as (
  select id1, id2, CAST(CONCAT('/', src.id1, '/', src.id2, '/') as varchar(8000)) path
  from src

  union all

  -- keep the root id1 from the start of each path
  select rec.id1, src.id2, CAST(CONCAT(rec.path, src.id2, '/') as varchar(8000))
  from rec
  -- usual recursion
  inner join src on src.id1 = rec.id2
  -- avoid cycles
  where rec.path not like CONCAT('%/', src.id2, '/%')
)
select id1, id2, path 
from rec
order by 1, 2

output:

| id1 | id2 |      path |
|-----|-----|-----------|
|   1 |   2 |     /1/2/ |
|   1 |   3 |   /1/2/3/ |
|   1 |   4 | /1/2/5/4/ |
|   1 |   5 |   /1/2/5/ |
|   2 |   1 |     /2/1/ |
|   2 |   3 |     /2/3/ |
|   2 |   4 |   /2/5/4/ |
|   2 |   5 |     /2/5/ |
|   3 |   1 |   /3/2/1/ |
|   3 |   2 |     /3/2/ |
|   3 |   4 | /3/2/5/4/ |
|   3 |   5 |   /3/2/5/ |
|   4 |   1 | /4/5/2/1/ |
|   4 |   2 |   /4/5/2/ |
|   4 |   3 | /4/5/2/3/ |
|   4 |   5 |     /4/5/ |
|   5 |   1 |   /5/2/1/ |
|   5 |   2 |     /5/2/ |
|   5 |   3 |   /5/2/3/ |
|   5 |   4 |     /5/4/ |
|   6 |   7 |     /6/7/ |
|   7 |   6 |     /7/6/ |

http://sqlfiddle.com/#!18/76114/13

"source table will contain about 100,000 records"

There is nothing that can help you with this. The task is unpleasant: finding all possible connections. It is almost a CROSS JOIN, with even more connections in the end.
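The size of the result cannot be avoided, but enumerating every path can. One alternative, not taken from any of the answers here, is to label each node with the smallest ID in its group and then join nodes that share a label. The sketch below assumes SQL Server and the Source table from the question; the temp table names `#edges` and `#labels` and the column names `id` and `grp` are hypothetical.

```sql
-- Symmetric edge list: every pair in both directions, duplicates removed.
SELECT ID1, ID2 INTO #edges FROM Source
UNION
SELECT ID2, ID1 FROM Source;

-- Each node starts in its own group, labelled with its own ID.
SELECT DISTINCT ID1 AS id, ID1 AS grp INTO #labels FROM #edges;

DECLARE @changed INT = 1;

-- Repeat until no label changes. Labels only ever decrease, so the loop ends.
WHILE @changed > 0
BEGIN
    -- Give each node the smallest label currently held by any of its neighbours.
    UPDATE l
    SET grp = m.min_grp
    FROM #labels AS l
    JOIN (
        SELECT e.ID1 AS id, MIN(n.grp) AS min_grp
        FROM #edges AS e
        JOIN #labels AS n ON n.id = e.ID2
        GROUP BY e.ID1
    ) AS m ON m.id = l.id
    WHERE m.min_grp < l.grp;

    SET @changed = @@ROWCOUNT;  -- rows touched by the UPDATE above
END;

-- Every ordered pair of distinct nodes that ended up in the same group.
SELECT a.id AS ID1, b.id AS ID2
FROM #labels AS a
JOIN #labels AS b ON b.grp = a.grp AND b.id <> a.id
ORDER BY a.id, b.id;

DROP TABLE #edges, #labels;
```

On the sample data the labels settle at 1 for nodes 1 to 5 and 6 for nodes 6 and 7, which gives the same 22 pairs as the output table above. The final join is still quadratic in the size of each group, so a single very large group will produce a very large result whichever method is used.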

It looks like I came up with an answer similar to the other posters'. My approach was to insert the existing value pairs, and then insert the reverse of each pair.

Once you have expanded the list of value pairs, you can traverse the table to find all the pairs.

CREATE TABLE #Source
    ([ID1] int, [ID2] int);

INSERT INTO #Source 
(
    [ID1]
    ,[ID2]
) 
VALUES   
(1, 2)
,(2, 3)
,(4, 5)
,(2, 5)
,(6, 7)

INSERT INTO #Source 
(
    [ID1]
    ,[ID2]
) 
SELECT 
    [ID2]
    ,[ID1] 
FROM #Source

;WITH expanded AS
(
    SELECT DISTINCT 
        ID1 = s1.ID1
        ,ID2 = s1.ID2
    FROM #Source s1
    LEFT JOIN #Source s2 ON s1.ID2 = s2.ID1

    UNION

    SELECT DISTINCT 
        ID1 = s1.ID1
        ,ID2 = s2.ID2
    FROM #Source s1
    LEFT JOIN #Source s2 ON s1.ID2 = s2.ID1
    WHERE s1.ID1 <> s2.ID2

)
,recur AS
(
    SELECT DISTINCT 
        e1.ID1
        ,e1.ID2
    FROM expanded e1
    LEFT JOIN expanded e2 ON e1.ID2 = e2.ID1
    WHERE e1.ID1 <> e1.ID2

    UNION ALL

    SELECT DISTINCT 
        e1.ID1
        ,e2.ID2
    FROM expanded e1
    INNER JOIN expanded e2 ON e1.ID2 = e2.ID1
    WHERE e1.ID1 <> e2.ID2
)
SELECT DISTINCT 
    ID1, ID2 
FROM recur
ORDER BY ID1, ID2

DROP TABLE #Source 

This is a way to get that output by brute force, but it may not be the best solution for a different or larger data set:

select sub1.rnk as ID1
,sub2.rnk as ID2
from
(
select a.*
,rank() over (partition by 1 order by id1, id2) as RNK
from source a
) sub1
cross join
(
select a.*
,rank() over (partition by 1 order by id1, id2) as RNK
from source a
) sub2
where sub1.rnk <> sub2.rnk
union all
select id1 as ID1
,id2 as ID2
from source
where id1 = 6
union all
select id2 as ID1
,id1 as ID2
from source
where id1 = 6;
