
SQL neatly self join a table n times

I'm building a back-end for a project to construct a (purely functional) graph database system.

One operation I'm trying to implement is joining a precomputed view to itself exactly n times (i.e. finding pairs of results that are exactly n instances of a particular relation away from each other in the graph).

A slightly ugly solution I've come up with is generating a big tree of joins in a style similar to exponentiation by squaring.
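For illustration only, here is a minimal sketch of what a squaring-style join tree could look like; it is not the asker's actual code. It assumes SQLite (via Python's sqlite3) and a table named t with left_id and right_id columns; the function name power_sql and the CTE names p1, p2, r2 and so on are hypothetical. Each CTE composes two earlier ones, so the number of joins grows with log n rather than n.

```python
import sqlite3

def power_sql(n, table="t"):
    """Build a query for pairs exactly n steps apart using O(log n) joins.

    `table` is interpolated as an identifier, so it must come from trusted code.
    """
    if n < 1:
        raise ValueError("n must be at least 1")

    def compose(name, a, b):
        # One relational composition: follow a, then b.
        return (f"{name} AS (SELECT DISTINCT a.left_id AS left_id, "
                f"b.right_id AS right_id "
                f"FROM {a} AS a JOIN {b} AS b ON b.left_id = a.right_id)")

    ctes = [f"p1 AS (SELECT DISTINCT left_id, right_id FROM {table})"]
    result, power, bit = None, "p1", 1
    while True:
        if n & bit:
            if result is None:
                result = power          # first set bit: start from this power
            else:
                ctes.append(compose(f"r{bit}", result, power))
                result = f"r{bit}"
        bit <<= 1
        if bit > n:
            break
        ctes.append(compose(f"p{bit}", power, power))  # square the current power
        power = f"p{bit}"
    return "WITH " + ",\n     ".join(ctes) + f"\nSELECT left_id, right_id FROM {result}"

# Sample rows from the simplified example in this question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (left_id INTEGER, right_id INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (7, 8)])
print(sorted(conn.execute(power_sql(3)).fetchall()))
```

For n = 8 this generates three joins (p2, p4, p8) instead of seven.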

Is there a neater/better way to do this? Perhaps using recursive queries?

Each possible pre-generated view has a left_id and right_id field, which represent indices into entity tables.

A simplified example is the following:

Given the following table representing part of the relation n -> succ(n) (in reality the relation may be much more complicated):

left_id | right_id
__________________
   1    |    2
   2    |    3
   3    |    4
   4    |    5
   5    |    6
   7    |    8

The ideal result of querying with n = 3 would be:

left_id | right_id
__________________
   1    |    4
   2    |    5
   3    |    6
   4    |    7
   5    |    8

(I'm using SQL because one part of this project is to demonstrate that SQL is a poor choice for such a system; it will be followed by a more bespoke backend that addresses the issues of using an underlying relational model rather than a graph model.)

Many databases support recursive CTEs, which directly support such graph walking.

Even in databases that do not, this is not particularly difficult, using dynamic SQL to generate a chain of self-joins:

For n = 1:

select t1.left_id, t2.right_id
from t t1 join
     t t2
     on t2.left_id = t1.right_id;

For n = 2:

select t1.left_id, t3.right_id
from t t1 join
     t t2
     on t2.left_id = t1.right_id join
     t t3
     on t3.left_id = t2.right_id;

For n = 3:

select t1.left_id, t4.right_id
from t t1 join
     t t2
     on t2.left_id = t1.right_id join
     t t3
     on t3.left_id = t2.right_id join
     t t4
     on t4.left_id = t3.right_id;

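Each increment of "n" adds one more join of t. Note that n here counts joins, so the query for n returns pairs that are n + 1 steps of the relation apart; the n = 2 query is the one that produces pairs such as 1 -> 4 from the question's example. These queries can readily take advantage of an index on the table (for example on left_id), so performance should be reasonable.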
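As a rough sketch of the dynamic SQL part, the join chain can be generated in a host language. The example below assumes Python with SQLite and the table name t used above; chained_join_sql and its joins parameter are hypothetical names introduced only for illustration.

```python
import sqlite3

def chained_join_sql(joins, table="t"):
    """Generate the chained self-join query with the given number of joins.

    `table` is interpolated as an identifier (it cannot be a bound
    parameter), so it must come from trusted code, never from user input.
    """
    if joins < 1:
        raise ValueError("joins must be at least 1")
    lines = [f"select t1.left_id, t{joins + 1}.right_id",
             f"from {table} t1"]
    for i in range(2, joins + 2):
        # Each extra alias continues the path from the previous alias.
        lines.append(f"join {table} t{i} on t{i}.left_id = t{i - 1}.right_id")
    return "\n".join(lines)

# Sample rows from the question's simplified example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (left_id INTEGER, right_id INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (7, 8)])

print(chained_join_sql(2))
print(sorted(conn.execute(chained_join_sql(2)).fetchall()))
```

Here chained_join_sql(2) yields a query equivalent to the two-join query shown above; on the six sample rows it returns (1, 4), (2, 5) and (3, 6).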

Whether this is better or worse than graph databases is somewhat moot. Each type of database has different strengths. Many relational databases do support recursive CTEs (and hence graph walking). However, the performance is probably going to be better on a software system specifically designed for this purpose.
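For databases that do support recursive CTEs, a minimal sketch could look like the following. It assumes SQLite's WITH RECURSIVE syntax (other databases differ slightly) and is run through Python's sqlite3 so it is self-contained; the CTE name walk, the depth column and the helper pairs_n_steps_apart are hypothetical. Here n counts steps of the relation, as in the question.

```python
import sqlite3

# Walk the relation one step per recursion level and keep only rows at depth n.
# The depth bound in the recursive term also guarantees termination on cycles.
WALK_SQL = """
WITH RECURSIVE walk(left_id, right_id, depth) AS (
    SELECT left_id, right_id, 1 FROM t
    UNION ALL
    SELECT w.left_id, t.right_id, w.depth + 1
    FROM walk AS w
    JOIN t ON t.left_id = w.right_id
    WHERE w.depth < :n
)
SELECT DISTINCT left_id, right_id
FROM walk
WHERE depth = :n
ORDER BY left_id, right_id
"""

def pairs_n_steps_apart(conn, n):
    """Return (left_id, right_id) pairs connected by exactly n steps of t."""
    return conn.execute(WALK_SQL, {"n": n}).fetchall()

# Sample rows from the question's simplified example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (left_id INTEGER, right_id INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (7, 8)])

print(pairs_n_steps_apart(conn, 3))
```

On the six sample rows this returns (1, 4), (2, 5) and (3, 6); pairs such as (4, 7) would additionally require a 6 -> 7 row in the table. Unlike the generated join chain, the query text stays the same for every n, at the cost of materialising all intermediate depths.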
