What is the best way to model hierarchical data in SQL?
I have relationship data in the form:
Parent ID ParentName ParentType RelatedToID RelatedToName RelatedType
----------------------------------------------------------------------------
1 A Business 2 B Individual
1 A Business 4 D Business
1 A Business 3 C Business
1 A Business 6 F Business
1 A Business 3 C Business
1 A Business 9 I Business
1 A Business 9 I Business
1 A Business 3 C Business
1 A Business 12 L Business
1 A Business 5 E Business
2 B Individual 1 A Business
2 B Individual 3 C Business
2 B Individual 3 C Business
2 B Individual 6 F Business
2 B Individual 3 C Business
2 B Individual 4 D Business
2 B Individual 4 D Business
3 C Business 1 A Business
3 C Business 1 A Business
3 C Business 2 B Individual
3 C Business 10 J Business
3 C Business 6 F Business
3 C Business 14 N Business
3 C Business 4 D Business
3 C Business 7 G Business
3 C Business 1 A Business
3 C Business 2 B Individual
4 D Business 2 B Individual
4 D Business 3 C Business
4 D Business 3 C Business
4 D Business 10 J Business
4 D Business 1 A Business
4 D Business 1 A Business
4 D Business 7 G Business
5 E Business 1 A Business
5 E Business 1 A Business
6 F Business 2 B Individual
6 F Business 1 A Business
6 F Business 3 C Business
6 F Business 3 C Business
6 F Business 1 A Business
7 G Business 3 C Business
7 G Business 4 D Business
7 G Business 3 C Business
7 G Business 3 C Business
8 H Individual 9 I Business
8 H Individual 9 I Business
9 I Business 1 A Business
9 I Business 8 H Individual
10 J Business 3 C Business
10 J Business 3 C Business
10 J Business 13 M Business
10 J Business 3 C Business
10 J Business 4 D Business
10 J Business 11 K Individual
11 K Individual 10 J Business
11 K Individual 13 M Business
11 K Individual 10 J Business
11 K Individual 13 M Business
12 L Business 1 A Business
13 M Business 11 K Individual
13 M Business 10 J Business
I'm using DiagrammeR to make a relationship chart based on this data. I need to transform this data in SQL to feed into Graphviz, i.e.:
I'm willing to go 5 levels into the relationship tree in my data prep step. Ultimately the example above should reduce to this:
ReadySet
A->B
A->D
A->C
A->F
A->I
A->L
A->E
B->C
B->F
B->D
C->J
C->F
C->N
C->D
C->G
D->J
D->G
H->I
J->M
J->K
K->M
which in Graphviz results in:
What I have tried and my difficulty:
I started by selecting the parents where ParentType = 'Individual', then used self joins to obtain the hierarchy at the row level.
What I want (and can't seem to do) is to produce a single SQL table that returns ReadySet when a user selects a Name contained anywhere within the breadth of the relationship tree, i.e. if A is selected then ReadySet, or if M is selected then ReadySet … There are obviously more parent IDs/names in my entire dataset.
You need a table to define the edges between the nodes and their direction.
create table edges (
  from_id bigint not null references nodes(id),
  to_id bigint not null references nodes(id),
  primary key (from_id, to_id)
);
"nodes" here is whatever table actually holds the data.
If the relationship is two way, we need two rows, one for A -> B and one for B -> A.
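As a rough illustration of filling such a table, the sketch below uses Python's built-in sqlite3 module with a handful of rows copied from the question's data. The `nodes` layout (`id`, `name`), the `raw_rows` sample, and the use of SQLite's `insert or ignore` to drop repeated rows are assumptions made for this example, not part of the original schema.

```python
import sqlite3

# Assumed sample: (parent_id, parent_name, related_id, related_name) rows
# taken from the question's data, including a repeated row.
raw_rows = [
    (1, "A", 2, "B"),
    (1, "A", 3, "C"),
    (1, "A", 3, "C"),  # duplicate in the source data
    (2, "B", 1, "A"),
    (3, "C", 1, "A"),
    (8, "H", 9, "I"),
    (9, "I", 8, "H"),
]

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")
conn.executescript("""
    create table nodes (id integer primary key, name text not null);
    create table edges (
        from_id integer not null references nodes(id),
        to_id   integer not null references nodes(id),
        primary key (from_id, to_id)
    );
""")

for pid, pname, rid, rname in raw_rows:
    # Insert both endpoints first so the foreign keys on edges are satisfied.
    conn.execute("insert or ignore into nodes values (?, ?)", (pid, pname))
    conn.execute("insert or ignore into nodes values (?, ?)", (rid, rname))
    # The primary key plus "or ignore" keeps one row per directed edge.
    conn.execute("insert or ignore into edges values (?, ?)", (pid, rid))

edges = conn.execute(
    "select from_id, to_id from edges order by from_id, to_id"
).fetchall()
print(edges)
```

In this sample each two-way relationship already appears once per direction, so no extra reverse rows need to be generated.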
Then do a recursive CTE to find all the matching rows.
with recursive ready_set as (
select *
from edges
where from_id = ?
union
select e.*
from edges e
inner join ready_set rs on e.from_id = rs.to_id
)
select *
from ready_set;
The first part of the union is the starting condition, and the second is the recursion which joins with the CTE.
For example, if we set up our edges like so:
1 <-> 2 <-> 3 -> 4
1 <-> 5 <- 6
insert into edges values
(1, 2), (2, 1),
(2, 3), (3, 2),
(3, 4),
(1, 5), (5, 1),
(6, 5);
Nodes 1, 2, 3, and 5 can all reach one another; 4 can be reached but has no outgoing edges, and 6 has only a one-way edge to 5, so nothing leads back to it. If we ask for 5, we'll get all the edges except 6, 5. If we ask for 4, we'll get nothing because 4 has no outgoing connections.
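To check this behaviour, here is a minimal sketch that runs the same recursive CTE against the example edges using Python's built-in sqlite3 module. The `ready_set` helper function and the added `order by` are conveniences for this example only, and the foreign keys are left out so the snippet does not need a `nodes` table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table edges (
        from_id integer not null,
        to_id   integer not null,
        primary key (from_id, to_id)
    );
    insert into edges values
        (1, 2), (2, 1),
        (2, 3), (3, 2),
        (3, 4),
        (1, 5), (5, 1),
        (6, 5);
""")

# Same query as above; "union" (not "union all") discards rows that were
# already found, which is what stops the recursion on cycles like 1 <-> 2.
READY_SET = """
    with recursive ready_set as (
        select * from edges where from_id = ?
        union
        select e.*
        from edges e
        inner join ready_set rs on e.from_id = rs.to_id
    )
    select from_id, to_id from ready_set order by from_id, to_id
"""

def ready_set(start_id):
    """Return every edge reachable by following edges out of start_id."""
    return conn.execute(READY_SET, (start_id,)).fetchall()

from_5 = ready_set(5)
from_4 = ready_set(4)
print(from_5)
print(from_4)
```

Starting from 5 the result holds seven of the eight edges (everything but 6, 5), and starting from 4 it is empty.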
And if we
select distinct to_id from ready_set;
starting at 5 we'll get just the node IDs 1, 2, 3, 4, and 5. No 6.
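The question mentions going only 5 levels deep and feeding the result to Graphviz. One possible way to do that, sketched here as an assumption rather than as part of the answer above, is to carry a `depth` column through the recursion and stop at a limit. The `max_depth` parameter and the `to_dot_edges` helper are hypothetical names, and the output uses node IDs rather than names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table edges (
        from_id integer not null,
        to_id   integer not null,
        primary key (from_id, to_id)
    );
    insert into edges values
        (1, 2), (2, 1), (2, 3), (3, 2), (3, 4), (1, 5), (5, 1), (6, 5);
""")

# The depth column makes otherwise identical rows distinct, so "union" alone
# no longer stops cycles; the depth limit is what ends the recursion.
DEPTH_LIMITED = """
    with recursive ready_set(from_id, to_id, depth) as (
        select from_id, to_id, 1 from edges where from_id = :start
        union
        select e.from_id, e.to_id, rs.depth + 1
        from edges e
        inner join ready_set rs on e.from_id = rs.to_id
        where rs.depth < :max_depth
    )
    select distinct from_id, to_id from ready_set order by from_id, to_id
"""

def ready_set(start_id, max_depth=5):
    """Edges reachable from start_id within max_depth hops."""
    params = {"start": start_id, "max_depth": max_depth}
    return conn.execute(DEPTH_LIMITED, params).fetchall()

def to_dot_edges(rows):
    """Format edge rows in the 'A->B' style used by ReadySet in the question."""
    return [f"{a}->{b}" for a, b in rows]

two_hops = to_dot_edges(ready_set(5, max_depth=2))
five_hops = to_dot_edges(ready_set(5))
print(two_hops)
print(five_hops)
```

Joining back to the `nodes` table would give names instead of IDs in the output.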