Optimal join in two MySQL tables
I have a table (T1) with ca. 500000 non-duplicate records:
ID1 Relation ID2
4 Rel4 13
5 Rel5 4
13 Rel13 16
16 Rel16 5
I have the properties table T1_Prop:
ID Entity
4 Ent4
5 Ent5
13 Ent13
16 Ent16
I want to join these two tables (based on id: 4) in an efficient way as follows:
Entity Relation Entity
Ent4 Rel4 Ent13
Ent5 Rel5 Ent4
I designed this select statement including JOIN, which works fine. However, I am not sure if this is the best way to do it:
select
a.entity,
r.relation,
b.entity
from T1 as r
INNER JOIN T1_Prop as a ON a.ID=r.ID1 AND (r.ID1=4 OR r.ID2=4)
INNER JOIN T1_Prop as b ON b.ID=r.ID2;
This is a fine use of SQL. It's built for this kind of query.
You'll need two covering indexes on T1 to speed this up. They are:
(ID1, ID2, relation)
and
(ID2, ID1, relation)
The two indexes are for handling the OR clause. It is the only potential performance issue I see, and that's just because OR operations sometimes trick the query planner into doing too much table scanning.
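As a sketch, the two covering indexes described above could be created like this. The index names are made up for illustration, and the statements assume Relation is a short column such as VARCHAR (a TEXT column would need a prefix length).

```sql
-- Hypothetical index names; column names follow the question's T1 table.
-- Intended for the r.ID1 = 4 branch of the OR.
CREATE INDEX idx_t1_id1_id2_rel ON T1 (ID1, ID2, Relation);
-- Intended for the r.ID2 = 4 branch of the OR.
CREATE INDEX idx_t1_id2_id1_rel ON T1 (ID2, ID1, Relation);
```

Because each index holds all three columns the query reads from T1, the T1 side of the join can usually be answered from the index alone. This also assumes T1_Prop.ID is already indexed, for example as the primary key.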
Try refactoring your query to this, to make your selection of ID values more visible.
select a.entity, r.relation, b.entity
from T1 as r
INNER JOIN T1_Prop as a ON a.ID=r.ID1
INNER JOIN T1_Prop as b ON b.ID=r.ID2
WHERE (r.ID1=4 OR r.ID2=4)
Then, if you still have trouble with performance after you create the covering indexes, refactor it again to
select a.entity, r.relation, b.entity
from T1 as r
INNER JOIN T1_Prop as a ON a.ID=r.ID1
INNER JOIN T1_Prop as b ON b.ID=r.ID2
WHERE r.ID1=4
UNION
select a.entity, r.relation, b.entity
from T1 as r
INNER JOIN T1_Prop as a ON a.ID=r.ID1
INNER JOIN T1_Prop as b ON b.ID=r.ID2
WHERE r.ID2=4
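To judge whether a rewrite is worth it, the plans can be compared with EXPLAIN. A minimal sketch for the OR version, using the same table and column names as above:

```sql
-- Prefix the query with EXPLAIN to see the plan instead of the rows.
EXPLAIN
SELECT a.entity, r.relation, b.entity
FROM T1 AS r
INNER JOIN T1_Prop AS a ON a.ID = r.ID1
INNER JOIN T1_Prop AS b ON b.ID = r.ID2
WHERE (r.ID1 = 4 OR r.ID2 = 4);
```

In the output row for r, a type of ALL means a full scan of T1. A type such as ref or index_merge, with one of the covering indexes named in the key column, suggests the indexes are being used. Running EXPLAIN on the UNION version the same way shows whether each half picks its own index.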
Your query looks fine except for the first ON clause. The condition (r.ID1=4 OR r.ID2=4) is not a rule for which record from T1_Prop to join to the T1 record. Rather, it is a condition on which T1 records to consider, and hence belongs in the WHERE clause.
select
a.entity AS entity1,
r.relation,
b.entity AS entity2
FROM t1 AS r
INNER JOIN t1_prop AS a ON a.id = r.id1
INNER JOIN t1_prop AS b ON b.id = r.id2
WHERE r.id1 = 4 OR r.id2 = 4;
This won't change the execution plan; the DBMS will execute this just the same. But it's more readable as it shows the actual intention: get relations where one of the IDs is 4 and join the entities to those relations.
Another option to show this intention is:
select
a.entity AS entity1,
r.relation,
b.entity AS entity2
FROM (SELECT * FROM t1 WHERE id1 = 4 OR id2 = 4) AS r
INNER JOIN t1_prop AS a ON a.id = r.id1
INNER JOIN t1_prop AS b ON b.id = r.id2;
Some consider subqueries in FROM less readable, but, well, others don't. And when queries get more complex and, say, you even deal with aggregates from different tables, this is often the way to go to build a clean query.
Neither of the above queries is actually better or worse than the other.