[英]Optimize Oracle query with Hints
I have this complex Oracle query.我有这个复杂的 Oracle 查询。
SELECT
r3.object_id
FROM
nc_references r
INNER JOIN nc_objects o ON r.object_id = o.object_id
AND o.object_class_id = 90000330 /* Work Item */
INNER JOIN nc_params p ON p.object_id = o.object_id
AND p.attr_id = 90100070 /* State */ AND p.list_value_id IN ( 90100071 /* Ready */, 90100072 /* Active */ )
INNER JOIN nc_po_actions poa ON poa.manual_task_id = o.object_id
INNER JOIN nc_po_tasks pot ON pot.task_id = poa.task_id
INNER JOIN nc_references r1 ON r1.object_id = pot.container_id
AND r1.attr_id = 9145923960313063683 /* Wave ID */
INNER JOIN nc_references r2 ON r2.object_id = pot.container_id
AND r2.attr_id = 9145685312013687931 /* Product OrderID */
INNER JOIN nc_references r3 ON r1.reference = r3.reference
AND r3.attr_id = 9145065302013613216 /* Project Wave */ AND r2.reference = r3.object_id
INNER JOIN nc_objects o2 ON r3.object_id = o2.object_id
WHERE
r.reference = 9155224548713314821
AND r.attr_id = 90100080 /* Assigned To */
AND o.object_type_id = 9146598858613093106
I tried some optimization:我尝试了一些优化:
SELECT
r3.object_id
FROM
nc_references src
INNER JOIN nc_objects o ON o.object_type_id = 9146598858613093106
AND o.object_class_id = 90000330 /* Work Item */
AND src.object_id = o.object_id
AND src.reference = 9155224548713314821
AND src.attr_id = 90100080 /* Assigned To */
INNER JOIN nc_params p ON ( p.list_value_id = 90100071 /* Ready */ OR p.list_value_id = 90100072 /* Active */ )
AND p.object_id = o.object_id
AND p.attr_id = 90100070 /* State */
INNER JOIN nc_po_actions poa ON poa.manual_task_id = o.object_id
INNER JOIN nc_po_tasks pot ON pot.task_id = poa.task_id
INNER JOIN nc_references r1 ON r1.object_id = pot.container_id
AND r1.attr_id = 9145923960313063683 /* Wave ID */
INNER JOIN nc_references r2 ON r2.object_id = pot.container_id
AND r2.attr_id = 9145685312013687931 /* Product OrderID */
INNER JOIN nc_references r3 ON r2.reference = r3.object_id
AND r3.attr_id = 9145065302013613216 /* Project Wave */
AND r1.reference = r3.reference;
But I get more than 40+ seconds of execution.但是我得到了 40 多秒的执行时间。 Looks like because list of
tasks
is quite huge.看起来是因为
tasks
列表非常庞大。
Do you know how I can add Hints into this query in order to improve the performance?您知道如何将提示添加到此查询中以提高性能吗?
Edit : Explain plan编辑:解释计划
SELECT /*+ gather_plan_statistics MIDR03*/ r3.object_id
FROM
nc_references src
INNER JOIN nc_objects o ON o.object_type_id = 9146598858613093106
AND o.object_class_id = 90000330 /* Work Item */ AND src.object_id = o.object_id
AND src.reference = 9155224548713314821
AND src.attr_id = 90100080 /* Assigned To */ INNER JOIN nc_params p ON ( p.list_value_id = 90100071 /* Ready */ OR p.list_value_id =
90100072
/* Active */ )
AND p.object_id = o.object_id
AND p.attr_id = 90100070 /*
State */ INNER JOIN nc_po_actions poa ON poa.manual_task_id = o.object_id
INNER JOIN nc_po_tasks pot ON pot.task_id = poa.task_id
INNER JOIN nc_references r1 ON r1.object_id = pot.container_id
AND r1.attr_id = 9145923960313063683 /* Wave [d:1|] */ INNER JOIN nc_references r
Plan hash value: 1629775924
----------------------------------------------------------------------------------------------------------------------------------------------
| [d:1|] |Lvl| Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
----------------------------------------------------------------------------------------------------------------------------------------------
| 0 | 0|.SELECT STATEMENT | | 1 | | 0 |00:00:43.23 | 441K| 48190 |
| 1 | 1|..NESTED LOOPS | | 1 | 1 | 0 |00:00:43.23 | 441K| 48190 |
| 2 | 2|...NESTED LOOPS | | 1 | 1 | 0 |00:00:43.23 | 441K| 48190 |
| 3 | 3|....NESTED LOOPS | | 1 | 1 | 0 |00:00:43.23 | 441K| 48190 |
| 4 | 4|.....NESTED LOOPS | | 1 | 1 | 0 |00:00:43.23 | 441K| 48190 |
| 5 | 5|......NESTED LOOPS | | 1 | 1 | 0 |00:00:43.23 | 441K| 48190 |
| 6 | 6|.......NESTED LOOPS | | 1 | 1 | 0 |00:00:43.23 | 441K| 48190 |
| 7 | 7|........NESTED LOOPS | | 1 | 1 | 23315 |00:00:26.82 | 355K| 31538 |
| 8 | 8|.........TABLE ACCESS BY INDEX ROWID BATCHED| NC_REFERENCES | 1 | 1 | 80719 |00:00:18.92 | 40952 | 17331 |
|* 9 | 9|..........INDEX RANGE SCAN | XIF01NC_REFERENCES | 1 | 1 | 80719 |00:00:00.18 | 337 | 0 |
|* 10 | 8|.........TABLE ACCESS BY INDEX ROWID | NC_OBJECTS | 80719 | 1 | 23315 |00:00:11.79 | 314K| 14207 |
|* 11 | 9|..........INDEX UNIQUE SCAN | XPKNC_OBJECTS | 80719 | 1 | 80719 |00:00:01.55 | 233K| 813 |
|* 12 | 7|........TABLE ACCESS BY INDEX ROWID BATCHED | NC_PARAMS | 23315 | 1 | 0 |00:00:12.94 | 85919 | 16652 |
|* 13 | 8|.........INDEX RANGE SCAN | XIF12NC_PARAMS | 23315 | 1 | 23315 |00:00:03.40 | 70073 | 4638 |
| 14 | 6|.......TABLE ACCESS BY INDEX ROWID BATCHED | NC_PO_ACTIONS | 0 | 1 | 0 |00:00:00.01 | 0 | 0 |
|* 15 | 7|........INDEX RANGE SCAN | NC_PO_ACTIONS_IX_MANUAL_TASK | 0 | 1 | 0 |00:00:00.01 | 0 | 0 |
| 16 | 5|......TABLE ACCESS BY INDEX ROWID | NC_PO_TASKS | 0 | 1 | 0 |00:00:00.01 | 0 | 0 |
|* 17 | 6|.......INDEX UNIQUE SCAN | XPKNC_PO_TASKS | 0 | 1 | 0 |00:00:00.01 | 0 | 0 |
|* 18 | 4|.....INDEX RANGE SCAN | XIF02NC_REFERENCES | 0 | 1 | 0 |00:00:00.01 | 0 | 0 |
|* 19 | 3|....INDEX RANGE SCAN | XIF02NC_REFERENCES | 0 | 1 | 0 |00:00:00.01 | 0 | 0 |
|* 20 | 2|...INDEX RANGE SCAN | XIF02NC_REFERENCES | 0 | 1 | 0 |00:00:00.01 | 0 | 0 |
----------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation [d:1|]):
---------------------------------------------------
9 - access("SRC"."REFERENCE"=9155224548713314821 AND "SRC"."ATTR_ID"=90100080)
10 - filter(("O"."OBJECT_TYPE_ID"=9146598858613093106 AND "O"."OBJECT_CLASS_ID"=90000330))
11 - access("SRC"."OBJECT_ID"="O"."OBJECT_ID")
12 - filter(("P"."LIST_VALUE_ID"=90100071 OR "P"."LIST_VALUE_ID"=90100072))
13 - access("P"."OBJECT_ID"="O"."OBJECT_ID" AND "P"."ATTR_ID"=90100070)
15 - access("POA"."MANUAL_TASK_ID"="O"."OBJECT_ID")
17 - access("POT"."TASK_ID"="POA"."TASK_ID")
18 - access("R1"."OBJECT_ID"="POT"."CONTAINER_ID" AND "R1"."ATTR_ID"=9145923960313063683)
19 - access("R2"."OBJECT_ID"="POT"."CONTAINER_ID" AND "R2"."ATTR_ID"=9145685312013687931)
20 - access("R2"."REFERENCE"="R3"."OBJECT_ID" AND "R3"."ATTR_ID"=9145065302013613216 AND "R1"."REFERENCE"="R3"."REFERENCE")
Firstly, make sure the statistics are up to date, so the optimiser has the best chance of generating a good execution plan.首先,确保统计数据是最新的,这样优化器就有最好的机会生成一个好的执行计划。 For the scan of XIF01NC_REFERENCES it estimated 1 row but it got 80719, which suggests that it thinks the table is empty or very small when in fact it is not.
对于 XIF01NC_REFERENCES 的扫描,它估计有 1 行,但它得到了 80719,这表明它认为该表是空的或非常小,而实际上它不是。 Nested-loops plans often don't scale well for large volumes, so using these due to a stats underestimate, when a hash join would work better, is a common problem.
嵌套循环计划通常不能很好地适应大容量,因此由于统计数据低估而使用这些计划,而 hash 加入会更好,这是一个常见问题。
What proportion is 80719 of the whole table - is that most of the rows, or only a few of the rows? 80719 占整个表的比例是多少 - 是大部分行,还是只有几行? A nested loop usually works best when it can pick a small subset of the table, while if you need most of the rows a full scan would usually be more efficient.
嵌套循环通常在可以选择表的一小部分时效果最好,而如果您需要大部分行,则完整扫描通常会更有效。
I would also lay out the query neatly.我还会整齐地布置查询。 Opinions vary so it's up to you, but I prefer to put only join predicates in a
join
clause (it makes no difference to the optimiser for inner joins) and move filter predicates to the where
clause.意见各不相同,所以这取决于你,但我更喜欢只将连接谓词放在
join
子句中(这对内部连接的优化器没有影响)并将过滤谓词移动到where
子句。 Also the inner
and outer
keywords are just clutter so I prefer not to use them, and uppercase code belongs in 1970s COBOL, so I get this:此外,
inner
和outer
关键字太混乱了,所以我不想使用它们,大写代码属于 1970 年代的 COBOL,所以我得到了:
select r3.object_id
from nc_references r
join nc_objects o on o.object_id = r.object_id
join nc_params p on p.object_id = o.object_id
join nc_po_actions poa on poa.manual_task_id = o.object_id
join nc_po_tasks pot on pot.task_id = poa.task_id
join nc_references r1 on r1.object_id = pot.container_id
join nc_references r3 on r3.reference = r1.reference
join nc_objects o2 on o2.object_id = r3.object_id
join nc_references r2 on r2.object_id = pot.container_id
and r2.reference = r3.object_id
where r.reference = 9155224548713314821
and r.attr_id = 90100080 --Assigned To
and o.object_type_id = 9146598858613093106
and o.object_class_id = 90000330 --Work Item
and r1.attr_id = 9145923960313063683 --Wave ID
and r2.attr_id = 9145685312013687931 --Product OrderID
and r3.attr_id = 9145065302013613216 --Project Wave
and p.attr_id = 90100070 --State
and p.list_value_id in (90100071, 90100072) --Ready,Active
Now for the hints (assuming updating the stats didn't fix the execution plan).现在的提示(假设更新统计数据没有修复执行计划)。 There are a lot of possible join orders and I don't know which is best so I am not specifying one, but you could do that with the
leading
hint, for example有很多可能的加入顺序,我不知道哪个最好,所以我没有指定一个,但是你可以用
leading
提示来做到这一点,例如
/*+ leading(r o p) */
would tell the optimiser to only consider plans where the join order began r
then o
then p
, and let it decide the rest for itself (though see the note below about hash joins).会告诉优化器只考虑加入顺序开始的计划
r
然后o
然后p
,并让它自己决定 rest (尽管请参阅下面关于 hash 加入的注释)。
To force a full scan, use the full
hint, for example:要强制进行全面扫描,请使用
full
提示,例如:
/*+ full(o) */
To force one or more hash joins, use the use_hash
hint, which can be applied to multiple sources, so要强制一个或多个 hash 连接,请使用
use_hash
提示,它可以应用于多个源,因此
/*+ use_hash(o p) */
is the same as是相同的
/*+ use_hash(o) use_hash(p) */
A hash join consists of "build" and "probe" operations, and the optimiser has the freedom to chose which table to use for each one, even if you specified a join order with leading
, so you may still need to consider no_swap_join_inputs
/ swap_join_inputs
hints if you want a specific order. hash 连接由“构建”和“探测”操作组成,优化器可以自由选择为每个操作使用哪个表,即使您指定了带
leading
的连接顺序,因此您可能仍需要考虑no_swap_join_inputs
/ swap_join_inputs
如果您需要特定订单,请提供提示。 The best description of hinting hash joins that I've read is by Jonathan Lewis: Hash Joins .我读过的关于提示 hash 连接的最佳描述是 Jonathan Lewis: Hash Joins 。
Now for your query, if you want to try full scans and hash joins everywhere instead of indexes and nested loops, without specifying any order, you might try:现在对于你的查询,如果你想尝试完全扫描和 hash 连接而不是索引和嵌套循环,而不指定任何顺序,你可以尝试:
/*+
use_hash(r o p poa pot r1 r3 o2 r2)
full(r) full(o) full(p) full(poa) full(pot) full(r1) full(r3) full(o2) full(r2)
*/
However,然而,
If the stats are wrong then the optimiser might still choose an inefficient join order, so you may need to look at leading
and possibly no_swap_join_inputs
hints.如果统计数据错误,那么优化器可能仍会选择低效的连接顺序,因此您可能需要查看
leading
提示和可能no_swap_join_inputs
提示。
Maybe some combination of full scans and hash joins will work best.也许全扫描和 hash 连接的某种组合效果最好。 With 9 tables there are a lot of possibilities.
9 张桌子有很多可能性。
Hints are great for experimenting, when you suspect the optimiser has missed something, and sometimes in production code when for some reason it's not getting a good plan, but ideally you should fix the issue with your statistics so that the optimiser can do its job without hints.提示对于实验非常有用,当您怀疑优化器遗漏了某些东西时,有时在生产代码中由于某种原因它没有得到一个好的计划,但理想情况下您应该用您的统计信息解决问题,以便优化器可以在没有的情况下完成它的工作提示。
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.