
Slow inner join in Oracle

I have an Oracle database with a main table containing 9,000,000 rows and a second table with 19,000,000 rows.

When I run:

SELECT *
FROM main m
INNER JOIN second s ON m.id = s.fk_id AND s.cd = 'E' AND s.line = 1

It takes 45 seconds to get the first part of the result, even with all the indexes below:

CREATE INDEX IDX_1 ON SECOND (LINE, CD, FK_ID, ID);
CREATE INDEX IDX_2 ON SECOND (LINE, CD);
-- MAIN (ID) is the primary key

Any idea how to make it faster? I tried several indexes and rebuilt them, but it always takes 45 seconds.

Here is the execution plan:

------------------------------------------------------------------------------------------------
| Id  | Operation            | Name                 | Rows    | Bytes      | Cost   | Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                      | 8850631 | 2133002071 | 696494 | 00:00:28 |
| * 1 |   HASH JOIN          |                      | 8850631 | 2133002071 | 696494 | 00:00:28 |
| * 2 |    TABLE ACCESS FULL | SECOND               | 8850631 |  646096063 | 143512 | 00:00:06 |
|   3 |    TABLE ACCESS FULL | MAIN                 | 9227624 | 1550240832 | 153363 | 00:00:06 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 1 - access("M"."ID"="S"."FK_ID")
* 2 - filter("S"."CD"='D' AND "S"."LINE"=1)

Thanks

If you want to see the first row quickly, you have to enable Oracle to use a NESTED LOOPS join.

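This requires an index on second covering the two columns you constrain in your query, and an index on main on the join column id. If id is already the primary key of main, as in the question, the primary key index serves this purpose and the second statement below is unnecessary.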

create index second_idx on second(line,cd);
create index main_idx on main(id);

You'll see an execution plan similar to the one below:

--------------------------------------------------------------------------------------------
| Id  | Operation                     | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |            |    87 |  8178 |   178   (0)| 00:00:03 |
|   1 |  NESTED LOOPS                 |            |       |       |            |          |
|   2 |   NESTED LOOPS                |            |    87 |  8178 |   178   (0)| 00:00:03 |
|   3 |    TABLE ACCESS BY INDEX ROWID| SECOND     |    87 |  2523 |     4   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | SECOND_IDX |     1 |       |     3   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN           | MAIN_IDX   |     1 |       |     1   (0)| 00:00:01 |
|   6 |   TABLE ACCESS BY INDEX ROWID | MAIN       |     1 |    65 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   4 - access("S"."LINE"=1 AND "S"."CD"='E')
   5 - access("M"."ID"="S"."FK_ID")

You will access via index all rows in second with the requested line and cd (plan lines 4 and 3), and for each such row you'll access the main table via index (plan lines 5 and 6).
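If the optimizer does not choose this plan on its own, one way to check whether it helps is to ask for it explicitly. The following is a sketch, assuming the index name second_idx from above; the hints are meant for experimenting, not as a permanent fix.

```sql
-- Ask the optimizer to favor fast delivery of the first 10 rows,
-- then display the plan it chose.
EXPLAIN PLAN FOR
SELECT /*+ FIRST_ROWS(10) */ *
FROM main m
INNER JOIN second s ON m.id = s.fk_id AND s.cd = 'E' AND s.line = 1;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- Alternatively, request the join order and method directly:
-- start from second via its index, then probe main with nested loops.
SELECT /*+ LEADING(s) USE_NL(m) INDEX(s second_idx) */ *
FROM main m
INNER JOIN second s ON m.id = s.fk_id AND s.cd = 'E' AND s.line = 1;
```

If the plan shown by DBMS_XPLAN still contains a HASH JOIN with full table scans, the optimizer considered the hint unusable or the indexes are not available as expected.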

This provides instant access to the first few rows and works fine if only a small number of rows in second match the selected line and cd. Otherwise (when there is a large number of rows with s.cd = 'E' AND s.line = 1, say 10k+) you will still see the first result rows quickly, but you'll wait ages to see the last row (it will take much more than the 45 seconds to finish the query).
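To see which case applies, it may help to count the matching rows first. This is a simple check using only the table and columns from the question:

```sql
-- How many rows in second satisfy the filter?
-- A small count favors nested loops; a large one favors a hash join.
SELECT COUNT(*) AS matching_rows
FROM second s
WHERE s.cd = 'E' AND s.line = 1;
```

Note that the plan posted in the question estimates roughly 8.8 million rows from second after the filter, which, if accurate, puts this query well into the "large number of rows" case.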

If this is a problem, you have to use a HASH JOIN (which you probably do now).

A hash join typically does not use indexes and produces the following execution plan:

-----------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        | 10182 |  1153K|   908   (1)| 00:00:11 |
|*  1 |  HASH JOIN         |        | 10182 |  1153K|   908   (1)| 00:00:11 |
|*  2 |   TABLE ACCESS FULL| SECOND | 10182 |    99K|   520   (2)| 00:00:07 |
|   3 |   TABLE ACCESS FULL| MAIN   | 90000 |  9316K|   387   (1)| 00:00:05 |
-----------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - access("M"."ID"="S"."FK_ID")
   2 - filter("S"."LINE"=1 AND "S"."CD"='E')
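For comparison, the hash join plan can also be requested with hints. Again a sketch for experimenting only; normally the optimizer should pick this plan by itself when many rows qualify.

```sql
-- Request full scans of both tables and a hash join,
-- with second listed first in the join order.
SELECT /*+ LEADING(s) USE_HASH(m) FULL(s) FULL(m) */ *
FROM main m
INNER JOIN second s ON m.id = s.fk_id AND s.cd = 'E' AND s.line = 1;
```

Timing this variant against the nested loops variant, both for the first rows and for the full result, shows which trade-off suits your use case.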

Summary

To use nested loops, the indexes must be available as described above.

The switch between nested loops and hash join is done by the Oracle database (CBO), provided that your table statistics and database configuration are fine.
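If you suspect stale statistics, refreshing them is a reasonable first step. A minimal sketch, assuming both tables live in the current schema (hence ownname => USER):

```sql
-- Refresh optimizer statistics for both tables and their indexes.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'MAIN',   cascade => TRUE);
  DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'SECOND', cascade => TRUE);
END;
/

-- Optional, session-level: tell the optimizer that returning the
-- first 10 rows quickly matters more than total throughput.
ALTER SESSION SET optimizer_mode = FIRST_ROWS_10;
```

After this, rerun the query without hints and check which join method the optimizer selects.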
