How should I index tables to optimize this Oracle SELECT query?

Question

I've got the following query in Oracle 10g:

select * 
  from DATA_TABLE DT, 
       LOOKUP_TABLE_A LTA, 
       LOOKUP_TABLE_B LTB
 where DT.COL_A = LTA.COL_A (+) 
   and DT.COL_B = LTA.COL_B (+) 
   and LTA.COL_C = LTB.COL_C
   and LTA.COL_B = LTB.COL_B
   and ( DT.REF_TXT = :refTxt or DT.ALT_REF_TXT = :refTxt )
   and DT.CREATED_DATE between :startDate and :endDate

I was wondering whether you have any hints for optimising the query.

Currently I've got the following indices:

IDX1 on DATA_TABLE (REF_TXT, CREATED_DATE)
IDX2 on DATA_TABLE (ALT_REF_TXT, CREATED_DATE)
LOOKUP_A_PK on LOOKUP_TABLE_A (COL_A, COL_B)
LOOKUP_A_IDX1 on LOOKUP_TABLE_A (COL_C, COL_B)
LOOKUP_B_PK on LOOKUP_TABLE_B (COL_C, COL_B)

Note: the LOOKUP tables are very small (<200 rows).

EDIT:

Explain plan:

Query Plan
SELECT STATEMENT   Cost = 8
  FILTER
    NESTED LOOPS
      NESTED LOOPS
        TABLE ACCESS BY INDEX ROWID DATA_TABLE
          BITMAP CONVERSION TO ROWIDS
            BITMAP OR
              BITMAP CONVERSION FROM ROWIDS
                SORT ORDER BY
                  INDEX RANGE SCAN IDX1
              BITMAP CONVERSION FROM ROWIDS
                SORT ORDER BY
                  INDEX RANGE SCAN IDX2
        TABLE ACCESS BY INDEX ROWID LOOKUP_TABLE_A
          INDEX UNIQUE SCAN LOOKUP_A_PK
      TABLE ACCESS BY INDEX ROWID LOOKUP_TABLE_B
        INDEX UNIQUE SCAN LOOKUP_B_PK

EDIT2:

The data looks like this:

There will be tens of thousands of distinct REF_TXT values, with tens to hundreds of CREATED_DATE values for each. ALT_REF_TXT will mostly be NULL, but there will be hundreds to thousands of rows in which it differs from REF_TXT.

EDIT3: Fixed what ALT_REF_TXT actually contains.

Answer 1

The execution plan you're currently getting looks pretty good. There's no obvious improvement to be made.

As others have noted, you have some outer join indicators, but you then essentially cancel the outer join by requiring equality, without (+), between columns of the two lookup tables. As you can see from the execution plan, no outer join is happening. If you don't want an outer join, remove the (+) operators; they're just confusing the issue. If you do want an outer join, rewrite the query as shown by @Dems.
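For illustration, here is a sketch of the question's query with the (+) operators removed. Because the conditions between LOOKUP_TABLE_A and LOOKUP_TABLE_B already discard any DATA_TABLE row that has no LOOKUP_TABLE_A match, this should return the same rows as the original:

```sql
-- Same tables, columns, and bind variables as in the question;
-- only the (+) markers on the first two join conditions are dropped.
select *
  from DATA_TABLE DT,
       LOOKUP_TABLE_A LTA,
       LOOKUP_TABLE_B LTB
 where DT.COL_A = LTA.COL_A
   and DT.COL_B = LTA.COL_B
   and LTA.COL_C = LTB.COL_C
   and LTA.COL_B = LTB.COL_B
   and ( DT.REF_TXT = :refTxt or DT.ALT_REF_TXT = :refTxt )
   and DT.CREATED_DATE between :startDate and :endDate
```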

If you're unhappy with the current performance, I would suggest running the query with the gather_plan_statistics hint, then using DBMS_XPLAN.DISPLAY_CURSOR(?,?,'ALLSTATS LAST') to view the actual execution statistics. This will show the elapsed time attributed to each step in the execution plan.
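A minimal sketch of that workflow in SQL*Plus, assuming the session is allowed to read the V$ views that DBMS_XPLAN.DISPLAY_CURSOR relies on. Passing NULL for the first two arguments targets the last statement executed in the session, which is why serveroutput is switched off first. The bind values and the date format are placeholders, and the dates are passed as strings because SQL*Plus has no DATE bind type:

```sql
-- Placeholder bind values; substitute ones that reproduce the slow case.
set serveroutput off

variable refTxt    varchar2(100)
variable startDate varchar2(10)
variable endDate   varchar2(10)
exec :refTxt    := 'SOME_REF'
exec :startDate := '2011-01-01'
exec :endDate   := '2011-12-31'

-- The query from the question, with the hint added.
select /*+ gather_plan_statistics */ *
  from DATA_TABLE DT,
       LOOKUP_TABLE_A LTA,
       LOOKUP_TABLE_B LTB
 where DT.COL_A = LTA.COL_A (+)
   and DT.COL_B = LTA.COL_B (+)
   and LTA.COL_C = LTB.COL_C
   and LTA.COL_B = LTB.COL_B
   and ( DT.REF_TXT = :refTxt or DT.ALT_REF_TXT = :refTxt )
   and DT.CREATED_DATE between to_date(:startDate, 'YYYY-MM-DD')
                           and to_date(:endDate,   'YYYY-MM-DD');

-- NULL, NULL means "the last statement executed in this session".
select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
```

In the output, compare the estimated rows (E-Rows) with the actual rows (A-Rows) for each step, and look at A-Time to see where the time goes.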

You might get some benefit from converting one or both of the lookup tables into index-organized tables.
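As a sketch of what that could look like for LOOKUP_TABLE_B: an index-organized table stores the rows inside the primary key index itself, so the TABLE ACCESS BY INDEX ROWID step after the unique scan is no longer needed. The new table name, the column types, and the DESCRIPTION column are assumptions, since the question does not show the table definitions:

```sql
-- Hypothetical definition: types and the DESCRIPTION column are illustrative only.
create table LOOKUP_TABLE_B_IOT (
  COL_C        varchar2(30)  not null,
  COL_B        varchar2(30)  not null,
  DESCRIPTION  varchar2(200),
  constraint LOOKUP_B_IOT_PK primary key (COL_C, COL_B)
) organization index;

-- Copy the existing rows across (assumes the same columns exist in LOOKUP_TABLE_B).
insert into LOOKUP_TABLE_B_IOT (COL_C, COL_B, DESCRIPTION)
select COL_C, COL_B, DESCRIPTION
  from LOOKUP_TABLE_B;
```

With fewer than 200 rows per lookup table the saving is at most one block visit per lookup, so it is worth measuring before committing to the change.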

Answer 2

Your two index range scans on IDX1 and IDX2 will each produce at most 100 rows, so your BITMAP CONVERSION TO ROWIDS will produce at most 200 rows. From there on, it's only indexed access by rowid, leading to a likely sub-second execution. So are you really experiencing performance problems? If so, how long does it take exactly?

If you are experiencing performance problems, then please follow Dave Costa's advice and get the real plan, because in that case it's likely that a different plan is being used at runtime, possibly due to certain bind variable values or different optimizer environment settings.

Regards,
Rob.

Answer 3

This is one of those cases where it makes very little sense to try to optimize the DBMS performance without knowing what your data means.

Do you have many, many distinct CREATED_DATE values and a few rows in your DT for each date? If so, you want an index on CREATED_DATE, as it will be the primary way for the DBMS to reject rows it doesn't want to process.

On the other hand, do you have only a handful of dates, and many distinct values of REF_TXT or ALT_REF_TXT? In that case you probably have the correct compound index choices.

The presence of OR in your query complicates things greatly, and throws most guesswork out the window. You must look at EXPLAIN PLAN to see what's going on.

If you have tens of millions of distinct REF_TXT and ALT_REF_TXT values, you may want to consider denormalizing this schema.

Edit. Thanks for the additional info. Your explain plan contains no smoking guns that I can see. Here are some things to try next if you're not happy with performance yet.

Flip the order of the columns in your compound indexes on your data table. Maybe that will get you simpler index range scans instead of all the bitmap monkey business.
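A sketch of the flipped indexes, with hypothetical index names. Note that with CREATED_DATE leading, a BETWEEN predicate makes the range scan cover every entry in the date range and filter on the text column within it, so whether this beats the existing indexes depends on how selective the date range is. Compare the plans before dropping IDX1 and IDX2:

```sql
-- Index names are illustrative; the columns are those of IDX1 and IDX2, reversed.
create index IDX1_FLIPPED on DATA_TABLE (CREATED_DATE, REF_TXT);
create index IDX2_FLIPPED on DATA_TABLE (CREATED_DATE, ALT_REF_TXT);
```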

Exchange your SELECT * for the names of the columns you actually need in the query resultset. That's good programming practice in any case, and it MAY allow the optimizer to avoid some work.
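For example, a version with an explicit select list might look like the sketch below. The columns chosen are only those named in the question; which ones the application actually needs is an assumption:

```sql
-- Illustrative select list; replace with the columns the caller really uses.
select DT.REF_TXT,
       DT.ALT_REF_TXT,
       DT.CREATED_DATE,
       LTA.COL_C,
       LTB.COL_B
  from DATA_TABLE DT,
       LOOKUP_TABLE_A LTA,
       LOOKUP_TABLE_B LTB
 where DT.COL_A = LTA.COL_A (+)
   and DT.COL_B = LTA.COL_B (+)
   and LTA.COL_C = LTB.COL_C
   and LTA.COL_B = LTB.COL_B
   and ( DT.REF_TXT = :refTxt or DT.ALT_REF_TXT = :refTxt )
   and DT.CREATED_DATE between :startDate and :endDate
```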

If things are still too slow, try recasting this as a UNION of two queries rather than using OR. That MAY allow the alt_ref_txt part of your query, which is made a little more complex by all the NULL values in that column, to be optimized separately.

Answer 4

This may be the query you want, using a more up-to-date syntax.

(And without inner joins breaking outer joins)

select
  * 
from
  DATA_TABLE DT
left outer join
  (
    LOOKUP_TABLE_A LTA
  inner join
    LOOKUP_TABLE_B LTB
      on  LTA.COL_C = LTB.COL_C
      and LTA.COL_B = LTB.COL_B
  )
    on  DT.COL_A = LTA.COL_A
    and DT.COL_B = LTA.COL_B
where
   ( DT.REF_TXT = :refTxt or DT.ALT_REF_TXT = :refTxt )
   and DT.CREATED_DATE between :startDate and :endDate

The indexes I'd have are...

LOOKUP_TABLE_A (COL_A, COL_B)
LOOKUP_TABLE_B (COL_B, COL_C)
DATA_TABLE (REF_TXT, CREATED_DATE)
DATA_TABLE (ALT_REF_TXT, CREATED_DATE)

Note: The first condition in the WHERE clause above contains an OR that will likely defeat the use of the indexes. In such cases I have seen performance benefits from UNIONing two queries together...

  <your query>
where
   DT.REF_TXT = :refTxt
   and DT.CREATED_DATE between :startDate and :endDate

UNION

  <your query>
where
   DT.ALT_REF_TXT = :refTxt
   and DT.CREATED_DATE between :startDate and :endDate
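Written out in full, the UNION form could look like the sketch below. It reuses the join from the rewritten query above and swaps SELECT * for an explicit column list (an assumption about which columns are needed). UNION removes duplicate result rows, so a row matching both branches is returned once, but two different DATA_TABLE rows with identical selected values would also collapse into one; including a unique key of DATA_TABLE in the select list, if one exists, avoids that:

```sql
select DT.REF_TXT, DT.ALT_REF_TXT, DT.CREATED_DATE, LTA.COL_C
  from DATA_TABLE DT
  left outer join
       ( LOOKUP_TABLE_A LTA
         inner join LOOKUP_TABLE_B LTB
            on  LTA.COL_C = LTB.COL_C
            and LTA.COL_B = LTB.COL_B )
    on  DT.COL_A = LTA.COL_A
    and DT.COL_B = LTA.COL_B
 where DT.REF_TXT = :refTxt              -- branch 1: can use the (REF_TXT, CREATED_DATE) index
   and DT.CREATED_DATE between :startDate and :endDate
union
select DT.REF_TXT, DT.ALT_REF_TXT, DT.CREATED_DATE, LTA.COL_C
  from DATA_TABLE DT
  left outer join
       ( LOOKUP_TABLE_A LTA
         inner join LOOKUP_TABLE_B LTB
            on  LTA.COL_C = LTB.COL_C
            and LTA.COL_B = LTB.COL_B )
    on  DT.COL_A = LTA.COL_A
    and DT.COL_B = LTA.COL_B
 where DT.ALT_REF_TXT = :refTxt          -- branch 2: can use the (ALT_REF_TXT, CREATED_DATE) index
   and DT.CREATED_DATE between :startDate and :endDate
```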

Answer 5

Please provide the output of this query with "set autot trace" enabled. Let's see how many blocks it is pulling. The explain plan looks good; it should be very fast. If you need more, denormalize the lookup table info into DT. That violates 3rd normal form, but it will make your query faster by eliminating the joins. In a situation where milliseconds count, everything is in buffers, and you need that query to run 1000 times/second, it can help by driving down the number of blocks looked at per row. It is the ultimate way to boost read performance, but it complicates your app (and ruins your lovely ER diagram).
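A sketch of such a session in SQL*Plus, assuming the user has the PLUSTRACE role and a PLAN_TABLE available. "set autot trace" is the abbreviated form of "set autotrace traceonly", which runs the statement but prints the plan and statistics instead of the rows. The bind values and date format are placeholders:

```sql
-- Placeholder bind values; SQL*Plus has no DATE bind type, so dates go in as strings.
variable refTxt    varchar2(100)
variable startDate varchar2(10)
variable endDate   varchar2(10)
exec :refTxt    := 'SOME_REF'
exec :startDate := '2011-01-01'
exec :endDate   := '2011-12-31'

set autotrace traceonly

select *
  from DATA_TABLE DT,
       LOOKUP_TABLE_A LTA,
       LOOKUP_TABLE_B LTB
 where DT.COL_A = LTA.COL_A (+)
   and DT.COL_B = LTA.COL_B (+)
   and LTA.COL_C = LTB.COL_C
   and LTA.COL_B = LTB.COL_B
   and ( DT.REF_TXT = :refTxt or DT.ALT_REF_TXT = :refTxt )
   and DT.CREATED_DATE between to_date(:startDate, 'YYYY-MM-DD')
                           and to_date(:endDate,   'YYYY-MM-DD');

set autotrace off
```

The figures to look at in the statistics section are "consistent gets" (blocks read from the buffer cache) and "physical reads" (blocks read from disk).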
