简体   繁体   中英

Oracle query performance issue

I have a table "discussion" of which structure is following:

Table name: discussion,

Id      name       parent_id   root_id  ...other columns
 1   discussion1      0            0
 2   discussion2      1            1
 3   discussion3      2            1
 4   discussion4      3            1
 5   discussion5      4            1

Obviously id was defined as PK, parent_id and root_id has been indexed.

these rows of table was built as a hierarchical relationship(sort of parent->child), As well as please note column root_id is used to describe those rows are in the same tree.

I written two SQLs to get the thread tree:

SQL 1

  SELECT * 
    from discussion 
   start with (parent_Id=0 AND id=?) 
 connect by prior Id=parent_Id

SQL 2

 SELECT * FROM (
           SELECT * FROM discussion WHERE root_id = ? or id = ?
          )START WITH (parent_Id=0 AND id=?) connect by prior Id=parent_Id

After my test, if the data sets are around 4000, SQL 1 did a little bit better than SQL2. But if table discussion has very large data, SQL 2 did much better than SQL1.

When the table has 300 thousands rows, query plan reported the cost of SQL 1 is 22126, The cost of SQL 2 is 6. SQL 1 get involved full table scan or SQL 2 get range index scan.

Is anybody can help me to explain why the two SQLs show different result with different data number set?

More important is I want your guys suggestions to improve the performance or another solution?

Here is query plan for SQL 1 when we have 300 thousands rows.

GENERAL INFORMATION SECTION
-------------------------------------------------------------------------------
Tuning Task Name   : staName52149
Tuning Task Owner  : CISCO
Tuning Task ID     : 54
Workload Type      : Single SQL Statement
Execution Count    : 1
Current Execution  : EXEC_44
Execution Type     : TUNE SQL
Scope              : COMPREHENSIVE
Time Limit(seconds): 1800
Completion Status  : COMPLETED
Started at         : 10/23/2011 23:23:31
Completed at       : 10/23/2011 23:23:59

-------------------------------------------------------------------------------
Schema Name: CISCO
SQL ID     : davhv6p4x6bu2
SQL Text   : SELECT * from discussion start with (parent_Id=0 AND id=6587)
             connect by prior Id=parent_Id

-------------------------------------------------------------------------------
FINDINGS SECTION (1 finding)
-------------------------------------------------------------------------------

1- SQL Profile Finding (see explain plans section below)
--------------------------------------------------------
  A potentially better execution plan was found for this statement.

  Recommendation (estimated benefit: 99.99%)
  ------------------------------------------
  - Consider accepting the recommended SQL profile.
    execute dbms_sqltune.accept_sql_profile(task_name => 'staName52149',
            task_owner => 'CISCO', replace => TRUE);

  Validation results
  ------------------
  The SQL profile was tested by executing both its plan and the original plan
  and measuring their respective execution statistics. A plan may have been
  only partially executed if the other could be run to completion in less time.

                           Original Plan  With SQL Profile  % Improved
                           -------------  ----------------  ----------
  Completion Status:            COMPLETE          COMPLETE
  Elapsed Time(us):            14251990               121      99.99 %
  CPU Time(us):                 3463222                 0        100 %
  User I/O Time(us):           10821745                 0        100 %
  Buffer Gets:                   678361                 6      99.99 %
  Physical Read Requests:          1013                 0        100 %
  Physical Write Requests:         1081                 0        100 %
  Physical Read Bytes:        234168320                 0        100 %
  Physical Write Bytes:       223756288                 0        100 %
  Rows Processed:                     1                 1 
  Fetches:                            1                 1 
  Executions:                         1                 1 

  Notes
  -----
  1. The SQL profile plan was first executed to warm the buffer cache.
  2. Statistics for the SQL profile plan were averaged over next 9 executions.

-------------------------------------------------------------------------------
EXPLAIN PLANS SECTION
-------------------------------------------------------------------------------

1- Original
-----------
Plan hash value: 2811036397


------------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |            |   348K|  2770M| 22162  (82)| 00:04:26 |
|*  1 |  CONNECT BY NO FILTERING WITH START-WITH|            |       |       |            |          |
|   2 |   TABLE ACCESS FULL                     | DISCUSSION |   348K|    79M|  4108   (1)| 00:00:50 |
------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$1
   2 - SEL$2 / DISCUSSION@SEL$2

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("DISCUSSION"."PARENT_ID"=PRIOR NULL)
       filter("PARENT_ID"=0 AND "ID"=6587)

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - "DISCUSSION"."PARENT_ID"[NUMBER,22], "DISCUSSION"."ID"[NUMBER,22], STRDEF[22], 
       STRDEF[22], STRDEF[22], STRDEF[22], STRDEF[22], STRDEF[4000], STRDEF[22], STRDEF[22], 
       STRDEF[150], STRDEF[7], STRDEF[7], STRDEF[4000], STRDEF[22], STRDEF[22], STRDEF[22], 
       STRDEF[4000], STRDEF[22], STRDEF[4000], STRDEF[7], STRDEF[22], STRDEF[32], STRDEF[22], 
       STRDEF[22], STRDEF[22], STRDEF[7], STRDEF[7], PRIOR NULL[22], LEVEL[4]
   2 - "ID"[NUMBER,22], "DISCUSSION"."CLASS"[NUMBER,22], "DISCUSSION"."SUBCLASS"[NUMBER,22], 
       "PARENT_ID"[NUMBER,22], "DISCUSSION"."ROOT_ID"[NUMBER,22], 
       "DISCUSSION"."MESSAGE"[VARCHAR2,4000], "DISCUSSION"."STATUS"[NUMBER,22], 
       "DISCUSSION"."PRIORITY"[NUMBER,22], "DISCUSSION"."AUTO_NUMBER"[VARCHAR2,150], 
       "DISCUSSION"."CREATE_DATE"[DATE,7], "DISCUSSION"."CLOSE_DATE"[DATE,7], 
       "DISCUSSION"."CLOSE_COMMENTS"[VARCHAR2,4000], "DISCUSSION"."ORGANIZATION"[NUMBER,22], 
       "DISCUSSION"."TYPE"[NUMBER,22], "DISCUSSION"."CREATOR"[NUMBER,22], 
       "DISCUSSION"."SUBJECT"[VARCHAR2,4000], "DISCUSSION"."REPLIES"[NUMBER,22], 
       "DISCUSSION"."NOTIFYLIST"[VARCHAR2,4000], "DISCUSSION"."LAST_REPLY_DATE"[DATE,7], 
       "DISCUSSION"."DELETE_FLAG"[NUMBER,22], "DISCUSSION"."FLAGS"[VARCHAR2,32], 
       "DISCUSSION"."PREVIEW_TEXT"[NUMBER,22], "DISCUSSION"."FILTER"[NUMBER,22], 
       "DISCUSSION"."OBJVERSION"[NUMBER,22], "DISCUSSION"."CREATED"[DATE,7], 
       "DISCUSSION"."LAST_UPD"[DATE,7]

2- Original With Adjusted Cost
------------------------------
Plan hash value: 2811036397


------------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |            |   348K|  2770M| 22162  (82)| 00:04:26 |
|*  1 |  CONNECT BY NO FILTERING WITH START-WITH|            |       |       |            |          |
|   2 |   TABLE ACCESS FULL                     | DISCUSSION |   348K|    79M|  4108   (1)| 00:00:50 |
------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$1
   2 - SEL$2 / DISCUSSION@SEL$2

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("DISCUSSION"."PARENT_ID"=PRIOR NULL)
       filter("PARENT_ID"=0 AND "ID"=6587)

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - "DISCUSSION"."PARENT_ID"[NUMBER,22], "DISCUSSION"."ID"[NUMBER,22], STRDEF[22], 
       STRDEF[22], STRDEF[22], STRDEF[22], STRDEF[22], STRDEF[4000], STRDEF[22], STRDEF[22], 
       STRDEF[150], STRDEF[7], STRDEF[7], STRDEF[4000], STRDEF[22], STRDEF[22], STRDEF[22], 
       STRDEF[4000], STRDEF[22], STRDEF[4000], STRDEF[7], STRDEF[22], STRDEF[32], STRDEF[22], 
       STRDEF[22], STRDEF[22], STRDEF[7], STRDEF[7], PRIOR NULL[22], LEVEL[4]
   2 - "ID"[NUMBER,22], "DISCUSSION"."CLASS"[NUMBER,22], "DISCUSSION"."SUBCLASS"[NUMBER,22], 
       "PARENT_ID"[NUMBER,22], "DISCUSSION"."ROOT_ID"[NUMBER,22], 
       "DISCUSSION"."MESSAGE"[VARCHAR2,4000], "DISCUSSION"."STATUS"[NUMBER,22], 
       "DISCUSSION"."PRIORITY"[NUMBER,22], "DISCUSSION"."AUTO_NUMBER"[VARCHAR2,150], 
       "DISCUSSION"."CREATE_DATE"[DATE,7], "DISCUSSION"."CLOSE_DATE"[DATE,7], 
       "DISCUSSION"."CLOSE_COMMENTS"[VARCHAR2,4000], "DISCUSSION"."ORGANIZATION"[NUMBER,22], 
       "DISCUSSION"."TYPE"[NUMBER,22], "DISCUSSION"."CREATOR"[NUMBER,22], 
       "DISCUSSION"."SUBJECT"[VARCHAR2,4000], "DISCUSSION"."REPLIES"[NUMBER,22], 
       "DISCUSSION"."NOTIFYLIST"[VARCHAR2,4000], "DISCUSSION"."LAST_REPLY_DATE"[DATE,7], 
       "DISCUSSION"."DELETE_FLAG"[NUMBER,22], "DISCUSSION"."FLAGS"[VARCHAR2,32], 
       "DISCUSSION"."PREVIEW_TEXT"[NUMBER,22], "DISCUSSION"."FILTER"[NUMBER,22], 
       "DISCUSSION"."OBJVERSION"[NUMBER,22], "DISCUSSION"."CREATED"[DATE,7], 
       "DISCUSSION"."LAST_UPD"[DATE,7]

3- Using SQL Profile
--------------------
Plan hash value: 3458076016


---------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                   |     2 | 16686 |    11  (28)| 00:00:01 |
|*  1 |  CONNECT BY WITH FILTERING    |                   |       |       |            |          |
|*  2 |   TABLE ACCESS BY INDEX ROWID | DISCUSSION        |     1 |   240 |     3   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN          | DISCUSSION_PK     |     1 |       |     2   (0)| 00:00:01 |
|   4 |   NESTED LOOPS                |                   |     1 |   253 |     5   (0)| 00:00:01 |
|   5 |    CONNECT BY PUMP            |                   |       |       |            |          |
|   6 |    TABLE ACCESS BY INDEX ROWID| DISCUSSION        |     1 |   240 |     3   (0)| 00:00:01 |
|*  7 |     INDEX RANGE SCAN          | DISC_IDX_PARENTID |     1 |       |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$1
   2 - SEL$4 / DISCUSSION@SEL$4
   3 - SEL$4 / DISCUSSION@SEL$4
   4 - SEL$3
   6 - SEL$3 / DISCUSSION@SEL$3
   7 - SEL$3 / DISCUSSION@SEL$3

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("DISCUSSION"."PARENT_ID"=PRIOR NULL)
   2 - filter("PARENT_ID"=0)
   3 - access("ID"=6587)
   7 - access("connect$_by$_pump$_002"."prior Id"="PARENT_ID")

Here is query plan for SQL 2 when we have 300 thousands rows.

GENERAL INFORMATION SECTION
-------------------------------------------------------------------------------
Tuning Task Name   : staName64031
Tuning Task Owner  : CISCO
Tuning Task ID     : 55
Workload Type      : Single SQL Statement
Execution Count    : 1
Current Execution  : EXEC_45
Execution Type     : TUNE SQL
Scope              : COMPREHENSIVE
Time Limit(seconds): 1800
Completion Status  : COMPLETED
Started at         : 10/23/2011 23:30:22
Completed at       : 10/23/2011 23:30:26

-------------------------------------------------------------------------------
Schema Name: CISCO
SQL ID     : c741jfryv5m98
SQL Text   : SELECT * FROM (
                            SELECT * FROM discussion WHERE root_id = 6587 or
             id = 6587
                           )Start With (Parent_Id=0 And Id=6587) Connect By
             Prior Id=Parent_Id

-------------------------------------------------------------------------------
There are no recommendations to improve the statement.

-------------------------------------------------------------------------------
EXPLAIN PLANS SECTION
-------------------------------------------------------------------------------

1- Original
-----------
Plan hash value: 1202872009


-----------------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |                 |     2 | 16686 |     6  (17)| 00:00:01 |
|*  1 |  CONNECT BY NO FILTERING WITH START-WITH|                 |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID           | DISCUSSION      |     2 |   480 |     5   (0)| 00:00:01 |
|   3 |    BITMAP CONVERSION TO ROWIDS          |                 |       |       |            |          |
|   4 |     BITMAP OR                           |                 |       |       |            |          |
|   5 |      BITMAP CONVERSION FROM ROWIDS      |                 |       |       |            |          |
|*  6 |       INDEX RANGE SCAN                  | DISCUSSION_PK   |       |       |     2   (0)| 00:00:01 |
|   7 |      BITMAP CONVERSION FROM ROWIDS      |                 |       |       |            |          |
|*  8 |       INDEX RANGE SCAN                  | DISC_IDX_ROOTID |       |       |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$1       
   2 - SEL$E029B2FF / DISCUSSION@SEL$5

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("DISCUSSION"."PARENT_ID"=PRIOR NULL)
       filter("DISCUSSION"."PARENT_ID"=0 AND "DISCUSSION"."ID"=6587)
   6 - access("ID"=6587)
   8 - access("ROOT_ID"=6587)

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - "DISCUSSION"."PARENT_ID"[NUMBER,22], "DISCUSSION"."ID"[NUMBER,22], STRDEF[22], STRDEF[22], 
       STRDEF[22], STRDEF[22], STRDEF[22], STRDEF[4000], STRDEF[22], STRDEF[22], STRDEF[150], STRDEF[7], 
       STRDEF[7], STRDEF[4000], STRDEF[22], STRDEF[22], STRDEF[22], STRDEF[4000], STRDEF[22], 
       STRDEF[4000], STRDEF[7], STRDEF[22], STRDEF[32], STRDEF[22], STRDEF[22], STRDEF[22], STRDEF[7], 
       STRDEF[7], PRIOR NULL[22], LEVEL[4]
   2 - "DISCUSSION".ROWID[ROWID,10], "ID"[NUMBER,22], "DISCUSSION"."CLASS"[NUMBER,22], 
       "DISCUSSION"."SUBCLASS"[NUMBER,22], "DISCUSSION"."PARENT_ID"[NUMBER,22], "ROOT_ID"[NUMBER,22], 
       "DISCUSSION"."MESSAGE"[VARCHAR2,4000], "DISCUSSION"."STATUS"[NUMBER,22], 
       "DISCUSSION"."PRIORITY"[NUMBER,22], "DISCUSSION"."AUTO_NUMBER"[VARCHAR2,150], 
       "DISCUSSION"."CREATE_DATE"[DATE,7], "DISCUSSION"."CLOSE_DATE"[DATE,7], 
       "DISCUSSION"."CLOSE_COMMENTS"[VARCHAR2,4000], "DISCUSSION"."ORGANIZATION"[NUMBER,22], 
       "DISCUSSION"."TYPE"[NUMBER,22], "DISCUSSION"."CREATOR"[NUMBER,22], 
       "DISCUSSION"."SUBJECT"[VARCHAR2,4000], "DISCUSSION"."REPLIES"[NUMBER,22], 
       "DISCUSSION"."NOTIFYLIST"[VARCHAR2,4000], "DISCUSSION"."LAST_REPLY_DATE"[DATE,7], 
       "DISCUSSION"."DELETE_FLAG"[NUMBER,22], "DISCUSSION"."FLAGS"[VARCHAR2,32], 
       "DISCUSSION"."PREVIEW_TEXT"[NUMBER,22], "DISCUSSION"."FILTER"[NUMBER,22], 
       "DISCUSSION"."OBJVERSION"[NUMBER,22], "DISCUSSION"."CREATED"[DATE,7], 
       "DISCUSSION"."LAST_UPD"[DATE,7]
   3 - "DISCUSSION".ROWID[ROWID,10]
   4 - STRDEF[BM VAR, 10], STRDEF[BM VAR, 10], STRDEF[BM VAR, 16116]
   5 - STRDEF[BM VAR, 10], STRDEF[BM VAR, 10], STRDEF[BM VAR, 16116]
   6 - "DISCUSSION".ROWID[ROWID,10]
   7 - STRDEF[BM VAR, 10], STRDEF[BM VAR, 10], STRDEF[BM VAR, 16116]
   8 - "DISCUSSION".ROWID[ROWID,10]

-------------------------------------------------------------------------------

Do you just want an explanation of what you are seeing? Or suggestions to improve on this?

Assuming the first:

With SQL1 the database has to find the root element, then all the children, then the children of that and so on. It always has to work with the complete table since no general filter is available. It has process the whole table (or the complete index).

With SQL2 the first thing the database system can do is to reduce the subset of rows it is working on.

The difference is small (or even in the opposite directions) for small tables. If you full table fits in one or two blocks there isn't much useful filtering the database can do. It has to pull the blocks into memory anyway. Even more if it uses index access. But with a huge table with (just making up numbers) 100 blocks reducing these to 5 be consulting 2 index blocks first is a huge gain.

Update: Some ideas to tune this:

  • I'd try to get rid of the or in SQL2. This should be possible when a root entry has it's own id as root_id

  • Assuming you actually need the level, which I think is provided by the hierarchical query, you could precalculate the level into a column of the table. Then you could drop the connect by clause, resulting in a real simple query. Of course if this works, depends a lot on the rest of your application.

  • Do you need all rows fast? Or the first rows? You might try providing the appropriate hint (http://download.oracle.com/docs/cd/B10501_01/server.920/a96533/hintsref.htm#4924)

Basically the first row is looking for candidates to construct the thread tree in the whole "forest" (ie the whole discussion table, ie 300000 rows. Of course it is trying to do its best, but the space of possible solutions is large.

Query 2 basically says, "get only the parts of the tree I am interested in - whcih you can identify thanks to the root id, and build the whole tree from it".

Have a look at the number of rows in the two execution plans and see for yourself the vast difference in the search space.

In case your search space was already small (say, 2 discussions with just 4-5 elements in each) the time for the second query would be dominated by looking for a small number of rows in small table... this explains why the second query may be "worse" for an almost empty table.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM