Providing the partition key to a partitioned table increases query cost

I am working on a web application which queries tables containing large amounts of data. Because of UI performance issues, I have been investigating ways to speed up long-running queries.

Below is an example of our original query and its explain plan.

EXPLAIN PLAN FOR
SELECT * FROM T1
INNER JOIN T2 ON (T2.ID = T1.ID)
WHERE
        T1.EMPLOYEE_ID = '1001'
        AND T1.RUN_TIMESTAMP = '16-JAN-19 17.39.36.000000000'

-----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name                         | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                              | 37183 |    31M|       | 37654   (1)| 00:00:02 |
|*  1 |  HASH JOIN                           |                              | 37183 |    31M|  6688K| 37654   (1)| 00:00:02 |
|*  2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T1                           | 37183 |  6245K|       |  2492   (1)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                  | IDX_T1_RT                    | 76305 |       |       |   410   (1)| 00:00:01 |
|   4 |   TABLE ACCESS FULL                  | T2                           |   577K|   399M|       | 14704   (1)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------------------------

PARTITIONED TABLE

In an attempt to improve query performance, I decided to partition the large table T1 by value. I created 45 partitions, one for each of the 45 distinct values that a row in the table can be assigned. For the purposes of this example, these are the values 1-45.

After migrating the data from T1 to the partitioned table T1_PART and providing a distinct partition key, I was disappointed to see that, despite the query now scanning only a single partition as expected, the cost benefits were only marginal.
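For reference, here is a minimal sketch of how such a table could be declared, assuming list partitioning on PARTITION_KEY. The column data types, the partition names (P01 to P45) and the indexed column are assumptions for illustration, not the actual DDL. Only the table, column and index names are taken from the queries and plans in this post.

```sql
-- Sketch only: data types and partition names are assumed, not the real DDL.
CREATE TABLE T1_PART (
    ID             NUMBER        NOT NULL,
    EMPLOYEE_ID    VARCHAR2(10),
    RUN_TIMESTAMP  TIMESTAMP(9),
    PARTITION_KEY  VARCHAR2(2)   NOT NULL
)
PARTITION BY LIST (PARTITION_KEY) (
    PARTITION P01 VALUES ('1'),
    PARTITION P02 VALUES ('2'),
    -- one partition per remaining value, '3' through '44', goes here
    PARTITION P45 VALUES ('45')
);

-- LOCAL builds one index segment per table partition.
-- Assumes IDX_T1_RT covers RUN_TIMESTAMP and that no other index with this
-- name exists in the schema (index names must be unique within a schema).
CREATE INDEX IDX_T1_RT ON T1_PART (RUN_TIMESTAMP) LOCAL;
```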

EXPLAIN PLAN FOR
SELECT * FROM T1_PART T1
INNER JOIN T2 ON (T2.ID = T1.ID)
WHERE
        T1.EMPLOYEE_ID = '1001'
        AND T1.PARTITION_KEY = '1'
        AND T1.RUN_TIMESTAMP = '16-JAN-19 17.39.36.000000000'

----------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                   | Name                         | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                            |                              | 19912 |    17M|       | 37341   (1)| 00:00:02 |       |       |
|*  1 |  HASH JOIN                                  |                              | 19912 |    17M|  3680K| 37341   (1)| 00:00:02 |       |       |
|   2 |   PARTITION LIST SINGLE                     |                              | 19912 |  3441K|       |  2131   (1)| 00:00:01 |   KEY |   KEY |
|*  3 |    TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| T1_PART                      | 19912 |  3441K|       |  2131   (1)| 00:00:01 |    19 |    19 |
|*  4 |     INDEX RANGE SCAN                        | IDX_T1_RT                    | 57355 |       |       |   276   (1)| 00:00:01 |    19 |    19 |
|   5 |   TABLE ACCESS FULL                         | T2                           |   577K|   403M|       | 14706   (1)| 00:00:01 |       |       |
----------------------------------------------------------------------------------------------------------------------------------------------------

However, what really interests me is that when I no longer specify the partition key, the cost of the query decreases dramatically.

We can see from the plan that the query now scans all 45 partitions of the table (Pstart 1, Pstop 45), but that it executes using some form of parallelisation that I had not anticipated.

EXPLAIN PLAN FOR
SELECT * FROM T1_PART T1
INNER JOIN T2 ON (T2.ID = T1.ID)
WHERE
        T1.EMPLOYEE_ID = '1001'
        AND T1.RUN_TIMESTAMP = '16-JAN-19 17.39.36.000000000'

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                        | Name                         | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                                 |                              | 19912 |    17M| 16926   (1)| 00:00:01 |       |       |        |      |            |
|   1 |  PX COORDINATOR                                  |                              |       |       |            |          |       |       |        |      |            |
|   2 |   PX SEND QC (RANDOM)                            | :TQ10002                     | 19912 |    17M| 16926   (1)| 00:00:01 |       |       |  Q1,02 | P->S | QC (RAND)  |
|*  3 |    HASH JOIN BUFFERED                            |                              | 19912 |    17M| 16926   (1)| 00:00:01 |       |       |  Q1,02 | PCWP |            |
|   4 |     PX RECEIVE                                   |                              | 19912 |  3519K|  2217   (1)| 00:00:01 |       |       |  Q1,02 | PCWP |            |
|   5 |      PX SEND HYBRID HASH                         | :TQ10000                     | 19912 |  3519K|  2217   (1)| 00:00:01 |       |       |  Q1,00 | P->P | HYBRID HASH|
|   6 |       STATISTICS COLLECTOR                       |                              |       |       |            |          |       |       |  Q1,00 | PCWC |            |
|   7 |        PX PARTITION LIST ALL                     |                              | 19912 |  3519K|  2217   (1)| 00:00:01 |     1 |    45 |  Q1,00 | PCWC |            |
|*  8 |         TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| T1_PART                      | 19912 |  3519K|  2217   (1)| 00:00:01 |     1 |    45 |  Q1,00 | PCWP |            |
|*  9 |          INDEX RANGE SCAN                        | IDX_T1_RT                    | 57355 |       |   383   (1)| 00:00:01 |     1 |    45 |  Q1,00 | PCWP |            |
|  10 |     PX RECEIVE                                   |                              |   577K|   403M| 14706   (1)| 00:00:01 |       |       |  Q1,02 | PCWP |            |
|  11 |      PX SEND HYBRID HASH                         | :TQ10001                     |   577K|   403M| 14706   (1)| 00:00:01 |       |       |  Q1,01 | S->P | HYBRID HASH|
|  12 |       PX SELECTOR                                |                              |       |       |            |          |       |       |  Q1,01 | SCWC |            |
|  13 |        TABLE ACCESS FULL                         | T2                           |   577K|   403M| 14706   (1)| 00:00:01 |       |       |  Q1,01 | SCWP |            |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
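One thing that can be checked here is whether any of the objects involved carries a default parallel degree, since that is one possible source of a parallel plan. The queries below are a sketch that assumes the tables and indexes live in the current schema. If they belong to another owner, the ALL_ or DBA_ views with an OWNER filter would be needed instead.

```sql
-- DEGREE is the default parallel degree stored on the object.
-- A value other than 1 (including DEFAULT) allows the optimizer
-- to consider parallel execution for that object.
SELECT table_name, degree
FROM   user_tables
WHERE  table_name IN ('T1_PART', 'T2');

-- Indexes carry their own DEGREE setting, independent of the table.
SELECT index_name, table_name, degree
FROM   user_indexes
WHERE  table_name IN ('T1_PART', 'T2');
```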

Does anyone know what might be going on here? Is there a way to leverage only a single partition scan with the parallel execution of the SQL?

Thanks for the Help! Billy

EDIT:

In this example...

T1 has 625,417 rows

T2 has 577,718 rows

All rows from T1 match the join condition on T2

57,355 rows in T1 match T1.EMPLOYEE_ID = '1001' AND T1.RUN_TIMESTAMP = '16-JAN-19 17.39.36.000000000'

T2 has a unique index on ID and is not partitioned

I had a situation where I joined two tables, both tables were partitioned. The query used an index on the first table but performed a full table scan on the second table. I added a hint to use the index on the second table. With that hint, the explain plan showed the query using the index on the second table.
If you have an index on T2.ID, I suggest you add a hint to your query for that index.
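As a rough sketch of what that hint could look like on the query from the question: the index name T2_ID_UK is made up, because the question does not give the name of the unique index on T2.ID, so substitute the real one. A hint only asks the optimizer to use the index. Whether that actually beats the full scan of T2 depends on how many rows come out of T1_PART, so compare the resulting plan and timings before relying on it.

```sql
-- T2_ID_UK is a hypothetical name for the unique index on T2.ID.
-- T2 has no alias in the query, so the hint refers to it by table name.
EXPLAIN PLAN FOR
SELECT /*+ INDEX(T2 T2_ID_UK) */ *
FROM   T1_PART T1
INNER JOIN T2 ON (T2.ID = T1.ID)
WHERE  T1.EMPLOYEE_ID = '1001'
  AND  T1.PARTITION_KEY = '1'
  AND  T1.RUN_TIMESTAMP = '16-JAN-19 17.39.36.000000000';

-- Show the plan that was just explained.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```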
