Using partitioned indexes with partitioned tables

I'm trying to understand the optimal way to construct composite local partitioned indexes for use with partitioned tables.

Here is my example table:

ADDRESS
id
street
city
state
tenant

The Address table is list-partitioned on the tenant column. Almost all queries will filter on tenant, so cross-partition searches are not really a concern here.

Ultimately, I want a query like select * from address where tenant = 'X' and street = 'Y' and city = 'Z' to perform as well as possible. It seems to me that the right approach is to first limit the search to the particular tenant (partition) and then use the local partitioned index.

Now, I believe that only one index can be used per reference table, so I want to make a composite local partitioned index that will be most useful. I envision the composite index having street and city in it. So I have two questions:

  1. Should tenant have an index by itself?

  2. Should the tenant be part of the composite index?

Some explanation of why it should be one way or the other would be helpful, as I don't think I fully understand how partitions work with partitioned indexes.

create index address_city_street_idx on address(city, street) compress 1 local;

I believe that index is ideal for this query, given a table that is list-partitioned on TENANT:

select * from address where tenant = 'X' and street = 'Y' and city = 'Z' 

To answer questions 1 and 2: Since TENANT is the partition key it should not be in this index, and probably should not be in any index. That column is already used by the partition pruning to select the relevant segment. That work is done at compile or parse time, and is virtually free.

The execution plans in the test case demonstrate that partition pruning is happening. The operation PARTITION LIST SINGLE, and the fact that the columns Pstart and Pstop list the number 3 instead of a variable like KEY, show that Oracle has already determined the partition before the query has run. Because Oracle discards the irrelevant TENANTs at compile time, there's no need to worry about further reducing the TENANTs at run time with an index.
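To see the difference between a partition number and KEY in Pstart/Pstop, the same query can be explained once with a literal and once with a bind variable. This is only a sketch: it assumes the ADDRESS table from the test case below, a SQL*Plus-style client, and an arbitrary bind variable name (t) that is not part of the original post.

```sql
--Literal partition key: the partition can be resolved at parse time,
--so Pstart/Pstop should show a partition number.
explain plan for
select * from address
where tenant = 'tenant3' and street = 'Fake Street 50' and city = 'City 50';
select * from table(dbms_xplan.display);

--Bind variable (hypothetical name t, declared with SQL*Plus/SQLcl syntax):
--the partition is only known once a value is supplied,
--so Pstart/Pstop should show KEY instead of a number.
variable t varchar2(100)
explain plan for
select * from address
where tenant = :t and street = 'Fake Street 50' and city = 'City 50';
select * from table(dbms_xplan.display);
```

In both cases only one partition should be visited; the difference is just when Oracle learns which one.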


My index suggestion depends on a few assumptions about the data. Neither CITY nor STREET sounds like it would uniquely identify a row for a tenant. And STREET sounds much more selective than CITY. If a single CITY has multiple STREETs, then indexing them in that order and using index compression can save a lot of space, because COMPRESS 1 stores each repeated leading CITY value only once per index block.

If the index is significantly smaller it may have fewer levels, which means it would require slightly fewer I/Os for a lookup. And if it's smaller, more of it could fit in the buffer cache, which might further improve performance.

But with a table this large, I have a feeling the BLEVEL (number of index levels) will be the same for both, and both indexes will be too large to use the cache effectively. That means there may not be any performance difference between (CITY,STREET) and (STREET,CITY). But with (CITY,STREET) and compression you may at least be able to save a large amount of space.

Test Case

I assume you cannot simply create both indexes on production and try them out. In that case you'll want to create some tests first.

This test case does not strongly support my suggestion. It is merely a starting point for a more thorough test case. You'll need to create one with a larger amount of data and a more realistic data distribution.

--Create sample table.
create table address
(
    id number,
    street varchar2(100),
    city varchar2(100),
    state varchar2(100),
    tenant varchar2(100)
) partition by list (tenant)
(
    partition p1 values ('tenant1'),
    partition p2 values ('tenant2'),
    partition p3 values ('tenant3'),
    partition p4 values ('tenant4'),
    partition p5 values ('tenant5')
) nologging;

--Insert 5M rows.
--Note the assumptions about the selectivity of the street and city
--are critical to this issue.  Adjust the MOD as necessary.
begin
    for i in 1 .. 5 loop
        insert /*+ append */ into address
        select
            level,
            'Fake Street '||mod(level, 10000),
            'City '||mod(level, 100),
            'State',
            'tenant'||i
        from dual connect by level <= 1000000;
        commit;
    end loop;
end;
/

--Table uses 282MB.
select sum(bytes)/1024/1024 mb from dba_segments where segment_name = 'ADDRESS' and owner = user;

--Create different indexes.
create index address_city_street_idx on address(city, street) compress 1 local;
create index address_street_city_idx on address(street, city) local;

--Gather statistics.
begin
    dbms_stats.gather_table_stats(user, 'ADDRESS');
end;
/

--Check execution plan.
--Oracle by default picks STREET,CITY over CITY,STREET.
--I'm not sure why.  And the cost difference is only 1, so I think things may be different with realistic data.
explain plan for select * from address where tenant = 'tenant3' and street = 'Fake Street 50' and city = 'City 50';
select * from table(dbms_xplan.display);

/*
Plan hash value: 2845844304

--------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name                    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |                         |     1 |    44 |     4   (0)| 00:00:01 |       |       |
|   1 |  PARTITION LIST SINGLE                     |                         |     1 |    44 |     4   (0)| 00:00:01 |     3 |     3 |
|   2 |   TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| ADDRESS                 |     1 |    44 |     4   (0)| 00:00:01 |     3 |     3 |
|*  3 |    INDEX RANGE SCAN                        | ADDRESS_STREET_CITY_IDX |     1 |       |     3   (0)| 00:00:01 |     3 |     3 |
--------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("STREET"='Fake Street 50' AND "CITY"='City 50')
*/

--Check execution plan of forced CITY,STREET index.
--I don't suggest using a hint in the real query, this is just to compare plans.
explain plan for select /*+ index(address address_city_street_idx) */ * from address where tenant = 'tenant3' and street = 'Fake Street 50' and city = 'City 50';
select * from table(dbms_xplan.display);

/*
Plan hash value: 1084849450

--------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name                    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |                         |     1 |    44 |     5   (0)| 00:00:01 |       |       |
|   1 |  PARTITION LIST SINGLE                     |                         |     1 |    44 |     5   (0)| 00:00:01 |     3 |     3 |
|   2 |   TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| ADDRESS                 |     1 |    44 |     5   (0)| 00:00:01 |     3 |     3 |
|*  3 |    INDEX RANGE SCAN                        | ADDRESS_CITY_STREET_IDX |     1 |       |     3   (0)| 00:00:01 |     3 |     3 |
--------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("CITY"='City 50' AND "STREET"='Fake Street 50')
*/

--Both indexes have BLEVEL=2.
select *
from dba_indexes
where index_name in ('ADDRESS_CITY_STREET_IDX', 'ADDRESS_STREET_CITY_IDX');

--CITY,STREET = 160MB, STREET,CITY=200MB.
--You can see the difference already.  It may get larger with different data distribution.
--And it may get larger with more data, as it may compress better with more repetition.
select segment_name, sum(bytes)/1024/1024 mb
from dba_segments
where segment_name in ('ADDRESS_CITY_STREET_IDX', 'ADDRESS_STREET_CITY_IDX')
group by segment_name;
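Because both indexes are local, the BLEVEL and size comparison above can also be broken down per partition. The query below is a minimal sketch that assumes the indexes and statistics from the test case exist in the current schema; it reads the USER_IND_PARTITIONS dictionary view rather than DBA_INDEXES.

```sql
--Per-partition BLEVEL, leaf block count, and compression setting
--for both local indexes.
select index_name, partition_name, blevel, leaf_blocks, compression
from user_ind_partitions
where index_name in ('ADDRESS_CITY_STREET_IDX', 'ADDRESS_STREET_CITY_IDX')
order by index_name, partition_position;
```

With realistic data, a skewed tenant may show a different BLEVEL or leaf block count than the others, which a table-level figure would hide.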

If the index is unique, then you have to include TENANT to make it local. If it is not unique, then do not include it, as it will not improve performance for LIST or RANGE partitioning. You can consider including it for hash partitioning, where one partition holds many distinct values.
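The rule about unique local indexes can be checked directly against the test table. This is a hedged sketch: the index names are hypothetical, and it relies on the test data, where ID repeats across tenants but (TENANT, ID) is unique.

```sql
--Expected to fail with ORA-14039, because the partition key TENANT
--is not among the key columns of the unique local index.
create unique index address_id_uq on address(id) local;

--Including the partition key allows the unique index to be local.
create unique index address_tenant_id_uq on address(tenant, id) local;
```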

UPD: However, it depends on what kind of partitioning you're using: "static" or "dynamic". "Static" means all partitions are defined once in the create table statement and stay unchanged while the application is running. "Dynamic" means the application adds or changes partitions (for example, a daily process that adds daily list partitions for all tables).

So you should avoid global indexes with "dynamic" partitioning. Partition maintenance such as dropping, truncating, or splitting partitions can leave a global index unusable unless the statement also maintains it (for example with UPDATE GLOBAL INDEXES), which makes the maintenance more expensive. With "static" partitioning it is fine to use a global index if you sometimes need to scan across all partitions.
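How a global index reacts to partition changes can be tried out on the test table. The statements below are only a sketch under stated assumptions: the index name address_id_gidx and the new partition p6 are hypothetical, and the outcome should be verified on the Oracle version in use.

```sql
--Hypothetical global (non-partitioned) index for cross-tenant lookups by ID.
create index address_id_gidx on address(id);

--Add a partition for a new tenant.
alter table address add partition p6 values ('tenant6');

--Drop a partition; UPDATE GLOBAL INDEXES asks Oracle to keep
--global indexes usable as part of the operation.
alter table address drop partition p5 update global indexes;

--Check the index status afterwards. Partitioned (local) indexes report N/A here;
--the global index should report VALID or UNUSABLE.
select index_name, status from user_indexes where table_name = 'ADDRESS';
```

Repeating the DROP PARTITION step without the UPDATE GLOBAL INDEXES clause, and checking the status again, shows what the extra maintenance buys.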
